OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Steven Whitehouse	c2932e03db	[GFS2] Remove "reclaim limit" This call to reclaim glocks is not needed, and in particular we don't want it in the fast path for locking glocks. The limit was entirely arbitrary anyway and we can't expect users to adjust things like this, the remaining code will do the right thing on its own. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:37 +00:00
Steven Whitehouse	60b0d08779	[GFS2] Remove unused variables These haven't been used for some time, remove them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:35 +00:00
Steven Whitehouse	47e83b5091	[GFS2] Use correct include file in ops_address.c Something changed in the upstream kernel, and it needs this one-liner to allow ops_address.c to build. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:32 +00:00
Steven Whitehouse	c41d4f09f1	[GFS2] Don't hold page lock when starting transaction This is an addendum to the new AOPs work which moves the point at which we take the page lock so that we don't get it until the last possible moment. This resolves a conflict between starting transactions and the page lock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:30 +00:00
Steven Whitehouse	b8e7cbb65b	[GFS2] Add writepages for GFS2 jdata This patch resolves a lock ordering issue where we had been getting a transaction lock in the wrong order with respect to the page lock. By using writepages rather than just writepage, it is then possible to start a transaction before locking the page, and thus matching the locking order elsewhere in the code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:28 +00:00
Steven Whitehouse	9ff8ec32e5	[GFS2] Split gfs2_writepage into three cases This patch splits gfs2_writepage into separate functions for each of the three cases: writeback, ordered and journalled. As a result it becomes a lot easier to see what each one is doing. The common code is moved into gfs2_writepage_common. This fixes a performance bug where we were doing more work than strictly required in the ordered write case. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:25 +00:00
Steven Whitehouse	5561093e2c	[GFS2] Introduce gfs2_set_aops() Just like ext3 we now have three sets of address space operations to cover the cases of writeback, ordered and journalled data writes. This means that the individual operations can now become less complicated as we are able to remove some of the tests for file data mode from the code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:23 +00:00
Steven Whitehouse	bf36a71316	[GFS2] Add gfs2_is_writeback() This adds a function "gfs2_is_writeback()" along the lines of the existing "gfs2_is_jdata()" in order to clean up the code and make the various tests for the inode mode more obvious. It also fixes the PageChecked() logic where we were resetting the flag too early in the case of an error path. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:21 +00:00
Steven Whitehouse	e7e36f1435	[GFS2] Remove unused field in struct gfs2_inode Removes a field that is not used. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:18 +00:00
Steven Whitehouse	f91a0d3e24	[GFS2] Remove useless i_cache from inodes The i_cache was designed to keep references to the indirect blocks used during block mapping so that they didn't have to be looked up continually. The idea failed because there are too many places where the i_cache needs to be freed, and this has in the past been the cause of many bugs. In addition there was no performance benefit being gained since the disk blocks in question were cached anyway. So this patch removes it in order to simplify the code to prepare for other changes which would otherwise have had to add further support for this feature. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:16 +00:00
Steven Whitehouse	3cc3f710ce	[GFS2] Use ->page_mkwrite() for mmap() This cleans up the mmap() code path for GFS2 by implementing the page_mkwrite function for GFS2. We are thus able to use the generic filemap_fault function for our ->fault() implementation. This now means that shared writable mappings will be much more efficiently shared across the cluster if there is a reasonable proportion of read activity (the greater proportion, the better). As a side effect, it also reduces the size of the code, removes special cases from readpage and readpages, and makes the code path easier to follow. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:13 +00:00
Steven Whitehouse	51ff87bdd9	[GFS2] Clean up internal read function As requested by Christoph, this patch cleans up GFS2's internal read function so that it no longer uses the do_generic_mapping_read function. This function is obsolete and GFS2 is the last user of it. As a side effect the internal read code gets smaller and easier to read and gfs2_readpage is split into two. One function has the locking and the other function has the rest of the logic. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>	2008-01-25 08:07:11 +00:00
Wendy Cheng	cc7e79b168	[GFS2] Handle multiple glock demote requests Fix a race condition where multiple glock demote requests are sent to a node back-to-back. This patch does a check inside handle_callback() to see whether a demote request is in progress. If true, it sets a flag to make sure run_queue() will loop again to handle the new request, instead of erroneously setting gl_demote_state to a different state. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-01-25 08:07:09 +00:00
Greg Kroah-Hartman	197b12d679	Kobject: convert fs/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	901195ed7f	Kobject: change GFS2 to use kobject_init_and_add Stop using kobject_register, as this way we can control the sending of the uevent properly, after everything is properly initialized. Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:26 -08:00
Greg Kroah-Hartman	0ff21e4663	kobject: convert kernel_kset to be a kobject kernel_kset does not need to be a kset, but a much simpler kobject now that we have kobj_attributes. We also rename kernel_kset to kernel_kobj to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	bd35b93d80	kset: convert kernel_subsys to use kset_create Dynamically create the kset instead of declaring it statically. We also rename kernel_subsys to kernel_kset to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	136a27507f	kset: convert gfs2 dlm to use kset_create Dynamically create the kset instead of declaring it statically. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	9bec101a0c	kset: convert gfs2 to use kset_create Dynamically create the kset instead of declaring it statically. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	00d2666623	kobject: convert main fs kobject to use kobject_create This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumptions that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Christoph Hellwig	3965516440	exportfs: make struct export_operations const Now that nfsd has stopped writing to the find_exported_dentry member we can mark the export_operations const. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	34c0d15424	gfs2: new export ops Convert gfs2 to the new ops. Uses a similar structure to the generic helpers, but gfs2 has its own file handle formats. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Alan Cox	a9c62a18a2	fs: correct SuS compliance for open of large file without options The early LFS work that Linux uses favours EFBIG in various places. SuSv3 specifically uses EOVERFLOW for this as noted by Michael (Bug 7253) [EOVERFLOW] The named file is a regular file and the size of the file cannot be represented correctly in an object of type off_t. We should therefore transition to the proper error return code Signed-off-by: Alan Cox <alan@redhat.com> Cc: Theodore Tso <tytso@mit.edu> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Steven Whitehouse	7765ec26ae	gfs2: convert to new aops Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:55 -07:00
Linus Torvalds	541010e4b8	Merge branch 'locks' of git://linux-nfs.org/~bfields/linux * 'locks' of git://linux-nfs.org/~bfields/linux: nfsd: remove IS_ISMNDLCK macro Rework /proc/locks via seq_files and seq_list helpers fs/locks.c: use list_for_each_entry() instead of list_for_each() NFS: clean up explicit check for mandatory locks AFS: clean up explicit check for mandatory locks 9PFS: clean up explicit check for mandatory locks GFS2: clean up explicit check for mandatory locks Cleanup macros for distinguishing mandatory locks Documentation: move locks.txt in filesystems/ locks: add warning about mandatory locking races Documentation: move mandatory locking documentation to filesystems/ locks: Fix potential OOPS in generic_setlease() Use list_first_entry in locks_wake_up_blocks locks: fix flock_lock_file() comment Memory shortage can result in inconsistent flocks state locks: kill redundant local variable locks: reverse order of posix_locks_conflict() arguments	2007-10-15 16:07:40 -07:00
Linus Torvalds	efefc6eb38	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits) PM: merge device power-management source files sysfs: add copyrights kobject: update the copyrights kset: add some kerneldoc to help describe what these strange things are Driver core: rename ktype_edd and ktype_efivar Driver core: rename ktype_driver Driver core: rename ktype_device Driver core: rename ktype_class driver core: remove subsystem_init() sysfs: move sysfs file poll implementation to sysfs_open_dirent sysfs: implement sysfs_open_dirent sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir sysfs: make sysfs_root a regular directory dirent sysfs: open code sysfs_attach_dentry() sysfs: make s_elem an anonymous union sysfs: make bin attr open get active reference of parent too sysfs: kill unnecessary NULL pointer check in sysfs_release() sysfs: kill unnecessary sysfs_get() in open paths sysfs: reposition sysfs_dirent->s_mode. sysfs: kill sysfs_update_file() ...	2007-10-12 15:49:37 -07:00
Greg Kroah-Hartman	34980ca8fa	Drivers: clean up direct setting of the name of a kset A kset should not have its name set directly, so dynamically set the name at runtime. This is needed to remove the static array in the kobject structure which will be changed in a future patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-12 14:51:02 -07:00
Linus Torvalds	f26e51f67a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (51 commits) [DLM] block dlm_recv in recovery transition [DLM] don't overwrite castparam if it's NULL [GFS2] Get superblock a different way [GFS2] Don't try to remove buffers that don't exist [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed [GFS2] Data corruption fix [GFS2] Clean up journaled data writing [GFS2] GFS2: chmod hung - fix race in thread creation [DLM] Make dlm_sendd cond_resched more [GFS2] Move inode deletion out of blocking_cb [GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! [GFS2] Clean up gfs2_trans_add_revoke() [GFS2] Use slab operations for all gfs2_bufdata allocations [GFS2] Replace revoke structure with bufdata structure [GFS2] Fix ordering of dirty/journal for ordered buffer unstuffing [GFS2] Clean up ordered write code [GFS2] Move pin/unpin into lops.c, clean up locking [GFS2] Don't mark jdata dirty in gfs2_unstuffer_page() [GFS2] Introduce gfs2_remove_from_ail [GFS2] Correct lock ordering in unlink ...	2007-10-12 09:14:51 -07:00
Al Viro	782e3b3b38	Fix up more bio fallout Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-12 00:29:50 -07:00
Steven Whitehouse	5a60c532c9	[GFS2] Get superblock a different way The mapping may be NULL by the time the I/O has completed, so we now get the superblock by a different route (via the bd and glock) to avoid this problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-10-10 08:56:34 +01:00
Steven Whitehouse	891ba6d4a5	[GFS2] Don't try to remove buffers that don't exist Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:31 +01:00
Benjamin Marzinski	7a9f53b3c1	[GFS2] Alternate gfs2_iget to avoid looking up inodes being freed There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. Process A does an iput() on inode X, and its i_count drops to 0. This causes iput_final() to be called, which puts an inode into state I_FREEING at generic_delete_inode(). There is no point between when iput_final() is called and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is set, no other process on that node can successfully look up that inode until the delete finishes. Process B locks the resource group for the same inode in get_local_rgrp(), which is called by gfs2_inplace_reserve_i(). Process A tries to lock the resource group for the inode in gfs2_dinode_dealloc(), but it's already locked by process B. Process B waits in find_inode for the inode to have the I_FREEING state cleared. Deadlock. This patch solves the problem by adding an alternative to gfs2_iget(), gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING state. The alternate test function is just like the original one, except that it fails if the inode is being freed, and sets a skipped flag. The alternate set function is just like the original, except that it fails if the skipped flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of gfs2_iget(). Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:29 +01:00
Wendy Cheng	de986e859a	[GFS2] Data corruption fix * GFS2 has been using i_cache array to store its indirect meta blocks. Its flush routine doesn't correctly clean up all the entries. The problem shows up when multiple nodes do simultaneous writes to the same file. Upon glock exclusive lock transfer, if the file is a sparse file with large file size where the indirect meta blocks span multiple array entries with "zero" entries in between, the flush routine prematurely stops the flushing, which leaves old (stale) entries around. This leads to several nasty issues, including data corruption. * Fix gfs2_get_block_noalloc checking to correctly return EIO upon unmapped buffer. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:26 +01:00
Steven Whitehouse	16615be18c	[GFS2] Clean up journaled data writing This patch cleans up the code for writing journaled data into the log. It also removes the need to allocate a small "tag" structure for each block written into the log. Instead we just keep count of the outstanding I/O so that we can be sure that it's all been written at the correct time. Another result of this patch is that a number of ll_rw_block() calls have become submit_bh() calls, closing some races at the same time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:24 +01:00
Bob Peterson	55c0c4ac0b	[GFS2] GFS2: chmod hung - fix race in thread creation The problem boiled down to a race between the gdlm_init_threads() function initializing thread1 and its setting of blist = 1. Essentially, "if (current == ls->thread1)" was checked by the thread before the thread creator set ls->thread1. Since thread1 is the only thread that is allowed to work on the blocking queue, and since neither thread thought it was thread1, no one was working on the queue. So everything just sat. This patch reuses the ls->async_lock spin_lock to fix the race, and it fixes the problem. I've done more than 2000 iterations of the loop that was recreating the failure and it seems to work. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:22 +01:00
Wendy Cheng	49e61f2ef6	[GFS2] Move inode deletion out of blocking_cb Move inode deletion code out of blocking_cb handle_callback route to avoid racy conditions that end up blocking lock_dlm1 thread. Fix bugzilla 286821. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:17 +01:00
Abhijith Das	b4c20166dc	[GFS2] flocks from same process trip kernel BUG at fs/gfs2/glock.c:1118! This patch adds a new flag to the gfs2_holder structure GL_FLOCK. It is set on holders of glocks representing flocks. This flag is checked in add_to_queue() and a process is permitted to queue more than one holder onto a glock if it is set. This solves the issue of a process not being able to do multiple flocks on the same file. Through a single descriptor, a process can now promote and demote flocks. Through multiple descriptors a process can now queue multiple flocks on the same file. There's still the problem of a process deadlocking itself (because gfs2 blocking locks are not interruptible) by queueing incompatible locks. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:14 +01:00
Steven Whitehouse	1ad38c437f	[GFS2] Clean up gfs2_trans_add_revoke() The following alters gfs2_trans_add_revoke() to take a struct gfs2_bufdata as an argument. This eliminates the memory allocation which was previously required by making use of the already existing struct gfs2_bufdata. It makes some sanity checks to ensure that the gfs2_bufdata has been removed from all the lists before it's recycled as a revoke structure. This saves one memory allocation and one free per revoke structure. Also as a result, and to simplify the locking, since there is no longer any blocking code in gfs2_trans_add_revoke() we must hold the log lock whenever this function is called. This reduces the number of times we take and unlock the log lock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:12 +01:00
Steven Whitehouse	0820ab517e	[GFS2] Use slab operations for all gfs2_bufdata allocations The old revoke structure was allocated using kmalloc/kfree but there is a slab cache for gfs2_bufdata, so we should use that now that the structures have been converted. This is part two of the patch series to merge the revoke and gfs2_bufdata structures. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:10 +01:00
Steven Whitehouse	82e86087bb	[GFS2] Replace revoke structure with bufdata structure Both the revoke structure and the bufdata structure are quite similar. They are basically small tags which are put on lists. In addition to which the revoke structure is always allocated when there is a bufdata structure which is (or can be) freed. As such it should be possible to reduce the number of frees and allocations by using the same structure for both purposes. This patch is the first step along that path. It replaces existing uses of the revoke structure with the bufdata structure. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:07 +01:00
Bob Peterson	8475487bef	[GFS2] Fix ordering of dirty/journal for ordered buffer unstuffing Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:05 +01:00
Steven Whitehouse	d7b616e252	[GFS2] Clean up ordered write code The following patch removes the ordered write processing from databuf_lo_before_commit() and moves it to log.c. This has the effect of greatly simplifying databuf_lo_before_commit() as well as potentially making the ordered write code more efficient. As a side effect of this, it's now possible to remove ordered buffers from the ordered buffer list at any time, so we now make use of this in invalidatepage and releasepage to ensure timely release of these buffers. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:03 +01:00
Steven Whitehouse	9b9107a5a8	[GFS2] Move pin/unpin into lops.c, clean up locking gfs2_pin and gfs2_unpin are only used in lops.c, despite being defined in meta_io.c, so this patch moves them into lops.c and makes them static. At the same time, it's possible to clean up the locking in the buf and databuf _lo_add() functions so that we only need to grab the spinlock once. Also we have to move lock_buffer() around the _lo_add() functions since we can't do that in gfs2_pin() any more since we hold the spinlock for the duration of that function. As a result, the code shrinks by 12 lines and we do far fewer operations when adding buffers to the log. It also makes the code somewhat easier to read & understand. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:00 +01:00
Steven Whitehouse	eaf965270f	[GFS2] Don't mark jdata dirty in gfs2_unstuffer_page() Journaled data is marked dirty by gfs2_unpin and should not be marked dirty here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:58 +01:00
Steven Whitehouse	1e1a3d03e9	[GFS2] Introduce gfs2_remove_from_ail This collects together the operations required to remove a gfs2_bufdata from the ail lists. It's only called from two places to start with, but expect to see more of this function in future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:55 +01:00
Steven Whitehouse	8497a46e17	[GFS2] Correct lock ordering in unlink This patch corrects the lock ordering in unlink to be the same as that in the rest of GFS2, i.e. parent -> child -> rgrp. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:53 +01:00
Wendy Cheng	e9bd2b3baf	[GFS2] fix inode meta data corruption Fix a nasty inode meta data corruption issue by keeping the buffer head in icache array. This buffer needs to stay in memory until journal flush occurs. Otherwise, gfs2_meta_inode_buffer could do a disk read before the inode hits disk. It ends up with meta data corruptions. The buffer will be released as part of the existing journal flush logic. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:51 +01:00
Benjamin Marzinski	c4f68a130f	[GFS2] delay glock demote for a minimum hold time When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in a cluster, it will deadlock. The reason is that do_no_page() will repeatedly call gfs2_sharewrite_nopage(), because each node keeps giving up the glock too early, and is forced to call unmap_mapping_range(). This bumps the mapping->truncate_count sequence count, forcing do_no_page() to retry. This patch institutes a minimum glock hold time of a tenth of a second. This ensures that even in heavy contention cases, the node has enough time to get some useful work done before it gives up the glock. A second issue is that when gfs2_glock_dq() is called from within a page fault to demote a lock, and the associated page needs to be written out, it will try to acquire a lock on it, but it has already been locked at a higher level. This patch makes gfs2_glock_dq() use the work queue as well, to avoid this issue. This is the same patch as Steve Whitehouse originally proposed to fix this issue, except that gfs2_glock_dq() now grabs a reference to the glock before it queues up the work on it. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:48 +01:00
Abhijith Das	d1e2777d4f	[GFS2] panic after can't parse mount arguments When you try to mount gfs2 with -o garbage, the mount fails and the gfs2 superblock is deallocated and becomes NULL. The vfs comes around later on and calls gfs2_kill_sb. At this point the hidden gfs2 superblock pointer (sb->s_fs_info) is NULL and dereferencing it through gfs2_meta_syncfs causes the panic. (the other function call to gfs2_delete_debugfs_file() succeeds because this function already checks for a NULL pointer) Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:46 +01:00
Bob Peterson	ec217e0ece	[GFS2] Patch to protect sd_log_num_jdata This is a patch to GFS2 to protect sd_log_num_jdata with the gfs2_log_lock. Without this patch, there is a timing window where you can hit the following assert from function gfs2_log_flush(): gfs2_assert_withdraw(sdp, sdp->sd_log_num_buf + sdp->sd_log_num_jdata == sdp->sd_log_commited_buf + sdp->sd_log_commited_databuf); I've tested it on my roth cluster and it fixes the problem. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:43 +01:00
Abhijith Das	a947e03356	[GFS2] Wendy's dump lockname in hex & fix glock dump With this patch, gfs2 glockdump through the debugfs filesystem will only dump glocks for the specified filesystem instead of all glocks. Also, to aid debugging, the glock number is dumped in hex instead of decimal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Abhijith Das <adas@redhat.com>	2007-10-10 08:55:41 +01:00
Wendy Cheng	a13b8c5f23	[GFS2] Reduce truncate IO traffic Current GFS2 setattr call unconditionally invokes do_shrink even when the requested size and actual file size are equal. This has generated a large number of extra IOs found during NFS benchmark runs. This patch moves the relevant logic out of shrink code path. Since setattr is a system call, the time stamps update is still required. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:36 +01:00
Benjamin Marzinski	9a5ad13856	[GFS2] Add NULL entry to token table match_token() was returning garbage data instead of a fail value. This data happened to match a valid option id for an option that required an argument (in this case, lockproto=%s). For match_token() to correctly fail if the option doesn't match any of the tokens, the token table must end with a NULL entry. This patch adds the NULL entry. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:34 +01:00
Steven Whitehouse	382e6e256b	[GFS2] Add a missing gfs2_trans_add_bh() This was missing from the dir_split_leaf() function although in most cases it's not a problem due to other functions having already previously called gfs2_trans_add_bh. This makes certain that it is correct. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-10-10 08:55:32 +01:00
Steven Whitehouse	bb3b0e3df5	[GFS2] Clean up invalidatepage/releasepage This patch fixes some bugs relating to journaled data files by cleaning up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now never block during gfs2_releasepage(), instead we always either release or refuse to release depending on the status of the buffers. This fixes Red Hat bugzillas #248969 and #252392. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>	2007-10-10 08:55:29 +01:00
Abhijith Das	2d9a4bbf6d	[GFS2] Fix quota do_list operation hang This is the filesystem part of the patches to fix this bz. There are additional userland patches (gfs2_quota, libgfs2) for the complete solution. This patch adds a new field qu_ll_next to the gfs2_quota structure. This field allows us to create linked lists of quotas in the ondisk quota inode. Instead of scanning through the entire sparse quota file for valid quotas, we can now simply walk through the user and group quota linked lists to perform the do_list operation. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:27 +01:00
Denis Cheng	34eaae398e	[GFS2] fixed a NULL pointer assignment BUG Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:24 +01:00
Abhijith Das	0fd5355470	[GFS2] Force unstuff of hidden quota inode This patch forcibly unstuffs (if stuffed) the hidden quota inode at the first available opportunity. In any practical scenario the quota inode won't be stuffed, so this is ok to do. Unstuffing the quota inode allows us to ignore the case of a stuffed quota inode in gfs2_adjust_quota(). Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:22 +01:00
Denis Cheng	5d35e31f43	[GFS2] better code for translating characters the original code could work, but I think this code could work better. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:20 +01:00
Denis Cheng	2d3ba1ea97	[GFS2] unneeded typecast sb->s_fs_info is a void pointer, thus the type cast is not needed. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:17 +01:00
Denis Cheng	adb4ec13cd	[GFS2] use list_for_each_entry instead Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:15 +01:00
Bob Peterson	75be73a824	[GFS2] Ensure journal file cache is flushed after recovery This is for bugzilla bug #248176: GFS2: invalid metadata block Patches 1 thru 3 were accepted upstream, but there were problems with 4 and 5. Those issues have been resolved and now the recovery tests are passing without errors. This code has gone through 41 * 3 successful gfs2 recovery tests before it hit an unrelated (openais) problem. I'm continuing to test it. This is a complete rewrite of patch 5 for bug #248176, written by Steve Whitehouse. This is referred to in the bugzilla record as "new 6" and "a different solution". The problem was that the journal inodes, although protected by a glock, were not synched with the other nodes because they don't use the inode glock synch operations (i.e. no "glops" were defined). Therefore, journal recovery on a journal-recovering node were causing the blocks to get out of sync with the node that was actually trying to use that journal as it comes back up from a reboot. There are two possible solutions: (1) To make the journals use the normal inode glock sync operations, or (2) To make the journal operations take effect immediately (i.e. no caching). Although option 1 works, it turns out to be a lot more code. Steve opted for option 2, which is much simpler and therefore less prone to regression errors. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> --	2007-10-10 08:55:12 +01:00
Bob Peterson	5f3eae7546	[GFS2] invalid metadata block - REVISED This is for bugzilla bug #248176: GFS2: invalid metadata block Patches 1 thru 3 were accepted upstream, but there were problems with 4 and 5. Those issues have been resolved and now the recovery tests are passing without errors. This code has gone through 41 * 3 successful gfs2 recovery tests before it hit an unrelated (openais) problem. This is a complete rewrite of patch 4 for bug #248176. Part of the problem was that inodes were being recycled before their buffers were flushed to the journal logs. Another problem was that the clone bitmaps were being searched for deleted inodes to recycle, but only the "real" bitmaps should be searched for that purpose. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:10 +01:00
Steven Whitehouse	8fbbfd214c	[GFS2] Reduce number of gfs2_scand processes to one We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:08 +01:00
Denis Cheng	ca5a939b33	[GFS2] use the declaration of gfs2_dops in the header file instead Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:05 +01:00
Denis Cheng	4ef290025c	[GFS2] mark struct _operations const these struct _operations are all method tables, thus should be const. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:03 +01:00
Bob Peterson	0f8468c8be	[GFS2] Detach buf data during in-place writeback This is patch 5 of 5 for bug #248176 Metadata corruption was occurring because page references weren't being removed in all cases. I previously added a function called detach_bufdata, but I discovered there already WAS a function out there to do the job. It's called gfs2_meta_cache_flush. So I added a call to that to remove the page references. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:55:01 +01:00
Denis Cheng	cee23c79d0	[GFS2] use an temp variable to reduce a spin_unlock this is more clear. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:58 +01:00
Bob Peterson	6760bdcd03	[GFS2] Prevent infinite loop in try_rgrp_unlink() This is patch three of five for bug #248176. The try_rgrp_unlink code in rgrp.c had an infinite loop. This was caused because the bitmap function rgblk_search can return a block less than the "goal" block, in which case it was looping. The fix is to make it always march forward as needed. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:56 +01:00
Bob Peterson	693ddeabbb	[GFS2] Revert part of earlier log.c changes This is patch 2 of 5 for bug #248176. The list_move code previously concocted in log.c for bug #238162 (see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=238162#c23) never runs as bh can now never be NULL at this point. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:53 +01:00
Bob Peterson	905d2aefa9	[GFS2] Move some code inside the log lock This is the first of five patches for bug #248176: There were still some critical variables being manipulated outside the log_lock spinlock. That usually resulted in a hang. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:51 +01:00
Steven Whitehouse	7b08fc6201	[GFS2] Fix an oops in glock dumping This fixes an oops which was occurring during glock dumping due to the seq file code not taking a reference to the glock. Also this fixes a memory leak which occurred in certain cases, in turn preventing the filesystem from unmounting. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:49 +01:00
Steve French	afd0942d98	[GFS2] GFS2 not checking pointer on create when running under nfsd When looking at an unrelated problem, I noticed that nfsd does not set nameidata pointer on create (ie nd is NULL). This should cause an oops in some cases in which when NFSd is mounted over GFS2. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:46 +01:00
Jesper Juhl	aa0481e58a	[GFS2] Clean up duplicate includes in fs/gfs2/ This patch cleans up duplicate includes in fs/gfs2/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:44 +01:00
Josef Whiter	26caee5bc6	[GFS2] Fix calculation of demote state If a glock is in the exclusive state and a request for demote to deferred has been received, then further requests for demote to shared are being ignored. This patch fixes that by ensuring that we demote to unlocked in that case. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:42 +01:00
Steven Whitehouse	87124e581b	[GFS2] Fix two races relating to glock callbacks One of the races relates to referencing a variable while not holding its protecting spinlock. The patch simply moves the test inside the spin lock. The other races occurs when a demote to unlocked request occurs during the time a demote to shared request is already running. This of course only happens in the case that the lock was in the exclusive mode to start with. The patch adds a check to see if another demote request has occurred in the mean time and if it has, then it performs a second demote. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:54:39 +01:00
NeilBrown	6712ecf8f6	Drop 'size' argument from bio_endio and bi_end_io As bi_end_io is only called once when the request is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
Pavel Emelyanov	7afaac6202	GFS2: clean up explicit check for mandatory locks The __mandatory_lock(inode) function makes the same check, but makes the code more readable. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-09 18:32:46 -04:00
Steven Whitehouse	d18c4d687d	[GFS2] Revert remounting w/o acl option leaves acls enabled This reverts commit `569a7b6c2e`. The code was correct originally. The default setting for ACLs after a remount should be to be the same as before the remount. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:34:40 +01:00
Steven Whitehouse	b9af7ca6d3	[GFS2] Fix setting of inherit jdata attr Due to a mix up between the jdata attribute and inherit jdata attribute it has not been possible to set the inherit jdata attribute on directories. This is now fixed and the ioctl will report the inherit jdata attribute for directories rather than the jdata attribute as it did previously. This stems from our need to have the one bit in the ioctl attr flags mean two different things according to whether the underlying inode is a directory or not. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:34:11 +01:00
Steven Whitehouse	a867bb28c1	[GFS2] Fix incorrect error path in prepare_write() The error path in prepare_write() was incorrect in the (very rare) event that the transaction fails to start. The following prevents a NULL pointer dereference, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:33:44 +01:00
Steven Whitehouse	6eefaf61f6	[GFS2] Fix incorrect return code in rgrp.c The following patch fixes a bug where 0 was being used as a return code to indicate "nothing to do" when in fact 0 was a valid block location which might be returned by the function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:33:15 +01:00
Bob Peterson	24c7387333	[GFS2] soft lockup in rgblk_search This patch seems to fix the problem described in bugzilla bug 246114. It was written by Steve Whitehouse with some tweaking by me. The code was looping in the relatively new section of code designed to search for and reuse unlinked inodes. In cases where it was finding an appropriate inode to reuse, it was looping around and finding the same block over and over because a "<=" check should have been a "<" when comparing the goal block to the last unlinked block found. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:32:43 +01:00
Bob Peterson	bdcb88562c	[GFS2] soft lockup detected in databuf_lo_before_commit This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree. The problem was that sdp->sd_log_num_databuf was not always being protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf (which it is supposed to reflect) was protected. That meant there was a timing window during which gfs2_log_flush called databuf_lo_before_commit and the count didn't match what was really on the linked list in that window. So when it ran out of items on the linked list, it decremented total_dbuf from 0 to -1 and thus never left the "while(total_dbuf)" loop. The solution is to protect the variable sdp->sd_log_num_databuf so that the value will always match the contents of the linked list, and therefore the number will never go negative, and therefore, the loop will be exited properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:32:04 +01:00
Christoph Hellwig	0af1a45046	rename setlease to generic_setlease Make it a little more clear that this is the default implementation for the setlease operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:43 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Nick Piggin	83c54070ee	mm: fault feedback #2 This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. 
Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	d0217ac04c	mm: fault feedback #1 Change ->fault prototype. We now return an int, which contains VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte. FAULT_RET_ code tells the VM whether a page was found, whether it has been locked, and potentially other things. This is not quite the way he wanted it yet, but that's changed in the next patch (which requires changes to arch code). This means we no longer set VM_CAN_INVALIDATE in the vma in order to say that a page is locked which requires filemap_nopage to go away (because we can no longer remain backward compatible without that flag), but we were going to do that anyway. struct fault_data is renamed to struct vm_fault as Linus asked. address is now a void __user * that we should firmly encourage drivers not to use without really good reason. The page is now returned via a page pointer in the vm_fault struct. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	54cb8821de	mm: merge populate and nopage into fault (fixes nonlinear) Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	d00806b183	mm: fix fault vs invalidate race for linear mappings Fix the race between invalidate_inode_pages and do_no_page. Andrea Arcangeli identified a subtle race between invalidation of pages from pagecache with userspace mappings, and do_no_page. The issue is that invalidation has to shoot down all mappings to the page, before it can be discarded from the pagecache. Between shooting down ptes to a particular page, and actually dropping the struct page from the pagecache, do_no_page from any process might fault on that page and establish a new mapping to the page just before it gets discarded from the pagecache. The most common case where such invalidation is used is in file truncation. This case was catered for by doing a sort of open-coded seqlock between the file's i_size, and its truncate_count. Truncation will decrease i_size, then increment truncate_count before unmapping userspace pages; do_no_page will read truncate_count, then find the page if it is within i_size, and then check truncate_count under the page table lock and back out and retry if it had subsequently been changed (ptl will serialise against unmapping, and ensure a potentially updated truncate_count is actually visible). Complexity and documentation issues aside, the locking protocol fails in the case where we would like to invalidate pagecache inside i_size. do_no_page can come in anytime and filemap_nopage is not aware of the invalidation in progress (as it is when it is outside i_size). The end result is that dangling (->mapping == NULL) pages that appear to be from a particular file may be mapped into userspace with nonsense data. Valid mappings to the same place will see a different page. Andrea implemented two working fixes, one using a real seqlock, another using a page->flags bit. He also proposed using the page lock in do_no_page, but that was initially considered too heavyweight. 
However, it is not a global or per-file lock, and the page cacheline is modified in do_no_page to increment _count and _mapcount anyway, so a further modification should not be a large performance hit. Scalability is not an issue. This patch implements this latter approach. ->nopage implementations return with the page locked if it is possible for their underlying file to be invalidated (in that case, they must set a special vm_flags bit to indicate so). do_no_page only unlocks the page after setting up the mapping completely. invalidation is excluded because it holds the page lock during invalidation of each page (and ensures that the page is not mapped while holding the lock). This also allows significant simplifications in do_no_page, because we have the page locked in the right place in the pagecache from the start. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Marc Eshel	60446067ba	gfs2: stop giving out non-cluster-coherent leases Since gfs2 can't prevent conflicting opens or leases on other nodes, we probably shouldn't allow it to give out leases at all. Put the newly defined lease operation into use in gfs2 by turning off lease, unless we're using the "nolock" locking module (in which case all locking is local anyway). Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Steven Whitehouse <swhiteho@redhat.com>	2007-07-18 19:17:19 -04:00
Satyam Sharma	3bd858ab1c	Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 12:00:03 -07:00
Christoph Hellwig	a569425512	knfsd: exportfs: add exportfs.h header currently the export_operation structure and helpers related to it are in fs.h. fs.h is already far too large and there are very few places needing the export bits, so split them off into a separate header. [akpm@linux-foundation.org: fix cifs build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:06 -07:00
Alexey Dobriyan	aa0ac36518	Remove capability.h from mm.h I forgot to remove capability.h from mm.h while removing sched.h! This patch remedies that, because the only inline function which was using CAP_something was made out of line. Cross-compile tested without regressions on: all powerpc defconfigs all mips defconfigs all m68k defconfigs all arm defconfigs all ia64 defconfigs alpha alpha-allnoconfig alpha-defconfig alpha-up arm i386 i386-allnoconfig i386-defconfig i386-up ia64 ia64-allnoconfig ia64-defconfig ia64-up m68k mips parisc parisc-allnoconfig parisc-defconfig parisc-up powerpc powerpc-up s390 s390-allnoconfig s390-defconfig s390-up sparc sparc-allnoconfig sparc-defconfig sparc-up sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up um-x86_64 x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:45 -07:00
Linus Torvalds	1b21f458dd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits) [GFS2] Accept old format NFS filehandles [GFS2] Small fixes to logging code [DLM] dump more lock values [GFS2] Remove i_mode passing from NFS File Handle [GFS2] Obtaining no_formal_ino from directory entry [GFS2] git-gfs2-nmw-build-fix [GFS2] System won't suspend with GFS2 file system mounted [GFS2] remounting w/o acl option leaves acls enabled [GFS2] inode size inconsistency [DLM] Telnet to port 21064 can stop all lockspaces [GFS2] Fix gfs2_block_truncate_page err return [GFS2] Addendum to the journaled file/unmount patch [GFS2] Simplify multiple glock aquisition [GFS2] assertion failure after writing to journaled file, umount [GFS2] Use zero_user_page() in stuffed_readpage() [GFS2] Remove bogus '\0' in rgrp.c [GFS2] Journaled file write/unstuff bug [DLM] don't require FS flag on all nodes [GFS2] Fix deallocation issues [GFS2] return conflicts for GETLK ...	2007-07-10 13:56:13 -07:00
Steven Whitehouse	3ebf44902f	[GFS2] Accept old format NFS filehandles On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote: > > -#define GFS2_LARGE_FH_SIZE 10 > > - > > -struct gfs2_fh_obj { > > - struct gfs2_inum_host this; > > - u32 imode; > > -}; > > +#define GFS2_LARGE_FH_SIZE 8 > > Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE > or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older > gfs version anymore. Stale filehandles because of a new kernel version > are a big no-no, so please add back code to handle the old filehandles > on the decode side. > This should fix that problem I think since its only relating to end of the fh we can just ignore that field in order to accept the older format. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wendy Cheng <wcheng@redhat.com>	2007-07-10 12:28:27 +01:00
Jens Axboe	5ffc4ef45b	sendfile: remove .sendfile from filesystems that use generic_file_sendfile() They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:13 +02:00
Steven Whitehouse	a0a24741ca	[GFS2] Small fixes to logging code This reverts part of an earlier patch which tried to reclaim gfs2_bufdata structures too early and resulted in a "use after free" case (this bit from me). Also a change to not write out log headers unless we really need to (in the case of flushing nothing we don't need a header) from Bob. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2007-07-09 15:43:07 +01:00
Wendy Cheng	35dcc52e3a	[GFS2] Remove i_mode passing from NFS File Handle GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:11 +01:00
Wendy Cheng	bb9bcf0616	[GFS2] Obtaining no_formal_ino from directory entry GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:08 +01:00
Abhijith Das	b365762924	[GFS2] System won't suspend with GFS2 file system mounted The kernel threads in gfs2, namely gfs2_scand, gfs2_logd, gfs2_quotad, gfs2_glockd, gfs2_recoverd weren't doing anything when the suspend mechanism was trying to freeze them. I put in calls to refrigerator() in the loops for all the daemons and suspend works as expected. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:04 +01:00
Bob Peterson	569a7b6c2e	[GFS2] remounting w/o acl option leaves acls enabled This patch is for bugzilla bug #245663. This crosswrites a fix from gfs1 (bz #210369) so that the mount options are reset properly upon remount. This was tested on system trin-10. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:01 +01:00
Wendy Cheng	090ffaa55d	[GFS2] inode size inconsistency This should have been part of the NFS patch #1 but somehow I missed it when packaging the patches. It is not a critical issue as the others (I hope). RHEL 5.1 31.el5 kernel runs fine without this change. Our truncate code is chopped into two parts, one for vfs inode changes (in vmtruncate()) and one of gfs inode (in gfs2_truncatei()). These two operations are, unfortunately, not atomic. So it could happen that vmtruncate() succeeds (inode->i_size is changed) but gfs2_truncatei fails (say kernel temporarily out of memory). This would leave gfs inode i_di.di_size out of sync with vfs inode i_size. It will later confuse gfs2_commit_write() if a write is issued. Last time I checked, it will cause file corruption. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:59 +01:00
S. Wendy Cheng	1875f2f31b	[GFS2] Fix gfs2_block_truncate_page err return Code segment inside gfs2_block_truncate_page() doesn't set the return code correctly. This causes NFSD erroneously returns EIO back to client with setattr procedure call (truncate error). Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:54 +01:00
Robert Peterson	773ed1a044	[GFS2] Addendum to the journaled file/unmount patch This patch is an addendum to the previous journaled file/unmount patch. It fixes a problem discovered during testing. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:52 +01:00
Steven Whitehouse	eaf5bd3cac	[GFS2] Simplify multiple glock aquisition There is a bug in the code which acquires multiple glocks where if the initial out-of-order attempt fails part way though we can land up trying to acquire the wrong number of glocks. This is part of the fix for red hat bz #239737. The other part of the bz doesn't apply to upstream kernels since it was fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6 Since the out-of-order code doesn't appear to add anything to the performance of GFS2, this patch just removed it rather than trying to fix it. It should be much easier to see whats going on here now. In addition, we don't allocate any memory unless we are using a lot of glocks (which is a relatively uncommon case). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:50 +01:00
Robert Peterson	2332c4435b	[GFS2] assertion failure after writing to journaled file, umount This patch passes all my nasty tests that were causing the code to fail under one circumstance or another. Here is a complete summary of all changes from today's git tree, in order of appearance: 1. There are now separate variables for metadata buffer accounting. 2. Variable sd_log_num_hdrs is no longer needed, since the header accounting is taken care of by the reserve/refund sequence. 3. Fixed a tiny grammatical problem in a comment. 4. Added a new function "calc_reserved" to calculate the reserved log space. This isn't entirely necessary, but it has two benefits: First, it simplifies the gfs2_log_refund function greatly. Second, it allows for easier debugging because I could sprinkle the code with calls to this function to make sure the accounting is proper (by adding asserts and printks) at strategic points in the code. 5. In log_pull_tail there apparently was a kludge to fix up the accounting based on a "pull" parameter. The buffer accounting is now done properly, so the kludge was removed. 6. File sync operations were making a call to gfs2_log_flush that writes another journal header. Since that header was unplanned for (reserved) by the reserve/refund sequence, the free space had to be decremented so that when log_pull_tail gets called, the free space is adjusted properly. (Did I hear you call that a kludge? Well, maybe, but a lot more justifiable than the one I removed). 7. In the gfs2_log_shutdown code, it optionally syncs the log by specifying the PULL parameter to log_write_header. I'm not sure this is necessary anymore. It just seems to me there could be cases where shutdown is called while there are outstanding log buffers. 8. In the (data)buf_lo_before_commit functions, I changed some offset values from being calculated on the fly to being constants. That simplified some code and we might as well let the compiler do the calculation once rather than redoing those cycles at run time. 9. This version has my rewritten databuf_lo_add function. This version is much more like its predecessor, buf_lo_add, which makes it easier to understand. Again, this might not be necessary, but it seems as if this one works as well as the previous one, maybe even better, so I decided to leave it in. 10. In databuf_lo_before_commit, a previous data corruption problem was caused by going off the end of the buffer. The proper solution is to have the proper limit in place, rather than stopping earlier. (Thus my previous attempt to fix it is wrong). If you don't wrap the buffer, you're stopping too early and that causes more log buffer accounting problems. 11. In lops.h there are two new (previously mentioned) constants for figuring out the data offset for the journal buffers. 12. There are also two new functions, buf_limit and databuf_limit to calculate how many entries will fit in the buffer. 13. In function gfs2_meta_wipe, it needs to distinguish between pinned metadata buffers and journaled data buffers for proper journal buffer accounting. It can't use the JDATA gfs2_inode flag because it's sometimes passed the "real" inode and sometimes the "metadata inode" and the inode flags will be random bits in a metadata gfs2_inode. It needs to base its decision on which was passed in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:47 +01:00
Steven Whitehouse	2840501ac8	[GFS2] Use zero_user_page() in stuffed_readpage() As suggested by Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Robert P. J. Day <rpjday@mindspring.com>	2007-07-09 08:23:45 +01:00
Steven Whitehouse	c4201214cb	[GFS2] Remove bogus '\0' in rgrp.c Not sure how it slipped in, but we don't want it anyway. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:43 +01:00
Robert Peterson	8fb68595d5	[GFS2] Journaled file write/unstuff bug This patch is for bugzilla bug 283162, which uncovered a number of bugs pertaining to writing to files that have the journaled bit on. These bugs happen most often when writing to the meta_fs because the files are always journaled. So operations like gfs2_grow were particularly vulnerable, although many of the problems could be recreated with normal files after setting the journaled bit on. The problems fixed are: -GFS2 wasn't ever writing unstuffed journaled data blocks to their in-place location on disk. Now it does. -If you unmounted too quickly after doing IO to a journaled file, GFS2 was crashing because you would discard a buffer whose bufdata was still on the active items list. GFS2 now deals with this gracefully. -GFS2 was losing track of the bufdata for journaled data blocks, and it wasn't getting freed, causing an error when you tried to unmount the module. GFS2 now frees all the bufdata structures. -There was a memory corruption occurring because GFS2 wrote twice as many log entries for journaled buffers. -It was occasionally trying to write journal headers in buffers that weren't currently mapped. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:40 +01:00
Abhijith Das	d93cfa9884	[GFS2] Fix deallocation issues There were two issues during deallocation of unlinked inodes. The first was relating to the use of a "try" lock which in the case of the inode lock wasn't trying hard enough to deallocate in all circumstances (now changed to a normal glock) and in the case of the iopen lock didn't wait for the demotion of the shared lock before attempting to get the exclusive lock, and thereby sometimes (timing dependent) not completing the deallocation when it should have done. The second issue related to the lack of a way to invalidate dcache entries on remote nodes (now fixed by this patch) which meant that unlinks were taking a long time to return disk space to the fs. By adding some code to invalidate the dcache entries across the cluster for unlinked inodes, that is now fixed. This patch was written jointly by Abhijith Das and Steven Whitehouse. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:36 +01:00
David Teigland	a7a2ff8a95	[GFS2] return conflicts for GETLK We weren't returning the correct result when GETLK found a conflict, which is indicated by userspace passing back a 1. Signed-off-by: Abhijith Das <adas redhat com> Signed-off-by: David Teigland <teigland redhat com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:33 +01:00
David Teigland	d88101d4d8	[GFS2] set plock owner in GETLK info Set the owner field in the plock info sent to userspace for GETLK. Without this, gfs_controld won't correctly see when the GETLK from a process matches one of the process's existing locks. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:31 +01:00
akpm@linux-foundation.org	037bcbb756	[GFS2] gfs2_lookupi() uninitialised var fix fs/gfs2/inode.c: In function 'gfs2_lookupi': fs/gfs2/inode.c:392: warning: 'error' may be used uninitialized in this function Looks like a real bug to me. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:29 +01:00
Steven Whitehouse	c8cdf47937	[GFS2] Recovery for lost unlinked inodes Under certain circumstances it's possible (though rather unlikely) that inodes which were unlinked by one node while still open on another might get "lost" in the sense that they don't get deallocated if the node which held the inode open crashed before it was unlinked. This patch adds the recovery code which allows automatic deallocation of the inode if it's found during block allocation (the sensible time to look for such inodes since we are scanning the rgrp's bitmaps anyway at this time, so it adds no overhead to do this). Since the inode will have had its i_nlink set to zero, all we need to trigger recovery is a lookup and an iput(), and the normal deallocation code takes care of the rest. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:26 +01:00
Robert Peterson	b35997d448	[GFS2] Can't mount GFS2 file system on AoE device This patch fixes bug 243131: Can't mount GFS2 file system on AoE device. When using AoE devices with lock_nolock, there is no locking table, so gfs2 (and gfs1) uses the superblock s_id. This turns out to be the device name in some cases. In the case of AoE, the device contains a slash, (e.g. "etherd/e1.1p2") which is an invalid character when we try to register the table in sysfs. This patch replaces the "/" with underscore. Rather than add a new variable to the stack, I'm just reusing a (char *) variable that's no longer used: table. This code has been tested on the failing system using a RHEL5 patch. The upstream code was tested by using gfs2_tool sb to interject a "/" into the table name of a clustered gfs2 file system. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:24 +01:00
Steven Whitehouse	e1cc86037b	[GFS2] Fix bug in error path of inode This fixes a bug in the ordering of operations in the error path of createi. It's not valid to do an iput() when holding the inode's glock since the iput() will (in this case) result in delete_inode() being called which needs to grab the lock itself. This was causing the recursive lock checking code to trigger. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:22 +01:00
Steven Whitehouse	ffed8ab342	[GFS2] Fix typo in rename of directories A typo caused us to pass a NULL pointer when renaming directories. It was accidentally introduced in: [GFS2] Clean up inode number handling Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:19 +01:00
Patrick Caulfield	44f487a553	[DLM] variable allocation Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace. This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL. (This updated version of the patch uses gfp_t for ls_allocation.) Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-Off-By: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:17 +01:00
Steven Whitehouse	4bd91ba181	[GFS2] Add nanosecond timestamp feature This adds a nanosecond timestamp feature to the GFS2 filesystem. Due to the way that the on-disk format works, older filesystems will just appear to have this field set to zero. When mounted by an older version of GFS2, the filesystem will simply ignore the extra fields so that it will again appear to have whole second resolution, so it's trivially backward compatible. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:12 +01:00
Steven Whitehouse	bb8d8a6f54	[GFS2] Fix sign problem in quota/statfs and cleanup _host structures This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:10 +01:00
Benjamin Marzinski	ddf4b426aa	[GFS2] fix jdata issues This is a patch for the first three issues of RHBZ #238162. The first issue is that when you allocate a new page for a file, it will not start off uptodate. This makes sense, since you haven't written anything to that part of the file yet. Unfortunately, gfs2_pin() checks to make sure that the buffers are uptodate. The solution to this is to mark the buffers uptodate in gfs2_commit_write(), after they have been zeroed out and have the data written into them. I'm pretty confident with this fix, although it's not completely obvious that there is no problem with marking the buffers uptodate here. The second issue is simply that you can try to pin a data buffer that is already on the incore log, and thus, already pinned. This patch checks to see if this buffer is already on the log, and exits databuf_lo_add() if it is, just like buf_lo_add() does. The third issue is that gfs2_log_flush() doesn't do its block accounting correctly. Both metadata and journaled data are logged, but gfs2_log_flush() only compares the number of metadata blocks with the number of blocks to commit to the ondisk journal. This patch also counts the journaled data blocks. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:08 +01:00
Steven Whitehouse	89918647a4	[GFS2] Make the log reserved blocks depend on block size The number of blocks which we reserve in the log at the start of each transaction needs to depend upon the block size since the overhead is related to the number of "pointers" which can be fitted into a single block. This relates to Red Hat bz #240435 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:03 +01:00
Abhijith Das	1990e91765	[GFS2] Quotas non-functional - fix another bug This patch fixes a bug where gfs2 was writing update quota usage information to the wrong location in the quota file. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:01 +01:00
Abhijith Das	2a87ab0806	[GFS2] Quotas non-functional - fix bug This patch fixes an error in the quota code where a 'struct gfs2_quota_lvb' was being passed to gfs2_adjust_quota() instead of a 'struct gfs2_quota_data'. Also moved 'struct gfs2_quota_lvb' from fs/gfs2/incore.h to include/linux/gfs2_ondisk.h as per Steve's suggestion. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:26 +01:00
Steven Whitehouse	dbb7cae2a3	[GFS2] Clean up inode number handling This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:24 +01:00
Steven Whitehouse	41d7db0ab4	[GFS2] Reduce size of struct gdlm_lock This patch removes the completion (which is rather large) from struct gdlm_lock in favour of using the wait_on_bit() functions. We don't need to add any extra fields to the structure to do this, so we save 32 bytes (on x86_64) per structure. This adds up to quite a lot when we may potentially have millions of these lock structures, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: David Teigland <teigland@redhat.com>	2007-07-09 08:22:21 +01:00
Robert Peterson	cd81a4bac6	[GFS2] Addendum patch 2 for gfs2_grow This addendum patch 2 corrects three things: 1. It fixes a stupid mistake in the previous addendum that broke gfs2. Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html 2. It fixes a problem that Dave Teigland pointed out regarding the external declarations in ops_address.h being in the wrong place. 3. It recasts a couple more %llu printks to (unsigned long long) as requested by Steve Whitehouse. I would have loved to put this all in one revised patch, but there was a rush to get some patches for RHEL5. Therefore, the previous patches were applied to the git tree "as is" and therefore, I'm posting another addendum. Sorry. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:19 +01:00
Nate Diller	0507ecf50f	[GFS2] use zero_user_page Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller <nate.diller@gmail.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-07-09 08:22:17 +01:00
Robert Peterson	6c53267f05	[GFS2] Kernel changes to support new gfs2_grow command (part 2) To avoid code redundancy, I separated out the operational "guts" into a new function called read_rindex_entry. Then I made two functions: the closer-to-original gfs2_ri_update (without the special condition checks) and gfs2_ri_update_special that's designed with that condition in mind. (I don't like the name, but if you have a suggestion, I'm all ears). Oh, and there's an added benefit: we don't need all the ugly gotos anymore. ;) This patch has been tested with gfs2_fsck_hellfire (which runs for three and a half hours, btw). Signed-off-By: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:14 +01:00
Robert Peterson	7ae8fa8451	[GFS2] kernel changes to support new gfs2_grow command This is another revision of my gfs2 kernel patch that allows gfs2_grow to function properly. Steve Whitehouse expressed some concerns about the previous patch and I restructured it based on his comments. The previous patch was doing the statfs_change at file close time, under its own transaction. The current patch does the statfs_change inside the gfs2_commit_write function, which keeps it under the umbrella of the inode transaction. I can't call ri_update to re-read the rindex file during the transaction because the transaction may have outstanding unwritten buffers attached to the rgrps that would be otherwise blown away. So instead, I created a new function, gfs2_ri_total, that will re-read the rindex file just to total the file system space for the sake of the statfs_change. The ri_update will happen later, when gfs2 realizes the version number has changed, as it happened before my patch. Since the statfs_change is happening at write_commit time and there may be multiple writes to the rindex file for one grow operation. So one consequence of this restructuring is that instead of getting one kernel message to indicate the change, you may see several. For example, before when you did a gfs2_grow, you'd get a single message like: GFS2: File system extended by 247876 blocks (968MB) Now you get something like: GFS2: File system extended by 207896 blocks (812MB) GFS2: File system extended by 39980 blocks (156MB) This version has also been successfully run against the hours-long "gfs2_fsck_hellfire" test that does several gfs2_grow and gfs2_fsck while interjecting file system damage. It does this repeatedly under a variety Resource Group conditions. Signed-off-By: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:12 +01:00
Benjamin Marzinski	b524fe646c	[GFS2] flush the glock completely in inode_go_sync Fix for bz #231910 When filemap_fdatawrite() is called on the inode mapping in data=ordered mode, it will add the glock to the log. In inode_go_sync(), if you do the gfs2_log_flush() before this, after the filemap_fdatawrite() call, the glock and its associated data buffers will be on the log again. This means you can demote a lock from exclusive, without having it flushed from the log. The attached patch simply moves the gfs2_log_flush up to after the filemap_fdatawrite() call. Originally, I tried moving the gfs2_log_flush to after gfs2_meta_sync(), but that caused me to trip the following assert. GFS2: fsid=cypher-36:test.0: fatal: assertion "!buffer_busy(bh)" failed GFS2: fsid=cypher-36:test.0: function = gfs2_ail_empty_gl, file = fs/gfs2/glops.c, line = 61 It appears that gfs2_log_flush() puts some of the glocks buffers in the busy state and the filemap_fdatawrite() call is necessary to flush them. This makes me worry slightly that a related problem could happen because of moving the gfs2_log_flush() after the initial filemap_fdatawrite(), but I assume that gfs2_ail_empty_gl() would catch that case as well. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:07 +01:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Christoph Lameter	a35afb830f	Remove SLAB_CTOR_CONSTRUCTOR SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-17 05:23:04 -07:00
Randy Dunlap	e63340ae6b	header cleaning: don't include smp_lock.h when not used Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Guillaume Chazarain	3e9f45bd18	Factor outstanding I/O error handling Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:14:57 -07:00
Linus Torvalds	2d56d3c43c	Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux * 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses	2007-05-07 12:34:24 -07:00
Linus Torvalds	5cefcab3db	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits) [GFS2] Uncomment sprintf_symbol calling code [DLM] lowcomms style [GFS2] printk warning fixes [GFS2] Patch to fix mmap of stuffed files [GFS2] use lib/parser for parsing mount options [DLM] Lowcomms nodeid range & initialisation fixes [DLM] Fix dlm_lowcoms_stop hang [DLM] fix mode munging [GFS2] lockdump improvements [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) [DLM] fs/dlm/ast.c should #include "ast.h" [DLM] Consolidate transport protocols [DLM] Remove redundant assignment [GFS2] Fix bz 234168 (ignoring rgrp flags) [DLM] change lkid format [DLM] interface for purge (2/2) [DLM] add orphan purging code (1/2) [DLM] split create_message function [GFS2] Set drop_count to 0 (off) by default ...	2007-05-07 12:26:27 -07:00
Christoph Lameter	50953fe9e0	slab allocators: Remove SLAB_DEBUG_INITIAL flag I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Marc Eshel	586759f03e	gfs2: nfs lock support for gfs2 Add NFS lock support to GFS2. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-06 20:38:50 -04:00
Marc Eshel	9d6a8c5c21	locks: give posix_test_lock same interface as ->lock posix_test_lock() and ->lock() do the same job but have gratuitously different interfaces. Modify posix_test_lock() so the two agree, simplifying some code in the process. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>	2007-05-06 17:39:00 -04:00
Greg Kroah-Hartman	823bccfc40	remove "struct subsystem" as it is no longer needed We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 18:57:59 -07:00
Steven Whitehouse	37fde8ca6c	[GFS2] Uncomment sprintf_symbol calling code Now that the patch from -mm has gone upstream, we can uncomment the code in GFS2 which uses sprintf_symbol. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Robert Peterson <rpeterso@redhat.com>	2007-05-01 09:51:39 +01:00
akpm@linux-foundation.org	f391a4ead6	[GFS2] printk warning fixes alpha: fs/gfs2/dir.c: In function 'gfs2_dir_read_leaf': fs/gfs2/dir.c:1322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t' fs/gfs2/dir.c: In function 'gfs2_dir_read': fs/gfs2/dir.c:1455: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type '__u64' Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:48 +01:00
Steven Whitehouse	bf126aee6d	[GFS2] Patch to fix mmap of stuffed files If a stuffed file is mmaped and a page fault is generated at some offset above the initial page, we need to create a zero page to hang the buffer heads off before we can unstuff the file. This is a fix for bz #236087 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:46 +01:00
Josef Bacik	476c006be0	[GFS2] use lib/parser for parsing mount options This patch converts the mount option parsing to use the kernels lib/parser stuff like all of the other filesystems. I tested this and it works well. Thank you, Signed-off-by: Josef Bacik <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:43 +01:00
Robert Peterson	5f8820960c	[GFS2] lockdump improvements The patch below consists of the following changes (in code order): 1. I fixed a minor compiler warning regarding the printing of a kernel symbol address. 2. I implemented a suggestion from Dave Teigland that moves the debugfs information for gfs2 into a subdirectory so we can easily expand our use of debugfs in the future. The current code keeps the glock information in: /debug/gfs2/<fs> With the patch, the new code keeps the glock information in: /debug/gfs2/<fs>/glock That will allow us to create more debugfs files in the future. 3. This fixes a bug whereby a failed mount attempt causes the debugfs file to not be deleted. Failed mount attempts should always clean up after themselves, including deleting the debugfs file and/or directory. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:33 +01:00
Steven Whitehouse	bdd19a22f8	[GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks This patch detects when the number of entries in a leaf block or inode block (in the case of stuffed directories) is corrupt and informs the user. It prevents us from running off the end of the array that's been allocated for the sorting in this case. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:30 +01:00
Robert Peterson	7a0079d9e3	[GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) This is for Bugzilla Bug 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump) seen at the "gfs2 summit". This also fixes the bug that caused garbage to be printed by the "initialized at" field. I apologize for the kludge, but that code will all be ripped out anyway when the official sprint_symbol function becomes available in the Linux kernel. I also changed some formatting so that spaces are replaced by proper tabs. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:28 +01:00
Steven Whitehouse	a43a49066d	[GFS2] Fix bz 234168 (ignoring rgrp flags) The following patch makes GFS2 use the rgrp flags properly. Although there are also separate flags for both data and metadata as well, I've not implemented these as there seems little use for them. On the other hand, the "noalloc" flag is generally useful for future changes we might wish to make, so this ensures that we interpret it correctly. In addition I fixed the comment above the function which was incorrect. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:17 +01:00
Steven Whitehouse	f01963f264	[GFS2] Set drop_count to 0 (off) by default This sets the drop_count to 0 by default which is a better default for most people. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:05 +01:00
David Teigland	b9af8a788a	[GFS2] use log_error before LM_OUT_ERROR We always want to see the details of the error returned to gfs, but log_debug is often turned off, so use log_error (printk). Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:02 +01:00
Robert Peterson	04b933f27b	[GFS2] Red Hat bz 228540: owner references In testing the previously posted and accepted patch for https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540 I uncovered some gfs2 badness. It turns out that the current gfs2 code saves off a process pointer when a glock is taken in both the glock and glock holder structures. Those structures will persist in memory long after the process has ended; pointers to poisoned memory. This problem isn't caused by the 228540 fix; the new capability introduced by the fix just uncovered the problem. I wrote this patch that avoids saving process pointers and instead saves off the process pid. Rather than referencing the bad pointers, it now does process lookups. There is special code that makes the output nicer for printing holder information for processes that have ended. This patch also adds a stub for the new "sprint_symbol" function that exists in Andrew Morton's -mm patch set, but won't go into the base kernel until 2.6.22, since it adds functionality but doesn't fix a bug. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:55 +01:00
Benjamin Marzinski	172e045a7f	[GFS2] flush the log if a transaction can't allocate space This is a fix for bz #208514. When GFS2 frees up space, the freed blocks aren't available for reuse until the resource group is successfully written to the ondisk journal. So in rare cases, GFS2 operations will fail, saying that the filesystem is out of space, when in reality, you are just waiting for a log flush. For instance, on a 1Gig filesystem, if I continually write 10 Mb to a file, and then truncate it, after a hundred iterations, the write will fail with -ENOSPC, even though the filesystem is just 1% full. The attached patch calls a log flush in these cases. I tested this patch fairly heavily to check if there were any locking issues that I missed, and it seems to work just fine. Also, this patch only does the log flush if get_local_rgrp makes a complete loop of resource groups without skipping any due to locking issues. The code would be slightly simpler if it just always did the log flush after the first failed pass, and you could only ever have to go through the loop twice, instead of up to three times. However, I guessed that failing to find a rg simply due to locking issues would be common enough to skip the log flush in that case, but I'm not certain that this is the right way to go. Either way, I don't suppose this code will be hit all that often. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:52 +01:00
Benjamin Marzinski	6883562588	[GFS2] Fix log entry list corruption When glock_lo_add and rg_lo_add attempt to add an element to the log, they check to see if it has already been added before locking the log. If another process adds that element to the log in this window between the check and locking the log, the element will be added to the list twice. This causes the log element list to become corrupted in such a way that the log element can never be successfully removed from the list. This patch pulls the list_empty() check inside the log lock, to remove this window. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:50 +01:00
Steven Whitehouse	f35ac346bc	[GFS2] Speed up lock_dlm's locking (move sprintf) The following patch speeds up lock_dlm's locking by moving the sprintf out from the lock acquisition path and into the lock creation path. This reduces the amount of CPU time used in acquiring locks by a fair amount. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Acked-by: David Teigland <teigland@redhat.com>	2007-05-01 09:10:47 +01:00
Steven Whitehouse	420d2a1028	[GFS2] Fix a bug on i386 due to evaluation order Since gcc didn't evaluate the last two terms of the expression in glock.c:1881 as a constant expression, it resulted in an error on i386 due to the lack of a 64bit divide instruction. This adds some brackets to fix the problem. This was reported by Andrew Morton. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>	2007-05-01 09:10:42 +01:00
Steven Whitehouse	3b8249f617	[GFS2] Fix bz 224480 and cleanup glock demotion code This patch prevents the printing of a warning message in cases where the fs is functioning normally by handing off responsibility for unlinked, but still open inodes, to another node for eventual deallocation. Also, there is now an improved system for ensuring that such requests to other nodes do not get lost. The callback on the iopen lock is only ever called when i_nlink == 0 and when a node is unable to deallocate it due to it still being in use on another node. When a node receives the callback therefore, it knows that i_nlink must be zero, so we mark it as such (in gfs2_drop_inode) in order that it will then attempt deallocation of the inode itself. As an additional benefit, queuing a demote request no longer requires a memory allocation. This simplifies the code for dealing with gfs2_holders as it removes one special case. There are two new fields in struct gfs2_glock. gl_demote_state is the state which the remote node has requested and gl_demote_time is the time when the request came in. Both fields are only valid when the GLF_DEMOTE flag is set in gl_flags. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:39 +01:00
Josef Whiter	1de9139092	[GFS2] Fix bz 231380, unlock page before dequeuing glocks in gfs2_commit_write If we are writing a file, and in the middle of writing the file another node attempts to get a shared lock on that file (by doing a du for example) the process doing the writing will hang waiting on lock_page. The reason for this is because when we have waiters on an exclusive glock, we will go through and flush out all dirty pages associated with that inode and release the lock. The problem is that when we flush the dirty pages, we could hit a page that we have locked during the generic_file_buffered_write part of this operation. This patch unlocks the page before we go to dequeue the lock and locks it immediately afterwards, since generic_file_buffered_write needs the page locked when the commit_write is completed. This patch resolves the problem, however if somebody sees a better way to do this please don't hesitate to yell. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:37 +01:00
Josef Whiter	5c7342d894	[GFS2] fix bz 231369, gfs2 will oops if you specify an invalid mount option If you specify an invalid mount option when trying to mount a gfs2 filesystem, gfs2 will oops. The attached patch resolves this problem. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:32 +01:00
Robert Peterson	7c52b166c5	[GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540) The attached patch resolves bz 228540. This adds the capability for gfs2 to dump gfs2 locks through the debugfs file system. This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from gfs2 because all the ioctls were stripped out. Please see the bugzilla for more history about the fix. This patch is also attached to the bugzilla record. The patch is against Steve Whitehouse's latest nmw git tree kernel (2.6.21-rc1) and has been tested on system trin-10. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:29 +01:00
Steven Whitehouse	c3f49bc209	[GFS2] Fix bz 229873, alternate test: assertion "!ip->i_inode.i_mapping->nrpages" failed The following removes an incorrect assertion from the GFS2 glops code. This fixes Red Hat bz 229873. Thanks to Abhijith Das for testing the patch and confirming the fix. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Abhijith Das <adas@redhat.com>	2007-03-07 14:03:53 -05:00
akpm@linux-foundation.org	95d97b7dd7	[GFS2] build fix fs/gfs2/glock.c:2198: error: 'THIS_MODULE' undeclared here (not in a function) Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-03-07 14:03:25 -05:00
Steven Whitehouse	631c42e170	[GFS2] go_drop_bh is never used, so remove it The ->go_drop_bh function is never used, so this removes it and the single caller. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 14:02:53 -05:00
Steven Whitehouse	04b159b132	[GFS2] Remove unused variable Remove an unused variable. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 14:02:30 -05:00
Steven Whitehouse	1be3867955	[GFS2] Fix bz 229831, lookup returns wrong inode The following patch fixes Red Hat bz 229831. Without this patch it's possible for the wrong inode to be returned in certain cases. It is a pretty unusual event, so it has taken some time to track down. Thanks are due to Josef Whiter who did a lot of the testing required to track this down and fix it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 14:01:53 -05:00
Steven Whitehouse	cad5b93927	[GFS2] Fix bz 230143, incorrect flushing of rgrps The below patch fixes a problem where we were not flushing rgrps correctly. It only occurred in the specific case that a callback was received for an rgrp which was dirty and when a journal log flush had not already resulted in the rgrp being flushed anyway. This fixes Red Hat bz 230143. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 14:00:14 -05:00
Wendy Cheng	fb0d3bce8e	[GFS2] pass formal ino in do_filldir_main ok, the following is the minimum changes to get NFSD going before we settle down this issue .. would appreciate this in the tree so other NFS related works can get done in parallel. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:58:45 -05:00
Josef Whiter	a13cbe3753	[GFS2] fix hangup when multiple processes are trying to write to the same file This fixes a problem I encountered while running bonnie++. When you have one thread that opens a file and starts to write to it, and then another thread that tries to open and write to the same file, the second thread will loop forever trying to grab the inode lock for that inode. Basically we come in through generic_file_buffered_write, which calls gfs2_prepare_write, which then attempts to grab the glock. Because we don't own the lock, gfs2_prepare_write gets GLR_TRYFAILED, which returns AOP_TRUNCATED_PAGE to generic_file_buffered_write. At this point generic_file_buffered_write loops around again and immediately retries the prepare_write. This means that the second process never gets off of the processor in order to allow the process that holds the lock to finish its work and let go of the lock. This patch makes gfs2_glock_nq schedule() if it gets back a GLR_TRYFAILED, which resolves this problem. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:58:02 -05:00
Wendy Cheng	a7d2b2bdc9	[GFS2] NFS filehandle check File handle checking error found in '07 NFS connectathon. The fh_type and fh_len are not necessarily identical. Some of the client machines could fail mount with stale filehandle without this patch. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:57:34 -05:00
Richard Fearn	d5a6751b32	[GFS2] add newline to printk message Patch for the 2.6.20 stable tree that adds a missing newline to one of the printk messages in fs/gfs2/ops_fstype.c. Signed-off-by: Richard Fearn <richardfearn@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:57:10 -05:00
Josef Whiter	2e95b6653b	[GFS2] fix locking mistake This patch fixes a locking mistake in the quota code, we do a mutex_lock instead of a mutex_unlock. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:56:41 -05:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Josef 'Jeff' Sipek	ee9b6d61a2	[PATCH] Mark struct super_operations const This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:47 -08:00
Arjan van de Ven	92e1d5be91	[PATCH] mark struct inode_operations const 2 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Arjan van de Ven	00977a59b9	[PATCH] mark struct file_operations const 6 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:45 -08:00
Robert P. J. Day	c376222960	[PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc(). Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:27 -08:00
Adrian Bunk	a2cf822274	[GFS2] make gfs2_writepages() static On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc6-mm2: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly global gfs2_writepages() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:48:48 -05:00
Steven Whitehouse	2d72e7101c	[GFS2] Unlock page on prepare_write try lock failure When the try lock of the glock failed in prepare_write we were incorrectly exiting this function with the page still locked. This was resulting in further I/O to this page hanging. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:25:59 -05:00
Wendy Cheng	549ae0ac3d	[GFS2] nfsd readdirplus assertion failure Glock assertion failure found in '07 NFS connectathon. One of the NFSDs is doing a "readdirplus" procedure call. It passes the logic into gfs2_readdir() where it obtains its directory inode glock. This is then followed by filehandle construction that invokes lookup code. It hits the assertion failure while trying to obtain the inode glock again inside gfs2_drevalidate(). This patch bypasses the recursive glock call if caller already holds the lock. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-06 11:36:01 -05:00
Randy Dunlap	9beeb9f3c5	[DLM/GFS2] indent help text Indent help text as expected. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:20 -05:00
Russell Cattelan	ddee76089c	[GFS2] Fix unlink deadlocks Move the glock acquisition to outside of the transactions. Lock ordering must be preserved in order to prevent ABBA deadlocks. The current gfs2_change_nlink code tries to grab the glock after having started a transaction and thus is holding the log lock. This is inconsistent with other code paths in gfs that grab the resource group glock prior to starting a transaction. One problem with this fix is that the resource group lock is always grabbed now even if the inode still has ref count and can not be marked for unlink. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:17 -05:00
Steven Whitehouse	61be084efc	[GFS2] Put back semaphore to avoid umount problem Dave Teigland fixed this bug a while back, but I managed to mistakenly remove the semaphore during later development. It is required to avoid the list of inodes changing during an invalidate_inodes call. I have made it an rwsem since the read side will be taken frequently during normal filesystem operation. The write side will only happen during umount of the file system. Also the bug only triggers when using the DLM lock manager and only then under certain conditions as it's timing related. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>	2007-02-05 13:38:14 -05:00
Eric Sandeen	bbb28ab759	[GFS2] more CURRENT_TIME_SEC Whoops, quilt user error, missed this one in the previous patch. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:11 -05:00
Adrian Bunk	0011727785	[GFS2/DLM] fix GFS2 circular dependency On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote: > Andrew Morton napsal(a): > >Temporarily at > > > > http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/ > > Unable to select IPV6. Menuconfig doesn't offer it when INET is selected. > When it's not it appears in the menu, but after state change it gets away. > The same behaviour in xconfig, gconfig. > > $ mkdir ../a/tst > $ make O=../a/tst menuconfig > HOSTCC scripts/basic/fixdep > [...] > HOSTLD scripts/kconfig/mconf > scripts/kconfig/mconf arch/i386/Kconfig > Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS > OCFS2_FS INET > > Maybe this is the problem? Yes, patch below. > regards, cu Adrian <-- snip --> This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM and DLM depend on instead of select SYSFS. Since SYSFS depends on EMBEDDED this change shouldn't cause any problems for users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:08 -05:00
Randy Dunlap	67f55897ee	[GFS2/DLM] use sysfs With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build fails with: WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined! WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined! WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined! make[1]: *** [__modpost] Error 1 make: *** [modules] Error 2 Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use kernel_subsys, they should either DEPEND on it or SELECT it. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:05 -05:00
David Teigland	ee32e4f3d3	[GFS2] make lock_dlm drop_count tunable in sysfs We want to be able to change or disable the default drop_count (number at which the dlm asks gfs to limit the number of locks it's holding). Add it to the collection of sysfs tunables for an fs. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:01 -05:00
David Teigland	2f708649ba	[GFS2] increase default lock limit Increase the number of locks at which point the dlm begins asking gfs to reduce its lock usage. The default value is largely arbitrary, but the current value of 50,000 ends up limiting performance unnecessarily for too many users. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:59 -05:00
Steven Whitehouse	8bd9572769	[GFS2] Fix list corruption in lops.c The patch below appears to fix the list corruption that we are seeing on occasion. Although the transaction structure is private to a single thread, when the queued structures are dismantled during an in-core commit, it's possible for a different thread to be trying to add the same structure to another, new, transaction at the same time. To avoid this, this patch takes the log spinlock during this operation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:56 -05:00
Steven Whitehouse	d7c103d0bd	[GFS2] Fix recursive locking attempt with NFS In certain cases, it's possible for NFS to call the lookup code while holding the glock (when doing a readdirplus operation) so we need to check for that and not try and lock the glock twice. This also fixes a typo in a previous NFS related GFS2 patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:53 -05:00
Steven Whitehouse	d043e1900c	[GFS2] Fix typo in glock.c This is a one letter typo fix in glock.c, spotted by Rob Kenna. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:41 -05:00
Eric Sandeen	ddfe062783	[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 I was looking something else up and came across this... I don't honestly have a good reason to change it other than to make it like every other Linux filesystem in this regard. ;-) It doesn't functionally change anything, but makes some lines shorter. :) I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but not in timespec_t format, and only stores second resolutions? Seems like you're halfway to sub-second resolutions already. I suppose if that gets implemented then all of the below should instead be CURRENT_TIME not CURRENT_TIME_SEC. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:38 -05:00
Steven Whitehouse	90101c3186	[GFS2] Compile fix for glock.c This one liner got missed from the previous patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:35 -05:00
Steven Whitehouse	12132933c4	[GFS2] Remove queue_empty() function This function is no longer required since we do not do recursive locking in the glock layer. As a result all its callers can be replaced with list_empty() calls. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:32 -05:00
Steven Whitehouse	b5d32bead1	[GFS2] Tidy up glops calls This patch doesn't make any changes to the ordering of the various operations related to glocking, but it does tidy up the calls to the glops.c functions to make the structure more obvious. The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be made static within glock.c since they are called by every set of glock operations. The xmote_th and drop_th glock operations are then made conditional upon those two routines existing and called from the previously mentioned functions in glock.c respectively. Also it can be seen that the go_sync operation isn't needed since it can easily be replaced by calls to xmote_bh and drop_bh respectively. This results in no longer (confusingly) calling back into routines in glock.c from glops.c and also reducing the glock operations by one member. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:26 -05:00
Steven Whitehouse	1c0f4872dc	[GFS2] Remove local exclusive glock mode Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutexes are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:20 -05:00
Steven Whitehouse	6bd9c8c2fb	[GFS2] Remove unused go_callback operation This is never used, so we might as well remove it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:17 -05:00
Steven Whitehouse	e5dab552c8	[GFS2] Remove the "greedy" function from glock.[ch] The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and it's overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:14 -05:00
Steven Whitehouse	fee852e374	[GFS2] Shrink gfs2_inode memory by half Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no, that's not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:11 -05:00
Steven Whitehouse	330005c2b2	[GFS2] Remove max_atomic_write tunable This removes an unused sysfs tunable parameter. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:08 -05:00
Steven Whitehouse	3699e3a44b	[GFS2] Clean up/speed up readdir This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and thus the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:04 -05:00
Steven Whitehouse	a8d638e30e	[GFS2] Add writepages for "data=writeback" mounts It occurred to me that although a gfs2 specific writepages for ordered writes and journaled data would be tricky, by hooking writepages only for "data=writeback" mounts we could take advantage of not needing buffer heads (we don't use them on the read side, nor have we for some time) and create much larger I/Os for the block layer. Using blktrace both before and after, it's possible to see that for large I/Os, most of the requests generated through writepages are now 1024 sectors after this patch is applied as opposed to 8 sectors before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:01 -05:00
Adrian Bunk	03dc6a538e	[GFS2] make gfs2_change_nlink_i() static On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly global gfs2_change_nlink_i() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:49 -05:00
Robert Peterson	7083146564	[GFS2] gfs2 knows of directories which it chooses not to display This is for Red Hat bugzilla bug bz #222302: Moving a virtual IP from node to node between two NFS-over-GFS2 servers was causing one of the GFS2 servers to become confused and reference a deleted inode. The problem was due to vfs dentries that did not reference the gfs2_dops and therefore didn't call the gfs2 revalidate code to revalidate a dentry after a directory had been deleted & recreated. This patch is a crosswrite from a RHEL4 bug found in GFS1 as bz #190756 and it is against the latest -nmw git tree. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:46 -05:00
S. Wendy Cheng	87d21e07f3	[GFS2] Fix gfs2_rename deadlock Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:31 -05:00
Russell Cattelan	6c93fd1e57	[GFS2] BZ 217008 fsfuzzer fix. Update the quilt header comments to match the code changes. Change gfs2_lookup_simple to return an error in the case of a NULL inode. The callers of gfs2_lookup_simple do not check for NULL in the no entry case and such would end up dereferencing a NULL ptr. This fixes: http://projects.info-pull.com/mokb/MOKB-15-11-2006.html Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:28 -05:00
Steven Whitehouse	49686f7106	[GFS2] Fix ordering of page disposal vs. glock_dq In case of unlinked files with dirty pages GFS2 wasn't clearing the pages in quite the right order. This patch clears the pages earlier (before the glock_dq) to avoid the situation that the release of the glock results in attempting to write back data that has already been deallocated. This fixes Red Hat bugzilla: #220117 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:24 -05:00
S. Wendy Cheng	5509826f1e	[GFS2] Fix change nlink deadlock Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow these locks to be acquired recursively by the same process (the RHEL installer), which results in a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:15 -05:00
Steven Whitehouse	e1d5b18ae9	[GFS2] Fail over to readpage for stuffed files This is partially derived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>	2007-02-05 13:36:12 -05:00
Steven Whitehouse	c7b3383437	[GFS2] Fix DIO deadlock This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-02-05 13:36:09 -05:00
David Teigland	c378051177	[GFS2] don't try to lockfs after shutdown If an fs has already been shut down, a lockfs callback should do nothing. An fs that's been shut down can't acquire locks or do anything with respect to the cluster. Also, remove FIXME comment in withdraw function. The missing bits of the withdraw procedure are now all done by user space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:44 -05:00
David Chinner	f73ca1b76c	[PATCH] Revert bd_mount_mutex back to a semaphore Revert bd_mount_mutex back to a semaphore so that xfs_freeze -f /mnt/newtest; xfs_freeze -u /mnt/newtest works safely and doesn't produce lockdep warnings. (XFS unlocks the semaphore from a different task, by design. The mutex code warns about this) Signed-off-by: Dave Chinner <dgc@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-11 18:18:21 -08:00
Steven Whitehouse	1003f06953	[GFS2] Fix Kconfig Here is a patch to fix up the Kconfig so that we don't land up with problems when people disable the NET subsystem. Thanks for all the hints and suggestions that people have sent me regarding this. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Aleksandr Koltsoff <czr@iki.fi> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Adrian Bunk <bunk@stusta.de> Cc: Chris Zubrzycki <chris@middle--earth.org> Cc: Patrick Caulfield <pcaulfie@redhat.com>	2006-12-15 12:51:51 -05:00
Josef Sipek	81454098f7	[PATCH] struct path: convert gfs2 Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:45 -08:00
Linus Torvalds	1c1afa3c05	Merge master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits) [DLM] Clean up lowcomms [GFS2] Change gfs2_fsync() to use write_inode_now() [GFS2] Fix indent in recovery.c [GFS2] Don't flush everything on fdatasync [GFS2] Add a comment about reading the super block [GFS2] Mount problem with the GFS2 code [GFS2] Remove gfs2_check_acl() [DLM] fix format warnings in rcom.c and recoverd.c [GFS2] lock function parameter [DLM] don't accept replies to old recovery messages [DLM] fix size of STATUS_REPLY message [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning [DLM] fix add_requestqueue checking nodes list [GFS2] Fix recursive locking in gfs2_getattr [GFS2] Fix recursive locking in gfs2_permission [GFS2] Reduce number of arguments to meta_io.c:getbuf() [GFS2] Move gfs2_meta_syncfs() into log.c [GFS2] Fix journal flush problem [GFS2] mark_inode_dirty after write to stuffed file [GFS2] Fix glock ordering on inode creation ...	2006-12-07 09:13:20 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Steven Whitehouse	34126f9f41	[GFS2] Change gfs2_fsync() to use write_inode_now() This is a bit better than the previous version of gfs2_fsync() although it would be better still if we were able to call a function which only wrote the inode & metadata. Its no big deal though that this will potentially write the data as well since the VFS has already done that before calling gfs2_fsync(). I've also added a comment to explain whats going on here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org>	2006-12-07 09:13:14 -05:00
Steven Whitehouse	887bc5d00c	[GFS2] Fix indent in recovery.c As per comments from Andrew Morton and Jan Engelhardt, this fixes the indent and removes the "static" from a variable declaration since its not needed in this case (now allocated on the stack of the function in question). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: Andrew Morton <akpm@osdl.org>	2006-12-05 13:34:17 -05:00
David Howells	9db7372445	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/ata/libata-scsi.c include/linux/libata.h Further merge of Linus's head and compilation fixups. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 17:01:28 +00:00
Al Viro	bd01f843c3	[PATCH] severing skbuff.h -> poll.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-12-04 02:00:31 -05:00
Steven Whitehouse	33c3de3287	[GFS2] Don't flush everything on fdatasync The gfs2_fsync() function was doing a journal flush on each and every call. While this is correct, its also a lot of overhead. This patch means that on fdatasync flushes we rely on the VFS to flush the data for us and we don't do a journal flush unless we really need to. We have to do a journal flush for stuffed files though because they have the data and the inode metadata in the same block. Journaled files also need a journal flush too of course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:44 -05:00
Steven Whitehouse	aac1a3c77a	[GFS2] Add a comment about reading the super block The comment explains why we use the bio functions to read the super block. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Srinivasa Ds <srinivasa@in.ibm.com>	2006-11-30 10:37:40 -05:00
Srinivasa Ds	0da3585e1e	[GFS2] Mount problem with the GFS2 code While mounting the gfs2 filesystem, our test team had a problem and we got this error message. ======================================================= GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1" GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS... GFS2: not a GFS2 filesystem GFS2: fsid=dasde1.0: can't read superblock: -22 ========================================================================== On debugging further we found that the problem occurs while reading the super block (gfs2_read_super) and comparing the magic number in it. When I replace the submit_bio() call (present in gfs2_read_super) with sb_getblk() and ll_rw_block(), the mount operation succeeded. On further analysis we found that before calling submit_bio(), bio->bi_sector was set to the "sector" variable. This "sector" variable has the same value as bh->b_blocknr (the block number). Hence there is a need to multiply this value by (blocksize >> 9) (9 because the sector size is 2^9; the same thing happens in ll_rw_block also, before calling submit_bio()). So I have developed the patch which solves this problem. Please let me know your comments. ================================================================ Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:36 -05:00
Steven Whitehouse	77386e1f66	[GFS2] Remove gfs2_check_acl() As pointed out by Adrian Bunk, the gfs2_check_acl() function is no longer used. This patch removes it and renamed gfs2_check_acl_locked() to gfs2_check_acl() since we only need one variant of that function now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Bunk <bunk@stusta.de>	2006-11-30 10:37:32 -05:00
Randy Dunlap	0ac230699a	[GFS2] lock function parameter Fix function parameter typing: fs/gfs2/glock.c:100: warning: function declaration isn't a prototype Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:18 -05:00
Ryusuke Konishi	aed3255f22	[GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning Fix a printk format warning in fs/gfs2/log.c: fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:04 -05:00
Steven Whitehouse	dcf3dd852f	[GFS2] Fix recursive locking in gfs2_getattr The readdirplus NFS operation can result in gfs2_getattr being called with the glock already held. In this case we do not want to try and grab the lock again. This fixes Red Hat bugzilla #215727 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:56 -05:00
Steven Whitehouse	300c7d75f3	[GFS2] Fix recursive locking in gfs2_permission Since gfs2_permission may be called either from the VFS (in which case we need to obtain a shared glock) or from GFS2 (in which case we already have a glock) we need to test to see whether or not a lock is required. The original test was buggy due to a potential race. This one should be safe. This fixes Red Hat bugzilla #217129 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:53 -05:00
Steven Whitehouse	cb4c031318	[GFS2] Reduce number of arguments to meta_io.c:getbuf() Since the superblock and the address_space are determined by the glock, we might as well just pass that as the argument since all the callers already have that available. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:50 -05:00
Steven Whitehouse	a25311c8e0	[GFS2] Move gfs2_meta_syncfs() into log.c By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start() can be made static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:45 -05:00
Steven Whitehouse	b004157ab5	[GFS2] Fix journal flush problem This fixes a bug which resulted in poor performance due to flushing the journal too often. The code path in question was via the inode_go_sync() function in glops.c. The solution is not to flush the journal immediately when inodes are ejected from memory, but batch up the work for glockd to deal with later on. This means that glocks may now live on beyond the end of the lifetime of their inodes (but not very much longer in the normal case). Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in calculation of the number of free journal blocks. The gfs2_logd process has been altered to be more responsive to the journal filling up. We now wake it up when the number of uncommitted journal blocks has reached the threshold level rather than trying to flush directly at the end of each transaction. This again means doing fewer, but larger, log flushes in general. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:42 -05:00
Steven Whitehouse	ae619320b2	[GFS2] mark_inode_dirty after write to stuffed file Writes to stuffed files were not being marked dirty correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:36 -05:00
Steven Whitehouse	28626e2078	[GFS2] Fix glock ordering on inode creation The lock order here should be parent -> child rather than numeric order. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:33 -05:00
Steven Whitehouse	1a14d3a68f	[GFS2] Simplify glops functions The go_sync callback took two flags, but one of them was set on every call, so this patch removes one of the flags and makes the previously conditional operations (on this flag), unconditional. The go_inval callback took three flags, each of which was set on every call to it. This patch removes the flags and makes the operations unconditional, which makes the logic rather more obvious. Two now unused flags are also removed from incore.h. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:30 -05:00
Steven Whitehouse	fa2ecfc5e1	[GFS2] Fix Kconfig wrt CRC32 GFS2 requires the CRC32 library function. This was reported by Toralf Förster. Cc: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:24 -05:00
Steven Whitehouse	5e7d65cd9d	[GFS2] Make sentinel dirents compatible with gfs1 When deleting directory entries, we set the inum.no_addr to zero in a dirent when its the first dirent in a block and thus cannot be merged into the previous dirent as is the usual case. In gfs1, inum.no_formal_ino was used instead. This patch changes gfs2 to set both inum.no_addr and inum.no_formal_ino to zero. It also changes the test from just looking at inum.no_addr to look at both inum.no_addr and inum.no_formal_ino and a sentinel is now considered to be a dirent in which _either_ (or both) of them is set to zero. This resolves Red Hat bugzillas: #215809, #211465 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:20 -05:00
Steven Whitehouse	dcd2479959	[GFS2] Remove unused function from inode.c The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:57 -05:00
Steven Whitehouse	175011cf6e	[GFS2] Remove unused sysfs files Four of the sysfs files are unused and can therefore be removed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:53 -05:00
Steven Whitehouse	4cf1ed8144	[GFS2] Tidy up bmap & fix boundary bug This moves the locking for bmap into the bmap function itself rather than using a wrapper function. It also fixes a bug where the boundary flag was set on the wrong bh. Also the flags on the mapped bh are reset earlier in the function to ensure that they are 100% correct on the error path. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:49 -05:00
Steven Whitehouse	ab923031ce	[GFS2] Fix memory allocation in glock.c Change from GFP_KERNEL to GFP_NOFS as this was causing a slow down when trying to push inodes from cache. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:46 -05:00
Russell Cattelan	61057c6bb3	[GFS2] Remove unused zero_readpage from stuffed_readpage Stuffed files only consist of a maximum of (gfs2 block size - sizeof(struct gfs2_dinode)) bytes. Since the gfs2 block size is always less than page size, we will never see a call to stuffed_readpage for anything other than the first page in the file. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:57 -05:00
Russell Cattelan	7020933156	[GFS2] Fix race in logging code The log lock is dropped prior to io submission, but this exposes a hole in which the log data structures may be going away due to a truncate. Store the buffer head in a local pointer prior to dropping the lock and rely on the buffer_head lock for consistency on the buffer head. Signed-Off-By: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:55 -05:00
Steven Whitehouse	9e2dbdac3d	[GFS2] Remove gfs2_inode_attr_in This function wasn't really doing the right thing. There was no need to update the inode size at this point and the updating of the i_blocks field has now been moved to the places where di_blocks is updated. A result of this patch and some of those preceding it is that unlocking a glock is now a much more efficient process, since there is no longer any requirement to copy data from the gfs2 inode into the vfs inode at this point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:52 -05:00
Steven Whitehouse	e7c698d74f	[GFS2] Inode number is constant Since the inode number is constant, we don't need to keep updating it everytime we refresh the other inode fields. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:48 -05:00
Steven Whitehouse	6b124d8dba	[GFS2] Only set inode flags when required We were setting the inode flags from GFS2's flags far too often, even when they couldn't possibly have changed. This patch reduces the amount of flag setting going on so that we do it only when the inode is read in or when the flags have changed. The create case is covered by the "when the inode is read in" case. This also fixes a bug where we didn't set S_SYNC correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:45 -05:00
Steven Whitehouse	2ca99501fa	[GFS2] Fix page lock/glock deadlock This fixes a race between the glock and the page lock encountered during truncate in gfs2_readpage and gfs2_prepare_write. The gfs2_readpages function doesn't need the same fix since it only uses a try lock anyway, so it will fail back to gfs2_readpage in the case of a potential deadlock. This bug was spotted by Russell Cattelan. Cc: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:43 -05:00
Steven Whitehouse	c594d88664	[GFS2] Remove unused GL_DUMP flag There is no way to set the GL_DUMP flag, and in any case the same thing can be done with systemtap if required for debugging, so this removes it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:40 -05:00
Steven Whitehouse	f6e58f01e8	[GFS2] Don't copy meta_header for rgrp in and out The meta_header for an ondisk rgrp never changes, so there is no point copying it in and back out to disk. Also there is no reason to keep a copy for each rgrp in memory. The code already checks to ensure that the header is correct before it calls the routine to copy the data in, so that we don't even need to check whether its correct on disk in the functions in ondisk.c Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:36 -05:00
Steven Whitehouse	294caaa3b8	[GFS2] Tidy up 0 initialisations in inode.c We don't need to use endian conversions for 0 initialisations when creating a new on-disk inode. Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:33 -05:00
Steven Whitehouse	bfded27ba0	[GFS2] Shrink gfs2_inode (8) - i_vn This shrinks the size of the gfs2_inode by 8 bytes by replacing the version counter with a one bit valid/invalid flag. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:30 -05:00
Steven Whitehouse	a9583c7983	[GFS2] Shrink gfs2_inode (7) - di_payload_format This is almost never used. Its there for backward compatibility with GFS1. It doesn't need its own field since it can always be calculated from the inode mode & flags. This saves a bit more space in the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:26 -05:00
Steven Whitehouse	1a7b1eed58	[GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctime Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:23 -05:00
Steven Whitehouse	4f56110a00	[GFS2] Shrink gfs2_inode (5) - di_nlink Remove the di_nlink field in favour of inode->i_nlink and update the nlink handling to use the proper macros. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:20 -05:00
Steven Whitehouse	2933f9254a	[GFS2] Shrink gfs2_inode (4) - di_uid/di_gid Remove duplicate di_uid/di_gid fields in favour of using inode->i_uid/inode->i_gid instead. This saves 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:17 -05:00
Steven Whitehouse	b60623c238	[GFS2] Shrink gfs2_inode (3) - di_mode This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:14 -05:00
Steven Whitehouse	e7f14f4d09	[GFS2] Shrink gfs2_inode (2) - di_major/di_minor This removes the device numbers from this structure by using inode->i_rdev instead. It also cleans up the code in gfs2_mknod. It results in shrinking the gfs2_inode by 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:11 -05:00
Steven Whitehouse	af339c0241	[GFS2] Shrink gfs2_inode (1) - di_header/di_num The metadata header doesn't need to be stored in the incore struct gfs2_inode since its constant, and this patch removes it. Also, there is already a field for the inode's number in the struct gfs2_inode, so we don't need one in struct gfs2_dinode_host as well. This saves 28 bytes of space in the struct gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:07 -05:00
Steven Whitehouse	4cc14f0b88	[GFS2] Change argument to gfs2_dinode_print Change argument for gfs2_dinode_print in order to prepare for removal of duplicate fields between struct inode and struct gfs2_dinode_host. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:03 -05:00
Steven Whitehouse	ea744d01c6	[GFS2] Move gfs2_dinode_in to inode.c gfs2_dinode_in() is only ever called from one place, so move it to that place (in inode.c) and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:00 -05:00
Steven Whitehouse	891ea14712	[GFS2] Change argument to gfs2_dinode_in This is a preliminary patch to enable the removal of fields in gfs2_dinode_host which are duplicated in struct inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:57 -05:00
Steven Whitehouse	539e5d6b7a	[GFS2] Change argument of gfs2_dinode_out Everywhere this was called, a struct gfs2_inode was available, but despite that, it was always called with a struct gfs2_dinode as an argument. By making this change it paves the way to start eliminating fields duplicated between the kernel's struct inode and the struct gfs2_dinode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:54 -05:00
Al Viro	9c9ab3d541	[GFS2] gfs2 __user misannotation fix Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:49 -05:00
Al Viro	b44b84d765	[GFS2] gfs2 misc endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:46 -05:00
Al Viro	b62f963e1f	[GFS2] split and annotate gfs2_quota_change Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:41 -05:00
Al Viro	bd209cc017	[GFS2] split and annotate gfs2_statfs_change Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:38 -05:00
Al Viro	b5bc9e8b06	[GFS2] split and annotate gfs2_quota Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:35 -05:00
Al Viro	629a21e7ec	[GFS2] split and annotate gfs2_inum Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:32 -05:00
Al Viro	1e81c4c3e0	[GFS2] split and annotate gfs_rindex Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:29 -05:00
Al Viro	e928a76f95	[GFS2] split and annotate gfs2_meta_header Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:26 -05:00
Steven Whitehouse	2a2c98247b	[GFS2] Fix crc32 calculation in recovery.c Commit "[GFS2] split and annotate gfs2_log_head" resulted in an incorrect checksum calculation for log headers. This patch corrects the problem without resorting to copying the whole log header as the previous code used to. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:17 -05:00
Al Viro	5516762261	[GFS2] split and annotate gfs2_log_head Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:14 -05:00
Al Viro	e697264709	[GFS2] split and annotate gfs2_inum_range Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:11 -05:00
Al Viro	68826664d1	[GFS2] split and annotate gfs2_rgrp Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:07 -05:00
Al Viro	f50dfaf78c	[GFS2] split gfs2_sb Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:33:00 -05:00
Al Viro	5c6edb576f	[GFS2] gfs2_dinode_host fields are host-endian Annotated scalar fields, dropped unused ones. Note that it's not at all obvious that we want to convert all of them to host-endian... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:32:55 -05:00
Al Viro	3ca68df6ee	[GFS2] split gfs2_dinode into on-disk and host variants The latter is used as part of gfs2-private part of struct inode. It actually stores a lot of fields differently; for now the declaration is just cloned, inode field is switched and changes propagated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:32:50 -05:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
Steven Whitehouse	26d83dedf6	[GFS2] Fix OOM error handling Fix the OOM error handling in inode.c where it was possible for a NULL pointer to be dereferenced. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-06 08:59:42 -05:00
Steven Whitehouse	4a221953ed	[GFS2] Fix incorrect fs sync behaviour. This adds a sync_fs superblock operation for GFS2 and removes the journal flush from write_super in favour of sync_fs where it ought to be. This is more or less identical to the way in which ext3 does this. This bug was pointed out by Russell Cattelan <cattelan@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-06 08:59:16 -05:00
Alexey Dobriyan	eb1dc33aa2	[GFS2] don't panic needlessly First, SLAB_PANIC is unjustified. Second, all error propagating and backing out is in place. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-06 08:58:52 -05:00
OGAWA Hirofumi	7011774db8	[PATCH] gfs2: ->readpages() fixes This just ignores the remaining pages, and removes the unneeded unlock_pages(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Steven French <sfrench@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-03 12:27:57 -08:00
Adrian Bunk	b7d8ac3e17	[GFS2] gfs2_dir_read_data(): fix uninitialized variable usage In the "if (extlen)" case, "bh" was used uninitialized. This patch changes the code to what seems to have been intended. Spotted by the Coverity checker. This patch also removes a pointless "bh = NULL" assignment (the variable is never accessed again after this point). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:16:20 -04:00
Adrian Bunk	bbbe451273	[GFS2] fs/gfs2/ops_fstype.c:fill_super_meta(): fix NULL dereference Don't dereference new->s_root when we do know it's NULL. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:15:57 -04:00
Adrian Bunk	348acd48f0	[GFS2] fs/gfs2/dir.c:gfs2_dir_write_data(): don't use an uninitialized variable In the "if (extlen)" case, "new" might be used uninitialized. Looking at the code, it should be initialized to 0. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:15:31 -04:00
Adrian Bunk	b0cb66955f	[GFS2] fs/gfs2/ops_fstype.c:gfs2_get_sb_meta(): remove unused variable The Coverity checker spotted this unused variable. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:15:19 -04:00
Adrian Bunk	abbdbd2065	[GFS2] fs/gfs2/dir.c:gfs2_dir_write_data(): remove dead code The Coverity checker spotted this obviously dead code. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:14:42 -04:00
Al Viro	a2d7d021d7	[GFS2] gfs2 endianness bug: be16 assigned to be32 field Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:14:08 -04:00
Steven Whitehouse	23591256d6	[GFS2] Fix bmap to map extents properly This fix means that bmap will map extents of the length requested by the VFS rather than guessing at it, or just mapping one block at a time. The other callers of gfs2_block_map are audited to ensure they send the correct max extent lengths (i.e. set bh->b_size correctly). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:13:40 -04:00
Russell Cattelan	c312c4fdc8	[GFS2] Pass the correct value to kunmap_atomic Pass kaddr rather than (incorrect) struct page to kunmap_atomic. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:11:13 -04:00
Steven Whitehouse	fe1a698ffe	[GFS2] Fix bug where lock not held The log lock needs to be held when manipulating the counter for the number of free journal blocks. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:10:55 -04:00
Steven Whitehouse	f5c54804d9	[GFS2] Fix uninitialised variable This fixes a bug where, in certain cases an uninitialised variable could cause a dereference of a NULL pointer in gfs2_commit_write(). Also a typo in a comment is fixed at the same time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:10:15 -04:00
Russell Cattelan	52ae7b7935	[GFS2] Fix a size calculation error Fix a size calculation error. The size was incorrect being computed as a negative length and then being passed to an unsigned parameter. This in turn would cause the allocator to think it needed enough meta data to store a gigabyte file for every file created. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:09:54 -04:00
Al Viro	4b4fcaa1a9	[PATCH] misuse of strstr Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-11 11:17:06 -07:00
Steven Whitehouse	7ecdb70a0e	[GFS2] Fix endian bug for de_type Missing endian conversion for the de_type field. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-03 21:03:35 -04:00
Ryan O'Hara	fcb47e0bd2	[GFS2] Initialize SELinux extended attributes at inode creation time. This patch has gfs2_security_init declared as a static function, which is correct. As a result, the declaration of this function in inode.h is removed (and thus inode.h is unchanged). Also removed #include eaops.h, which is not needed. Signed-Off-By: Ryan O'Hara <rohara@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-03 11:57:35 -04:00
Steven Whitehouse	ddacfaf76d	[GFS2] Move logging code into log.c (mostly) This moves the logging code from meta_io.c into log.c and glops.c. As a result the routines can now be static and all the logging code is together in log.c, leaving meta_io.c with just metadata i/o code in it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-03 11:10:41 -04:00
Steven Whitehouse	f92a0b6ff4	[GFS2] Mark nlink cleared so VFS sees it happen This does nothing atm, but will be required for later support of r/o bind mounts. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 16:01:53 -04:00
Steven Whitehouse	409e185d23	[GFS2] Two redundant casts removed Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 14:20:43 -04:00
Steven Whitehouse	48516ced21	[GFS2] Remove uneeded endian conversion In many places GFS2 was calling the endian conversion routines for an inode even when only a single field, or a few fields might have changed. As a result we were copying lots of data needlessly. This patch replaces those calls with conversion of just the required fields in each case. This should be faster and easier to understand. There are still other places which suffer from this problem, but this is a start in the right direction. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 12:39:19 -04:00
Steven Whitehouse	3cf1e7bed4	[GFS2] Remove duplicate sb reading code For some reason we had two different sets of code for reading in the superblock. This removes one of them in favour of the other. Also we don't need the temporary buffer for the sb since we already have one in the gfs2 sb itself. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 11:49:41 -04:00
Steven Whitehouse	2e565bb69c	[GFS2] Mark metadata reads for blktrace Mark the metadata reads so that blktrace knows what they are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 11:38:25 -04:00
Steven Whitehouse	128e5ebaf8	[GFS2] Remove iflags.h, use FS_ Update GFS2 in the light of David Howells' patch: [PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6] `36695673b0` which calls the filesystem-independent flags FS_..._FL. As a result we no longer need the flags.h file and the conversion routine is moved into the GFS2 source code. Userland programs which used to include iflags.h should now include fs.h and use the new flag names. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 11:24:43 -04:00
Steven Whitehouse	d00223f169	[GFS2] Fix code style/indent in ops_file.c Fix a couple of minor issues. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 10:28:05 -04:00
Andrew Morton	930cc237d6	[GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix Fix GFS for streamline-generic_file_-interfaces-and-filemap.patch Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 09:02:54 -04:00
Badari Pulavarty	9c9eb21eee	[GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits) This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 09:02:12 -04:00
Theodore Ts'o	825f9075d7	[GFS2] inode-diet: Eliminate i_blksize from the inode structure This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2006-09-28 08:32:51 -04:00
Theodore Ts'o	bba9dfd835	[GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something where the union will actually be used. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2006-09-28 08:32:24 -04:00
Steven Whitehouse	7e18c02be7	[GFS2] Fix bug in Makefiles for lock modules The Makefile had the wrong CONFIG_ variable in it so that in case GFS2 was y and the lock modules were m, they were not getting built properly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-27 12:20:06 -04:00
Steven Whitehouse	907b9bceb4	[GFS2/DLM] Fix trailing whitespace As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-25 09:26:04 -04:00
Steven Whitehouse	7276b3b0c7	[GFS2] Tidy up meta_io code Fix a bug in the directory reading code, where we might have dereferenced a NULL pointer in case of OOM. Updated the directory code to use the new & improved version of gfs2_meta_ra() which now returns the first block that was being read. Previously it was releasing it requiring following code to grab the block again at each point it was called. Also turned off readahead on directory lookups since we are reading a hash table, and therefore reading the entries in order is very unlikely. Readahead is still used for all other calls to the directory reading function (e.g. when growing the hash table). Removed the DIO_START constant. Everywhere this was used, it was used to unconditionally start i/o aside from a couple of places, so I've removed it and made the couple of exceptions to this rule into separate functions. Also hunted through the other DIO flags and removed them as arguments from functions which were always called with the same combination of arguments. Updated gfs2_meta_indirect_buffer to be a bit more efficient and hopefully also be a bit easier to read. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-21 17:05:23 -04:00
Steven Whitehouse	56965536b8	[GFS2] Remove unused constants Three of the DIO constants were not being used, so remove them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-20 15:48:09 -04:00
Steven Whitehouse	f0e522a901	[GFS2] Remove "NFS only" readdir path This code path shouldn't be needed, so remove it for now. This tidies things up. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-19 16:41:11 -04:00
Steven Whitehouse	74669416f7	[GFS2] Use list_for_each_entry_safe_reverse in gfs2_ail1_start() This is an attempt to fix Red Hat bz 204364. I don't hit it all the time, but with these changes, running postmark which used to trigger it on a regular basis no longer appears to. So I'm not saying that its 100% certain that its fixed, but it does look promising at the moment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-19 11:17:38 -04:00
Fabio Massimo Di Nitto	7d308590ae	[GFS2] Export lm_interface to kernel headers lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-19 08:45:18 -04:00
akpm@osdl.org	f3b30912e0	[GFS2] inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-vs-gfs2 i_blksize got removed in -mm. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-19 08:43:01 -04:00
Steven Whitehouse	07903c02d0	[GFS2] Tweek unlock test in readpage() This makes the unlock test a bit simpler. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:30:05 -04:00
Russell Cattelan	dc41aeedef	[GFS2] Fix for mmap() bug in readpage Fix for Red Hat bz 205307. Don't need to lock in readpage if the higher level code has already grabbed the lock. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:26:48 -04:00
Steven Whitehouse	7a6bbacbb8	[GFS2] Map multiple blocks at once where possible This is a tidy up of the GFS2 bmap code. The main change is that the bh is passed to gfs2_block_map allowing the flags to be set directly rather than having to repeat that code several times in ops_address.c. At the same time, the extent mapping code from gfs2_extent_map has been moved into gfs2_block_map. This allows all calls to gfs2_block_map to map extents in the case that no allocation is taking place. As a result reads and non-allocating writes should be faster. A quick test with postmark appears to support this. There is a limit on the number of blocks mapped in a single bmap call in that it will only ever map blocks which are pointed to from a single pointer block. So in other words, it will never try to do additional i/o in order to satisfy read-ahead. The maximum number of blocks is thus somewhat less than 512 (the GFS2 4k block size minus the header divided by sizeof(u64)). I've further limited the mapping of "normal" blocks to 32 blocks (to avoid extra work) since readpages() will currently read a maximum of 32 blocks ahead (128k). Some further work will probably be needed to set a suitable value for DIO as well, but for now thats left at the maximum 512 (see ops_address.c:gfs2_get_block_direct). There is probably a lot more that can be done to improve bmap for GFS2, but this is a good first step. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:18:23 -04:00
David Teigland	65952fb4e9	[GFS2] print mount errors related to sysfs Print an error message if mount fails in setting up the sysfs files. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 09:43:23 -04:00
Steven Whitehouse	a8336344a5	[GFS2] Fix glock hash clearing A one liner bug fix to prevent the return value being wrong when more than one superblock is mounted. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-14 13:57:38 -04:00
Steven Whitehouse	faa31ce85f	[GFS2] Tidy up log.c Based upon previous feedback from lkml and also removing some commented out debugging which is no longer needed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-13 11:13:27 -04:00
Steven Whitehouse	16feb9fec0	[GFS2] Use atomic_t rather than kref in glock.c Use atomic_t as the ref count in glocks rather than a kref. This is another step towards using RCU for the glock hash. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-13 10:43:37 -04:00
Steven Whitehouse	b6397893a5	[GFS2] Use hlist for glock hash chains This results in smaller list heads, so that we can have more chains in the same amount of memory (twice as many). I've multiplied the size of the table by four though - this is because we are saving memory by not having one lock per chain any more. So we land up using about the same amount of memory for the hash table as we did before I started these changes, the difference being that we now have four times as many hash chains. The reason that I say "about the same amount of memory" is that the actual amount now depends upon the NR_CPUS and some of the config variables, so that its not exact and in some cases we do use more memory. Eventually we might want to scale the hash table size according to the size of physical ram as measured on module load. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-12 10:10:01 -04:00
Steven Whitehouse	2426443460	[GFS2] Rewrite of examine_bucket() The existing implementation of this function in glock.c was not very efficient as it relied upon keeping a cursor element upon the hash chain in question and moving it along. This new version improves upon this by using the current element as a cursor. This is possible since we only look at the "next" element in the list after we've taken the read_lock() subsequent to calling the examiner function. Obviously we have to eventually drop the ref count that we are then left with and we cannot do that while holding the read_lock, so we do that next time we drop the lock. That means either just before we examine another glock, or when the loop has terminated. The new implementation has several advantages: it uses only a read_lock() rather than a write_lock(), so it can run simultaneously with other code, and it doesn't need a "plug" element, so that it removes a test not only from this list iterator, but from all the other glock list iterators too. So it makes things faster and smaller. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-11 21:40:30 -04:00
Steven Whitehouse	94610610f1	[GFS2] Remove unused function from glock.c The callback for iopen locks is unused, so this removes it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-09 18:59:27 -04:00
Steven Whitehouse	a5e08a9ef5	[GFS2] Add consts to glock sorting function Add back the consts which were casted away in the glock sorting function. Also add early exit code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-09 17:07:05 -04:00
Steven Whitehouse	087efdd391	[GFS2] Make glock hash locks proportional to NR_CPUS Make the number of locks used for hash chains in glock.c proportional to NR_CPUS. Also move constants for the number of hash chains into glock.c from incore.h since they are not used outside of glock.c. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-09 16:59:11 -04:00
Steven Whitehouse	ff6af411ae	[GFS2] vfree should be kfree (II) The superblock is now created with kmalloc, not vmalloc. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-09 16:56:34 -04:00
Steven Whitehouse	37b2fa6a24	[GFS2] Move rwlocks in glock.c into their own array This splits the rwlocks guarding the hash chains of the glock hash table into their own array. This will reduce memory usage in some cases due to better alignment, although the real reason for doing it is to allow the two tables to be different sizes in future (i.e. the locks will be sized proportionally with the max number of CPUs and the hash chains sized proportionally with the size of physical memory). In order to allow this, the gl_bucket member of struct gfs2_glock has now become gl_hash, so we record the hash rather than a pointer to the bucket itself. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-08 13:35:56 -04:00
Steven Whitehouse	9b47c11d1c	[GFS2] Use void * instead of typedef for locking module interface As requested by Jan Engelhardt, this removes the typedefs in the locking module interface and replaces them with void *. Also since we are changing the interface, I've added a few consts as well. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-08 10:17:58 -04:00
Steven Whitehouse	a2c4580797	[GFS2] vfree should be kfree This was missed in an earlier patch when changing over from vmalloc to kmalloc for the superblock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-08 10:13:03 -04:00
Steven Whitehouse	5ce311ebdb	[GFS2] Remove unused sync_lvb code from lock modules This code is no longer used for anything and can be removed from the locking modules. The sync_lvb function is not required as this happens automatically with the current locking system. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 17:35:48 -04:00
Steven Whitehouse	1c089c325d	[GFS2] Remove one typedef This removes one of the typedefs from the locking interface. It is replaced by a forward declaration of the gfs2 superblock. The other two are not so easy to solve since in their case, they can refer to one of two possible structures. Cc: David Teigland <teigland@redhat.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 15:50:20 -04:00
Steven Whitehouse	b9201ce9a8	[GFS2] Forgot to remove unused include vmalloc.h Exactly as the subject line says. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 14:46:39 -04:00
Steven Whitehouse	85d1da67f7	[GFS2] Move glock hash table out of superblock There are several reasons why we want to do this: - Firstly its large and thus we'll scale better with multiple GFS2 fs mounted at the same time - Secondly its easier to scale its size as required (thats a plan for later patches) - Thirdly, we can use kzalloc rather than vmalloc when allocating the superblock (its now only 4888 bytes) - Fourth its all part of my plan to eventually be able to use RCU with the glock hash. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 14:40:21 -04:00
Steven Whitehouse	b8547856f9	[GFS2] Add gfs2 superblock to glock hash function This is another patch preparing for sharing of the glock hash table between different gfs2 mounts. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 13:12:27 -04:00
Steven Whitehouse	62f140c173	[GFS2] Add brackets in locking/dlm/sysfs.c As per Jan Engelhardt's request. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 09:54:55 -04:00
David Teigland	3204a6c055	[GFS2] use snprintf for sysfs show Use snprintf(buf, PAGE_SIZE, ...) instead of sprintf for sysfs show methods. Per instructions in Documentation/filesystems/sysfs.txt Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 09:43:34 -04:00
Jan Engelhardt	c53921248c	[GFS2] More style changes Remove redundant brackets Signed-off-by: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 09:42:56 -04:00
Steven Whitehouse	7b62536141	[GFS2] Add a comment in ops_export.c Add a comment explaining the slightly odd construct used to pass error values back. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 15:56:17 -04:00
Steven Whitehouse	2c1e52aa90	[GFS2] More style fixes As per Jan Engelhardt's follow up emails, here are a few small fixes which were missed earlier. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 15:41:57 -04:00
Steven Whitehouse	48fac17909	[GFS2] Remove unused code from quota As per Jan Engelhardt's request, some unused code is removed and some consts added in the quota code. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 15:17:12 -04:00
Steven Whitehouse	a67cdbd457	[GFS2] Style changes in logging code As per Jan Engelhardt's comments, removed some unused code and removed some brackets which were not required. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 14:41:30 -04:00
Steven Whitehouse	cca195c5c0	[GFS2] Extended attribute code style changes As per Jan Engelhardt's request and also a few of my own. It has been possible to add a few more const to the code as a result of the change in gfs2_ea_name2type. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 13:15:18 -04:00
Steven Whitehouse	16910427e1	[GFS2] Style changes in rgrp.c Change one constant plus remove a redundant !!. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 11:15:45 -04:00
Steven Whitehouse	ea67eedb21	[GFS2] Fix end of multi-line structures As per Jan Engelhardt's request, I've added a ',' to the end of each of the multi-line structures which didn't already have one (most already did). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 10:53:09 -04:00
Steven Whitehouse	f2f7ba5237	[GFS2] Make headers compile on their own As per Jan Engelhardt's comments, this should make all the headers compile on their own by including and/or declaring structures early. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 10:39:21 -04:00
Steven Whitehouse	2bdbc5d739	[GFS2] Directory code style changes As per comments from Jan Engelhardt, remove redundant casts, redundant endian conversions, add a smattering of const and rewrite the dirent_next function in order to avoid as many casts as possible. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-05 09:34:20 -04:00
Steven Whitehouse	5acd396734	[GFS2] Some further style changes Introduce a couple of new constants which make the NFS filehandle sizes that GFS2 uses a bit clearer. Also fix one or two minor issues as per Jan Engelhardt's sixth email. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 16:16:45 -04:00
Steven Whitehouse	26c1a57412	[GFS2] More code style updates As per Jan Engelhardt's fifth email. This has most of the changes recommended, which is the removal of casts which are not required, some indenting fixes and similar. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 15:32:10 -04:00
Steven Whitehouse	0bd5996a00	[GFS2] Style changes in ops_address.c As per the remainder of Jan Engelhardt's fourth email comments, remove a cast that's not required. Also tidy up the "limit" code in stuck_releasepage(). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:59:35 -04:00
Steven Whitehouse	dd538c832a	[GFS2] Spelling sentinal -> sentinel A spelling mistake (one of mine). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:53:30 -04:00
Steven Whitehouse	38c60ef228	[GFS2] Use const in endian conversion routines Use const in endian conversion and printing of on-disk structures. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:48:37 -04:00
Steven Whitehouse	82ffa51637	[GFS2] More style changes As per Jan Engelhardt's fourth email, this is the first part of the change set with a few minor style points. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:47:06 -04:00
Steven Whitehouse	c26687113a	[GFS2] Remove a cast, tidy gfs2_inode_attr_in The remains of the changes for Jan Engelhardt's third email. Remove a cast and tidy up gfs2_inode_attr_in. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 13:55:48 -04:00
Steven Whitehouse	cd915493fc	[GFS2] Change all types to uX style This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 12:49:07 -04:00
Steven Whitehouse	a91ea69ffd	[GFS2] Align all labels against LH side This makes everything consistent. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 12:04:26 -04:00
Steven Whitehouse	75d3b817a0	[GFS2] Tidy up bmap/inode code As per Jan Engelhardt's third set of comments, this make various code style changes and moves the structures from format.h into super.c, which was the only place that format.h was actually used. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 11:41:31 -04:00
Steven Whitehouse	5029996547	[GFS2] Tidy up locking code As per Jan Engelhardt's second email, this removes some unused code, and fixes up indenting in various places. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 09:49:55 -04:00
Steven Whitehouse	e9fc2aa091	[GFS2] Update copyright, tidy up incore.h As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianness annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian-independent format to avoid needing a structure in host endianness too. As a result the endianness conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h that's used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-01 11:05:15 -04:00
Steven Whitehouse	623d93555c	[GFS2] Fix releasepage bug (fixes direct i/o writes) This patch fixes three main bugs. Firstly the direct i/o get_block was returning the wrong return code in certain cases. Secondly, the GFS2's releasepage function was not dealing with cases when clean, ordered buffers were found still queued on a transaction (which can happen depending on the ordering of journal flushes). Thirdly, the journaling code itself needed altering to take account of the after effects of removing the clean ordered buffers from the transactions before a journal flush. The releasepage bug did also show up under "normal" buffered i/o as well, so its not just a fix for direct i/o. In fact its not normally used in the direct i/o path at all, except when flushing existing buffers after performing a direct i/o write, but that was the code path that led us to spot this. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-31 12:14:44 -04:00
Steven Whitehouse	899be4d3b7	[GFS2] Add superblock into key for glock lookups This adds the superblock as a key for glock lookups. Since the glocks are already stored in a per-superblock table, this has no effect at the moment. Later on this will change though. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-30 12:50:28 -04:00
Steven Whitehouse	d6a5372768	[GFS2] Use const on glock lookup key Use const for the glock name which is being used as a lookup key in the glock hash table. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-30 11:16:23 -04:00
Steven Whitehouse	ec45d9f583	[GFS2] Use slab properly with glocks We can take advantage of the slab allocator to ensure that all the list heads and the spinlock (plus one or two other fields) are initialised by slab to speed up allocation of glocks. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-30 10:36:52 -04:00
Steven Whitehouse	5e2b0613ed	[GFS2] Remove unused code from glock layer Remove the unused sync feature from glocks. This is currently done by calling the required functions to sync pages/blocks directly so this code isn't needed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-30 09:38:30 -04:00
Steven Whitehouse	8fb4b536e7	[GFS2] Make glock operations const For all the usual reasons of enforcing correctness and potentially reducing code size, this patch makes the glock operations const. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-30 09:30:00 -04:00
Abhijith Das	8638460540	[GFS2] Allow mounting of gfs2 and gfs2meta at the same time This patch allows the simultaneous mounting of gfs2meta and gfs2 filesystems. A restriction however is that a gfs2meta fs may only be mounted if its corresponding gfs2 filesystem is also mounted. Also, a gfs2 filesystem cannot be unmounted before its gfs2meta filesystem. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-25 17:19:55 -04:00
Benjamin Marzinski	5dc39fe621	[GFS2] Fix journal off-by-one error log_refund() incorrectly assumed that if a transaction had been touched, it always committed buffers to the incore log. Thus, when you got around to flushing the log, you would need one more block than you committed, to account for the header. So it automatically set reserved to 1, which had the effect of making sdp->sd_log_blks_reserved one greater when you got to gfs2_log_flush(). However, if you don't actually commit anything to the incore log between flushes, you don't need the header, because you aren't writing anything out. With this patch, log_refund() only increments reserved to account for the header if something has been committed since the last flush. Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-25 09:57:41 -04:00
Steven Whitehouse	a2242db090	[GFS2] Speed up scanning of glocks I noticed the gfs2_scand seemed to be taking a lot of CPU, so in order to cut that down a bit, here is a patch. Firstly the type of a glock is a constant during its lifetime, so that it's possible to check this without needing locking. I've moved the (common) case of testing for an inode glock outside of the glmutex lock. Also there was a mutex left over from when the glock cache was master of the inode cache. That isn't required any more so I've removed that too. There is probably scope for further speed ups in the future in this area. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-24 17:03:05 -04:00
Steven Whitehouse	166afccd71	[GFS2] Tidy up error handling in gfs2_releasepage() This should clarify the logic in gfs2_releasepage() relating to error handling as well as making the response to errors a bit more graceful. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-24 15:59:40 -04:00
Steven Whitehouse	b8e1aabf21	[GFS2] Another list_del bug Another case where list_del should be list_del_init. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-22 16:25:50 -04:00
Steven Whitehouse	08867605e1	[GFS2] Fix to list_del in lops.c A list_del should have been a list_del_init in lops.c which was resulting in incorrect status returns from list_empty(). Signed-off-by: Steven Whitheouse <swhiteho@redhat.com>	2006-08-22 11:03:57 -04:00
Steven Whitehouse	15d00c0b91	[GFS2] Fix leak of gfs2_bufdata This fixes a memory leak of struct gfs2_bufdata and also some problems in the ordered write handling code. It needs a bit more testing, but I believe that the reference counting of ordered write buffers should now be correct. This is aimed at fixing Red Hat bugzilla: #201028 and #201082 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-18 15:51:09 -04:00
Russell Cattelan	8872187780	[GFS2] Fix a couple of refcount leaks. recovery.c add a brelse to deal with gfs2_replay_read_block being called twice on the same block. add a dput to drop the ref count on the root inode. This was causing lingering glocks and thus causing a mount failure to hang. Fix an endian conversion macro that was swizzling 16bits when it should have been swizzling 32. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-10 17:18:59 -04:00
Steven Whitehouse	f4387149ec	[GFS2] Fix lack of buffers in writepage bug In some cases we can enter write page without there being buffers attached to the page. In this case the function to add gfs2_bufdata to the buffers fails silently causing further failures down the stack. This fix ensures that we always add buffers in writepage if they didn't already exist (mmap is one way to trigger this). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-08 13:23:19 -04:00
Steven Whitehouse	2b557f6dc7	[GFS2] Fix gfs_ prefix in locking.c The previous patch didn't change all the gfs_ to gfs2_ so this is the remainder. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-07 11:12:30 -04:00
David Teigland	08eac93a68	[GFS2] match plock result with correct request When the result of a posix lock request is read it needs to be matched up with the correct waiting request. The owner field needs to be used in the comparison since more than one process may be waiting for locks on the same file. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-07 08:46:51 -04:00
David Teigland	3120ec541e	[GFS2] lockproto api prefix Use the gfs2_ prefix on the register/unregister functions for the lock modules. The gfs_ prefix was left from an old idea on how to share these with gfs1. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-07 08:46:19 -04:00
Steven Whitehouse	59a1cc6bda	[GFS2] Fix lock ordering bug in page fault path Mmapped files were able to trigger a lock ordering bug. Private maps do not need to take the glock so early on. Shared maps do unfortunately, however we can get around that by adding a flag into the flags for the struct gfs2_file. This only works because we are taking an exclusive lock at this point, so we know that nobody else can be racing with us. Fixes Red Hat bugzilla: #201196 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-04 15:41:22 -04:00
Steven Whitehouse	899bb26450	[GFS2] Fix bug in directory code This was a nasty bug which resulted in corruption of hash tables in the directory code with larger directories. We forgot to increment a pointer in the read/write routines internal to the directory code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-01 15:28:57 -04:00
David Teigland	de9b75d31e	[GFS2] add plock owner We need to use fl_owner instead of fl_pid to track the owner of a posix lock. Pass the owner value out to user space where cluster plocks are managed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-31 15:44:29 -04:00
Steven Whitehouse	420b9e5e45	[GFS2] Tidy up in various files Tidy up some files and remove an unused routine in meta_io.h. Also added a bit of extra debugging in meta_io.h. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-31 15:42:17 -04:00
Steven Whitehouse	5dd9feafb3	[GFS2] Fix bug in clear_inode We should have been waiting for lock demotion to finish in clear_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-28 14:52:33 -04:00
Steven Whitehouse	2b98a54f79	[GFS2] Fix bug in super block reading code This gets the argument to submit_bio() correct. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-27 16:37:48 -04:00
Steven Whitehouse	dd894be8df	[GFS2] Change some allocations to GFP_NOFS Some allocations in rgrp.c should have been GFP_NOFS rather than GFP_KERNEL. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-27 14:29:00 -04:00
Steven Whitehouse	f45b7ddd2b	[GFS2] Use a bio to read the superblock This means that we don't need to create a special inode just to contain a struct address_space in order to read a single disk block. Instead we read the disk block directly. It's slightly faster, and uses slightly less memory, but the real reason for doing this is that it removes a special case from the glock code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-27 13:53:53 -04:00
Steven Whitehouse	ba7f72901c	[GFS2] Remove page.[ch] The remaining routines in page.c were all only used in one other file, so they are now moved into the files where they are referenced and made static. Thus page.[ch] are no longer required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 11:27:10 -04:00
Steven Whitehouse	f25ef0c1b4	[GFS2] Tidy gfs2_unstuffer_page Tidy up gfs2_unstuffer_page by: a) Moving it into bmap.c b) Making it static c) Calling it directly from gfs2_unstuff_dinode d) Updating all callers of gfs2_unstuff_dinode due to one less required argument. It doesn't change the behaviour at all. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 10:51:20 -04:00
Steven Whitehouse	a9e5f4d078	[GFS2] Alter direct I/O path As per comments received, alter the GFS2 direct I/O path so that it uses the standard read functions "out of the box". Needs a small change to one of the VFS functions. This reduces the size of the code quite a lot and also removes the need for one new export. Some more work remains to be done, but this is the bones of the thing. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-25 17:24:12 -04:00
Abhijith Das	52f341cf75	[GFS2] gfs2_set_flags double locking patch traced the "umount hang due to spurious glock" issue that I was having with gfs2meta. It's in the do_gfs2_set_flags function, which does a gfs2_holder_init as well as a gfs2_glock_nq_init (increases ref count by 2 instead of 1). Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-21 02:03:21 -04:00
David Teigland	c5921fd02e	[GFS2] fix typo in locking/dlm Typo causes the error value from the wrong lock to be checked. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-21 01:57:40 -04:00
Steven Whitehouse	e0f2bf780a	[GFS2] Fix endian conversion bug Fix an endian conversion bug in log.c spotted by Kevin Anderson. Cc: Kevin Anderson <kanderso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-17 09:36:28 -04:00
Steven Whitehouse	634ee0b9f4	[GFS2] Fix use after free bug in dir.c Fix a use after free bug in dir.c spotted by Kevin Anderson. Cc: Kevin Anderson <kanderso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-17 09:32:37 -04:00
Wendy Cheng	2eb168ca94	[GFS2] NFS update Update the NFS filehandles so that they contain the file type. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-13 09:24:48 -04:00
Steven Whitehouse	4da3c6463e	[GFS2] Fix a couple of warnings in dir.c Fix a couple of compiler warnings in dir.c caused by potentially uninitialised variables. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-11 13:19:13 -04:00
Abhijith Das	b2a580d87b	[PATCH] patch to init di_payload_format field in gfs2_dinode A missing initialisation when creating a new on disk inode. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-11 09:54:17 -04:00
Steven Whitehouse	f3bba03fd1	[GFS2] Fix deadlock in memory allocation We must not call GFP_KERNEL memory allocations while we are holding the log lock (read or write) since that may trigger a log flush resulting in a deadlock. Eventually we need to fix the locking in log.c, for now this solves the problem at the expense of not freeing up memory as fast as we would like to. This needs to be revisited later on. Cc: Kevin Anderson <kanderso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-11 09:50:54 -04:00
Steven Whitehouse	4340fe6253	[GFS2] Add generation number This adds a generation number for the eventual use of NFS to the ondisk inode. It's backward compatible with the current code since it doesn't really matter what the generation number is to start with, and indeed since it's set to zero, due to it being taken from padding in both the inode and rgrp header, it should be fine. The eventual plan is to use this rather than no_formal_ino in the NFS filehandles. At that point no_formal_ino will be unused. At the same time we also add a releasepages call back to the "normal" address space for gfs2 inodes. Also I've removed a one-liner function that's not required any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-11 09:46:33 -04:00
Steven Whitehouse	ffeb874b2b	[GFS2] Bug fix to gfs2_readpages() This fixes a bug where we were releasing a page incorrectly sometimes when reading a stuffed file. This fixes the bug that Kevin reported when using Xen. Cc: Kevin Anderson <kanderso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-10 15:47:01 -04:00
Steven Whitehouse	dc3e130a08	[GFS2] Remove unused code from dir.c Remove a couple of commented out, and unused lines of code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-10 11:19:29 -04:00
Steven Whitehouse	29937ac6ca	[GFS2] Fixes to scanning of glocks (again) This really is the correct fix this time. We just ignore all glocks associated with inodes until the inodes are pushed from the inode cache. At that point the glocks are queued for reclaim, so we don't need to do it here. Also fix one or two other minor bugs. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-06 17:58:03 -04:00
Steven Whitehouse	627add2d13	[GFS2] Correct logic in glock scanner Under certain circumstances the glock scanning logic would demote locks which ought not to have been selected for demotion. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-05 13:16:19 -04:00
Steven Whitehouse	fd4de2d41a	[GFS2] Add cast for printk Cast a uint64_t to unsigned long long for a printk. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-05 13:14:59 -04:00
Steven Whitehouse	ecb1460dc4	[GFS2] Make GFS2 work with lock validator Change our one existing old-style lock initialiser to a new-style one. This allows the lock validator to work as intended. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-05 10:41:39 -04:00
Steven Whitehouse	faac9bd0e3	[GFS2] Fix locking for Direct I/O reads We need to hold i_mutex when doing direct i/o reads. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-05 08:24:34 -04:00
Steven Whitehouse	b0dd9308b7	[GFS2] Mark file_operations const As per Arjan's patches: http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=99ac48f54a91d02140c497edc31dc57d4bc5c85d and http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=4b6f5d20b04dcbc3d888555522b90ba6d36c4106 make the GFS2 file_operations structures const. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-03 13:47:02 -04:00
Steven Whitehouse	66de045d9f	[GFS2] Make our address_space_operations const As per Christoph's patch: http://www.kernel.org/git/?p=linux/kernel/git/steve/gfs2-2.6.git;a=commitdiff;h=f5e54d6e53a20cef45af7499e86164f0e0d16bb2 We mark struct address_space_operations const in GFS2. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-03 13:37:30 -04:00
Steven Whitehouse	0c0834a30c	[GFS2] API change for gfs2_statfs The kernel API for super_operations->statfs() has changed so this updates GFS2 to the new API. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-03 11:38:01 -04:00
Andrew Morton	ccd6efd0cd	[patch 1/1] gfs2: get_sb_dev() fix Update GFS2 for dhowells API changes. Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-03 11:23:09 -04:00
Steven Whitehouse	02630a12c7	[GFS2] Remove dependence on tty_write_message() This removes the call in GFS2 to tty_write_message and replaces it with a printk. As the export was added by GFS2, we remove this as well. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-03 11:20:06 -04:00
Steven Whitehouse	af18ddb886	[GFS2] Eliminate one instance of __GFP_NOFAIL This removes one instance of GFP_NOFAIL from the glock callback function. It also fixes a bug where a , was used at a line end rather than ; causing unintended results. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-24 15:42:21 -04:00
Steven Whitehouse	a53311d4d9	[GFS2] Use generic_file_sendfile directly Don't use a wrapper for generic_file_sendfile but call it directly. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-23 16:16:29 -04:00
David Teigland	a464418425	[GFS2] gfs2/dlm: mailing list and web page List new development mailing list and correct web page url. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-22 15:29:57 -04:00
Steven Whitehouse	bdd512aeea	[GFS2] Remove unused flag The flag GIF_MIN_INIT is no longer used or required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-22 15:26:33 -04:00
Adrian Bunk	43f5d210a0	[GFS2] [-mm patch] fs/gfs2/: make code static This patch makes the following needlessly global code static: - eaops.c: struct gfs2_security_eaops - rgrp.c: gfs2_free_uninit_di() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-22 11:16:40 -04:00
Steven Whitehouse	faf450ef4a	[GFS2] Remove gfs2_repermission gfs2_repermission is just a wrapper for permission, so remove it and call permission directly where required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-22 10:59:10 -04:00
Steven Whitehouse	d9d1ca3050	[GFS2] Fix double locking problem in rename The rename inode operation was trying to lock the same inode twice in the case of renaming with the source and destination directories the same. We now test for this and just lock once. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-21 15:38:17 -04:00
Steven Whitehouse	0d42e54220	[GFS2] Remove unused ra_state variable As per Nick Piggin's comments on lkml, remove the unused ra_state variable. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au>	2006-06-20 16:13:49 -04:00
David Woodhouse	0239c4ae8a	[GFS2] Fix printk format warnings in DLM code fs/gfs2/locking/dlm/thread.c: In function ‘process_complete’: fs/gfs2/locking/dlm/thread.c:56: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’ fs/gfs2/locking/dlm/thread.c:69: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ fs/gfs2/locking/dlm/thread.c:102: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’ fs/gfs2/locking/dlm/thread.c:124: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ fs/gfs2/locking/dlm/thread.c:146: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ fs/gfs2/locking/dlm/thread.c:148: warning: format ‘%llx’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2006-06-20 13:48:31 +01:00
David Woodhouse	695165dfba	[GFS2] Fix use of bitops on unsigned int (struct gfs2_holder->gh_iflags) fs/gfs2/glock.c: In function ‘gfs2_holder_get’: fs/gfs2/glock.c:439: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type fs/gfs2/glock.c: In function ‘rq_promote’: fs/gfs2/glock.c:512: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type fs/gfs2/glock.c:526: warning: passing argument 2 of ‘set_bit’ from incompatible pointer type ... Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2006-06-20 13:44:27 +01:00
Steven Whitehouse	b61dde795f	[GFS2] Always include glock in transaction Include the glock in the transaction, even when not journaling data in order that ordered write data will be correctly flushed when the lock is released. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-19 10:51:11 -04:00
Steven Whitehouse	3a8476dda1	[GFS2] Remove debugging printks A few of my printks slipped through last time. Also fix a couple of minor bugs. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-19 09:10:39 -04:00
Steven Whitehouse	feaa7bba02	[GFS2] Fix unlinked file handling This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) it's the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-14 15:32:57 -04:00
Steven Whitehouse	01eb7c0796	[GFS2] Fix warning on impossible event in eattr code The caller ensures that ea_list_i() is never called with an invalid type, so let's BUG() if we see one. This clears up a couple of compiler warnings too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-06 17:31:30 -04:00
Steven Whitehouse	6b61b072a8	[GFS2] Move some fields around to reduce wasted space We can reclaim some space by moving fields in some structures in order to allow them to pack better on 64 bit architectures. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-06 14:49:39 -04:00
Ryan O'Hara	e70409f5f3	[GFS2] Fix for selinux support This should fix the mount problems with gfs2 and selinux. Signed-off-by: Ryan O'Hara <rohara@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-25 17:36:15 -04:00
Steven Whitehouse	382066da25	[GFS2] Casts for printing 64bit numbers Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-24 10:22:09 -04:00
David Teigland	9229f01349	[GFS2] Cast 64 bit printk args to unsigned long long. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-24 09:21:30 -04:00
Steven Whitehouse	90cdd2083a	[GFS2] Flag up issue in selinux code Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-22 10:36:25 -04:00
Ryan O'Hara	639b6d79b8	[GFS2] selinux support This adds support to GFS2 for selinux extended attributes. There is a known bug in gfs2_ea_get() which is believed to be independent of this patch. Further patches will follow once that bug is fixed in order to make GFS2 use as much of the generic eattr infrastructure as possible. Signed-off-by: Ryan O'Hara <rohara@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-22 10:08:35 -04:00
David Teigland	d2f222e631	[GFS2] setup lock_dlm kobject earlier Setup the lock_dlm kobject before setting up the dlm lockspace instead of after. We want to use the sysfs files to detect the mount without having to wait for the dlm setup which can take a while. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-19 08:24:02 -04:00
Steven Whitehouse	320dd101e2	[GFS2] glock debugging and inode cache changes This adds some extra debugging to glock.c and changes inode.c's deallocation code to call the debugging code at a suitable moment. I'm chasing down a particular bug to do with deallocation at the moment and the code can go again once the bug is fixed. Also this includes the first part of some changes to unify the Linux struct inode and GFS2's struct gfs2_inode. This transformation will happen in small parts over the next short period. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 16:25:27 -04:00
Steven Whitehouse	3a8a9a1034	[GFS2] Update copyright date to 2006 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 15:09:15 -04:00
Steven Whitehouse	bd8968010a	[GFS2] Remove semaphore.h from C files We no longer use semaphores, everything has been converted to mutex or rwsem, so we don't need to include this header any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 14:54:58 -04:00
Steven Whitehouse	1b50259bc3	[GFS2] Drop log lock on I/O error & tidy up This patch drops the log spinlock when an I/O error occurs to avoid any possible problems in case of blocking or recursion in the I/O error routine. It also has a few cosmetic changes to tidy up various other files. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 14:10:52 -04:00
Steven Whitehouse	02f211f4d0	[GFS2] Remove bits.c from the Makefile Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 14:03:43 -04:00
Steven Whitehouse	3efd7534a8	[GFS2] Make newly moved functions static The functions moved from bits.c can now be made static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 14:02:52 -04:00
Steven Whitehouse	88c8ab1fcb	[GFS2] Merge bits.[ch] into rgrp.c Since they are small and will be inlined by the compiler, it makes sense to merge the contents of bits.[ch] into rgrp.c Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-18 13:52:39 -04:00
Steven Whitehouse	64c14ea73b	[GFS2] Fix ref count bug that used to bite us on umount The ref count of certain glocks got elevated too far during unlink which caused umount to fail. This fixes it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-16 13:37:11 -04:00
Steven Whitehouse	b9cb981310	[GFS2] Fix attributes setting logic The attributes logic for immutable was wrong so that there was no way to remove this attribute once set. This fixes the bug. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-12 17:07:56 -04:00
Steven Whitehouse	9801f6461e	[GFS2] Remove incorrect initialisation of gh_owner The gh_owner field shouldn't be set or reset outside the glock code. These were left over from when recursive locking was allowed. It isn't any more, so they are not needed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-12 14:06:02 -04:00
Steven Whitehouse	e90c01e148	[GFS2] Reverse block order in build_height The original code ordered the blocks allocated in the build_height routine backwards causing excessive disk seeks during a read of the metadata. This patch reverses the order to try and reduce disk seeks. Example: A five level metadata tree, I = Inode, P = Pointers, D = Data You need to read the blocks in the order: I P5 P4 P3 P2 P1 D in order to read a single data block. The new code now orders the blocks in this way. The old code used to order them as: I P1 P2 P3 P4 P5 D requiring two extra seeks on average. Note that for files which are grown by gradual extension rather than by truncate or by llseek/write at a large offset, this doesn't apply. In the case of writing to a file linearly, this routine will only be called upon to extend the height of the tree by one block at a time, so the ordering is determined by when it's called rather than by the internals of the routine itself. Optimising that part of the ordering is a much harder problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-12 12:09:15 -04:00
Steven Whitehouse	fd88de569b	[GFS2] Readpages support This adds readpages support (and also corrects a small bug in the readpage error path at the same time). Hopefully this will improve performance by allowing GFS to submit larger lumps of I/O at a time. In order to simplify the setting of BH_Boundary, it currently gets set when we hit the end of an indirect pointer block. There is always a boundary at this point with the current allocation code. It doesn't get all the boundaries right though, so there is still room for improvement in this. See comments in fs/gfs2/ops_address.c for further information about readpages with GFS2. Signed-off-by: Steven Whitehouse	2006-05-05 16:59:11 -04:00
Robert S Peterson	5bb76af1e0	[GFS2] Set d_ops for root inode Well, I managed to track down the bug in gfs2 that was causing my grief. Below is a patch for the problem. Please incorporate as you see fit. Or should I say: as you see git. The problem was basically that you never set d_ops for the root inode, so the wrong hash algorithm was being used. But only for the root directory. Turns out that if I used subdirectories, it used the proper hash and my files were found just fine. Signed-off-by: Robert S Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-05 16:29:50 -04:00
Steven Whitehouse	d2d7b8a2a7	[GFS2] Fix bug in writepage() As pointed out by Wendy Cheng, the logic in GFS2's writepage() function wasn't quite right with respect to invalidating pages when a file has been truncated. This patch fixes that. CC: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-02 12:09:42 -04:00
Steven Whitehouse	56409abbf8	[GFS2] Remove some unused code Remove some of the unused code flagged up by Adrian Bunk. Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse	2006-04-28 11:48:45 -04:00
Adrian Bunk	08bc2dbc73	[GFS2] [-mm patch] fs/gfs2/: possible cleanups This patch contains the following possible cleanups: - make needlessly global code static - #if 0 unused functions - remove the following global function that was both unused and unimplemented: - super.c: gfs2_do_upgrade() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-28 10:59:12 -04:00
Steven Whitehouse	363275216c	[GFS2] Reordering in deallocation to avoid recursive locking Despite my earlier careful search, there was a recursive lock left in the deallocation code. This removes it. It also should speed up deallocation by reducing the number of locking operations which take place by using two "try lock" operations on the two locks involved in inode deallocation which allows us to grab the locks out of order (compared with NFS which grabs the inode lock first and the iopen lock later). It is ok for us to fail while doing this since if it does fail it means that someone else is still using the inode and thus it wouldn't be possible to deallocate anyway. This fixes the bug reported to me by Rob Kenna. Cc: Rob Kenna <rkenna@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-28 10:46:21 -04:00


927 Commits