OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Wendy Cheng	de986e859a	[GFS2] Data corruption fix * GFS2 has been using i_cache array to store its indirect meta blocks. Its flush routine doesn't correctly clean up all the entries. The problem would show while multiple nodes do simultaneous writes to the same file. Upon glock exclusive lock transfer, if the file is a sparse file with large file size where the indirect meta blocks span multiple array entries with "zero" entries in between. The flush routine prematurely stops the flushing that leaves old (stale) entries around. This leads to several nasty issues, including data corruption. * Fix gfs2_get_block_noalloc checking to correctly return EIO upon unmapped buffer. Signed-off-by: Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:26 +01:00
Steven Whitehouse	16615be18c	[GFS2] Clean up journaled data writing This patch cleans up the code for writing journaled data into the log. It also removes the need to allocate a small "tag" structure for each block written into the log. Instead we just keep count of the outstanding I/O so that we can be sure that its all been written at the correct time. Another result of this patch is that a number of ll_rw_block() calls have become submit_bh() calls, closing some races at the same time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:24 +01:00
Steven Whitehouse	d7b616e252	[GFS2] Clean up ordered write code The following patch removes the ordered write processing from databuf_lo_before_commit() and moves it to log.c. This has the effect of greatly simplyfying databuf_lo_before_commit() and well as potentially making the ordered write code more efficient. As a side effect of this, its now possible to remove ordered buffers from the ordered buffer list at any time, so we now make use of this in invalidatepage and releasepage to ensure timely release of these buffers. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-10-10 08:56:03 +01:00
Steven Whitehouse	bb3b0e3df5	[GFS2] Clean up invalidatepage/releasepage This patch fixes some bugs relating to journaled data files by cleaning up the gfs2_invalidatepage() and gfs2_releasepage() functions. We now never block during gfs2_releasepage(), instead we always either release or refuse to release depending on the status of the buffers. This fixes Red Hat bugzillas #248969 and #252392. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com>	2007-10-10 08:55:29 +01:00
Steven Whitehouse	a867bb28c1	[GFS2] Fix incorrect error path in prepare_write() The error path in prepare_write() was incorrect in the (very rare) event that the transaction fails to start. The following prevents a NULL pointer dereference, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:33:44 +01:00
Nick Piggin	54cb8821de	mm: merge populate and nopage into fault (fixes nonlinear) Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Steven Whitehouse	2840501ac8	[GFS2] Use zero_user_page() in stuffed_readpage() As suggested by Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Robert P. J. Day <rpjday@mindspring.com>	2007-07-09 08:23:45 +01:00
Robert Peterson	8fb68595d5	[GFS2] Journaled file write/unstuff bug This patch is for bugzilla bug 283162, which uncovered a number of bugs pertaining to writing to files that have the journaled bit on. These bugs happen most often when writing to the meta_fs because the files are always journaled. So operations like gfs2_grow were particularly vulnerable, although many of the problems could be recreated with normal files after setting the journaled bit on. The problems fixed are: -GFS2 wasn't ever writing unstuffed journaled data blocks to their in-place location on disk. Now it does. -If you unmounted too quickly after doing IO to a journaled file, GFS2 was crashing because you would discard a buffer whose bufdata was still on the active items list. GFS2 now deals with this gracefully. -GFS2 was losing track of the bufdata for journaled data blocks, and it wasn't getting freed, causing an error when you tried to unmount the module. GFS2 now frees all the bufdata structures. -There was a memory corruption occurring because GFS2 wrote twice as many log entries for journaled buffers. -It was occasionally trying to write journal headers in buffers that weren't currently mapped. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:40 +01:00
Benjamin Marzinski	ddf4b426aa	[GFS2] fix jdata issues This is a patch for the first three issues of RHBZ #238162 The first issue is that when you allocate a new page for a file, it will not start off uptodate. This makes sense, since you haven't written anything to that part of the file yet. Unfortunately, gfs2_pin() checks to make sure that the buffers are uptodate. The solution to this is to mark the buffers uptodate in gfs2_commit_write(), after they have been zeroed out and have the data written into them. I'm pretty confident with this fix, although it's not completely obvious that there is no problem with marking the buffers uptodate here. The second issue is simply that you can try to pin a data buffer that is already on the incore log, and thus, already pinned. This patch checks to see if this buffer is already on the log, and exits databuf_lo_add() if it is, just like buf_lo_add() does. The third issue is that gfs2_log_flush() doesn't do it's block accounting correctly. Both metadata and journaled data are logged, but gfs2_log_flush() only compares the number of metadata blocks with the number of blocks to commit to the ondisk journal. This patch also counts the journaled data blocks. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:23:08 +01:00
Steven Whitehouse	dbb7cae2a3	[GFS2] Clean up inode number handling This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:24 +01:00
Robert Peterson	cd81a4bac6	[GFS2] Addendum patch 2 for gfs2_grow This addendum patch 2 corrects three things: 1. It fixes a stupid mistake in the previous addendum that broke gfs2. Ref: https://www.redhat.com/archives/cluster-devel/2007-May/msg00162.html 2. It fixes a problem that Dave Teigland pointed out regarding the external declarations in ops_address.h being in the wrong place. 3. It recasts a couple more %llu printks to (unsigned long long) as requested by Steve Whitehouse. I would have loved to put this all in one revised patch, but there was a rush to get some patches for RHEL5. Therefore, the previous patches were applied to the git tree "as is" and therefore, I'm posting another addendum. Sorry. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:19 +01:00
Robert Peterson	6c53267f05	[GFS2] Kernel changes to support new gfs2_grow command (part 2) To avoid code redundancy, I separated out the operational "guts" into a new function called read_rindex_entry. Then I made two functions: the closer-to-original gfs2_ri_update (without the special condition checks) and gfs2_ri_update_special that's designed with that condition in mind. (I don't like the name, but if you have a suggestion, I'm all ears). Oh, and there's an added benefit: we don't need all the ugly gotos anymore. ;) This patch has been tested with gfs2_fsck_hellfire (which runs for three and a half hours, btw). Signed-off-By: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:14 +01:00
Robert Peterson	7ae8fa8451	[GFS2] kernel changes to support new gfs2_grow command This is another revision of my gfs2 kernel patch that allows gfs2_grow to function properly. Steve Whitehouse expressed some concerns about the previous patch and I restructured it based on his comments. The previous patch was doing the statfs_change at file close time, under its own transaction. The current patch does the statfs_change inside the gfs2_commit_write function, which keeps it under the umbrella of the inode transaction. I can't call ri_update to re-read the rindex file during the transaction because the transaction may have outstanding unwritten buffers attached to the rgrps that would be otherwise blown away. So instead, I created a new function, gfs2_ri_total, that will re-read the rindex file just to total the file system space for the sake of the statfs_change. The ri_update will happen later, when gfs2 realizes the version number has changed, as it happened before my patch. Since the statfs_change is happening at write_commit time and there may be multiple writes to the rindex file for one grow operation. So one consequence of this restructuring is that instead of getting one kernel message to indicate the change, you may see several. For example, before when you did a gfs2_grow, you'd get a single message like: GFS2: File system extended by 247876 blocks (968MB) Now you get something like: GFS2: File system extended by 207896 blocks (812MB) GFS2: File system extended by 39980 blocks (156MB) This version has also been successfully run against the hours-long "gfs2_fsck_hellfire" test that does several gfs2_grow and gfs2_fsck while interjecting file system damage. It does this repeatedly under a variety Resource Group conditions. Signed-off-By: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:22:12 +01:00
Steven Whitehouse	bf126aee6d	[GFS2] Patch to fix mmap of stuffed files If a stuffed file is mmaped and a page fault is generated at some offset above the initial page, we need to create a zero page to hang the buffer heads off before we can unstuff the file. This is a fix for bz #236087 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:11:46 +01:00
Josef Whiter	1de9139092	[GFS2] Fix bz 231380, unlock page before dequeing glocks in gfs2_commit_write If we are writing a file, and in the middle of writing the file another node attempts to get a shared lock on that file (by doing a du for example) the process doing the writing will hang waiting on lock_page. The reason for this is because when we have waiters on a exclusive glock, we will go through and flush out all dirty pages associated with that inode and release the lock. The problem is that when we flush the dirty pages, we could hit a page that we have locked durring the generic_file_buffered_write part of this operation. This patch unlocks the page before we go to dequeue the lock and locks it immediatly afterwards, since generic_file_buffered_write needs the page locked when the commit_write is completed. This patch resolves the problem, however if somebody sees a better way to do this please don't hesistate to yell. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-05-01 09:10:37 +01:00
Josef Whiter	a13cbe3753	[GFS2] fix hangup when multiple processes are trying to write to the same file This fixes a problem I encountered while running bonnie++. When you have one thread that opens a file and starts to write to it, and then another thread that tries to open and write to the same file, the second thread will loop forever trying to grab the inode lock for that inode. Basically we come in through generic_buffered_file_write, which calls gfs2_prepare_write, which then attempts to grab the glock. Because we don't own the lock, gfs2_prepare_write gets GLR_TRYFAILED, which returns AOP_TRUNCATED_PAGE to generic_buffered_file_write. At this point generic_buffered_file_write loops around again and immediately retries the prepare_write. This means that the second process never gets off of the processor in order to allow the process that holds the lock to finish its work and let go of the lock. This patch makes gfs2_glock_nq schedule() if it gets back a GLR_TRYFAILED, which resolves this problem. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-03-07 13:58:02 -05:00
Adrian Bunk	a2cf822274	[GFS2] make gfs2_writepages() static On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc6-mm2: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly global gfs2_writepages() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:48:48 -05:00
Steven Whitehouse	2d72e7101c	[GFS2] Unlock page on prepare_write try lock failure When the try lock of the glock failed in prepare_write we were incorrectly exiting this function with the page still locked. This was resulting in further I/O to this page hanging. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:25:59 -05:00
Steven Whitehouse	a8d638e30e	[GFS2] Add writepages for "data=writeback" mounts It occurred to me that although a gfs2 specific writepages for ordered writes and journaled data would be tricky, by hooking writepages only for "data=writeback" mounts we could take advantage of not needing buffer heads (we don't use them on the read side, nor have we for some time) and create much larger I/Os for the block layer. Using blktrace both before and after, its possible to see that for large I/Os, most of the requests generated through writepages are now 1024 sectors after this patch is applied as opposed to 8 sectors before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:01 -05:00
Steven Whitehouse	e1d5b18ae9	[GFS2] Fail over to readpage for stuffed files This is partially derrived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>	2007-02-05 13:36:12 -05:00
Steven Whitehouse	c7b3383437	[GFS2] Fix DIO deadlock This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-02-05 13:36:09 -05:00
Steven Whitehouse	ae619320b2	[GFS2] mark_inode_dirty after write to stuffed file Writes to stuffed files were not being marked dirty correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:36:36 -05:00
Steven Whitehouse	dcd2479959	[GFS2] Remove unused function from inode.c The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:57 -05:00
Russell Cattelan	61057c6bb3	[GFS2] Remove unused zero_readpage from stuffed_readpage Stuffed files only consist of a maximum of (gfs2 block size - sizeof(struct gfs2_dinode)) bytes. Since the gfs2 block size is always less than page size, we will never see a call to stuffed_readpage for anything other than the first page in the file. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:57 -05:00
Steven Whitehouse	2ca99501fa	[GFS2] Fix page lock/glock deadlock This fixes a race between the glock and the page lock encountered during truncate in gfs2_readpage and gfs2_prepare_write. The gfs2_readpages function doesn't need the same fix since it only uses a try lock anyway, so it will fail back to gfs2_readpage in the case of a potential deadlock. This bug was spotted by Russell Cattelan. Cc: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:43 -05:00
Steven Whitehouse	1a7b1eed58	[GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctime Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:23 -05:00
Steven Whitehouse	2933f9254a	[GFS2] Shrink gfs2_inode (4) - di_uid/di_gid Remove duplicate di_uid/di_gid fields in favour of using inode->i_uid/inode->i_gid instead. This saves 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:17 -05:00
Steven Whitehouse	b60623c238	[GFS2] Shrink gfs2_inode (3) - di_mode This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:34:14 -05:00
OGAWA Hirofumi	7011774db8	[PATCH] gfs2: ->readpages() fixes This just ignore the remaining pages, and remove unneeded unlock_pages(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Steven French <sfrench@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-03 12:27:57 -08:00
Steven Whitehouse	23591256d6	[GFS2] Fix bmap to map extents properly This fix means that bmap will map extents of the length requested by the VFS rather than guessing at it, or just mapping one block at a time. The other callers of gfs2_block_map are audited to ensure they send the correct max extent lengths (i.e. set bh->b_size correctly). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:13:40 -04:00
Russell Cattelan	c312c4fdc8	[GFS2] Pass the correct value to kunmap_atomic Pass kaddr rather than (incorrect) struct page to kunmap_atomic. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:11:13 -04:00
Steven Whitehouse	f5c54804d9	[GFS2] Fix uninitialised variable This fixes a bug where, in certain cases an uninitialised variable could cause a dereference of a NULL pointer in gfs2_commit_write(). Also a typo in a comment is fixed at the same time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:10:15 -04:00
Russell Cattelan	52ae7b7935	[GFS2] Fix a size calculation error Fix a size calculation error. The size was incorrect being computed as a negative length and then being passed to an unsigned parameter. This in turn would cause the allocator to think it needed enough meta data to store a gigabyte file for every file created. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:09:54 -04:00
Steven Whitehouse	48516ced21	[GFS2] Remove uneeded endian conversion In many places GFS2 was calling the endian conversion routines for an inode even when only a single field, or a few fields might have changed. As a result we were copying lots of data needlessly. This patch replaces those calls with conversion of just the required fields in each case. This should be faster and easier to understand. There are still other places which suffer from this problem, but this is a start in the right direction. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-02 12:39:19 -04:00
Steven Whitehouse	907b9bceb4	[GFS2/DLM] Fix trailing whitespace As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-25 09:26:04 -04:00
Steven Whitehouse	7276b3b0c7	[GFS2] Tidy up meta_io code Fix a bug in the directory reading code, where we might have dereferenced a NULL pointer in case of OOM. Updated the directory code to use the new & improved version of gfs2_meta_ra() which now returns the first block that was being read. Previously it was releasing it requiring following code to grab the block again at each point it was called. Also turned off readahead on directory lookups since we are reading a hash table, and therefore reading the entries in order is very unlikely. Readahead is still used for all other calls to the directory reading function (e.g. when growing the hash table). Removed the DIO_START constant. Everywhere this was used, it was used to unconditionally start i/o aside from a couple of places, so I've removed it and made the couple of exceptions to this rule into separate functions. Also hunted through the other DIO flags and removed them as arguments from functions which were always called with the same combination of arguments. Updated gfs2_meta_indirect_buffer to be a bit more efficient and hopefully also be a bit easier to read. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-21 17:05:23 -04:00
Steven Whitehouse	56965536b8	[GFS2] Remove unused constants Three of the DIO constants were not being used, so remove them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-20 15:48:09 -04:00
Fabio Massimo Di Nitto	7d308590ae	[GFS2] Export lm_interface to kernel headers lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-19 08:45:18 -04:00
Steven Whitehouse	07903c02d0	[GFS2] Tweek unlock test in readpage() This make the unlock test a bit simpler. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:30:05 -04:00
Russell Cattelan	dc41aeedef	[GFS2] Fix for mmap() bug in readpage Fix for Red Hat bz 205307. Don't need to lock in readpage if the higher level code has already grabbed the lock. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:26:48 -04:00
Steven Whitehouse	7a6bbacbb8	[GFS2] Map multiple blocks at once where possible This is a tidy up of the GFS2 bmap code. The main change is that the bh is passed to gfs2_block_map allowing the flags to be set directly rather than having to repeat that code several times in ops_address.c. At the same time, the extent mapping code from gfs2_extent_map has been moved into gfs2_block_map. This allows all calls to gfs2_block_map to map extents in the case that no allocation is taking place. As a result reads and non-allocating writes should be faster. A quick test with postmark appears to support this. There is a limit on the number of blocks mapped in a single bmap call in that it will only ever map blocks which are pointed to from a single pointer block. So in other words, it will never try to do additional i/o in order to satisfy read-ahead. The maximum number of blocks is thus somewhat less than 512 (the GFS2 4k block size minus the header divided by sizeof(u64)). I've further limited the mapping of "normal" blocks to 32 blocks (to avoid extra work) since readpages() will currently read a maximum of 32 blocks ahead (128k). Some further work will probably be needed to set a suitable value for DIO as well, but for now thats left at the maximum 512 (see ops_address.c:gfs2_get_block_direct). There is probably a lot more that can be done to improve bmap for GFS2, but this is a good first step. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-18 17:18:23 -04:00
Steven Whitehouse	0bd5996a00	[GFS2] Style changes in ops_address.c As per the remainder of Jan Engelhardt's fourth email comments, remove an cast thats not required. Also tidy up the "limit" code in stuck_releasepage(). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:59:35 -04:00
Steven Whitehouse	dd538c832a	[GFS2] Spelling sentinal -> sentinel A spelling mistake (one of mine). Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 14:53:30 -04:00
Steven Whitehouse	cd915493fc	[GFS2] Change all types to uX style This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-04 12:49:07 -04:00
Steven Whitehouse	e9fc2aa091	[GFS2] Update copyright, tidy up incore.h As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-01 11:05:15 -04:00
Steven Whitehouse	623d93555c	[GFS2] Fix releasepage bug (fixes direct i/o writes) This patch fixes three main bugs. Firstly the direct i/o get_block was returning the wrong return code in certain cases. Secondly, the GFS2's releasepage function was not dealing with cases when clean, ordered buffers were found still queued on a transaction (which can happen depending on the ordering of journal flushes). Thirdly, the journaling code itself needed altering to take account of the after effects of removing the clean ordered buffers from the transactions before a journal flush. The releasepage bug did also show up under "normal" buffered i/o as well, so its not just a fix for direct i/o. In fact its not normally used in the direct i/o path at all, except when flushing existing buffers after performing a direct i/o write, but that was the code path that led us to spot this. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-31 12:14:44 -04:00
Steven Whitehouse	166afccd71	[GFS2] Tidy up error handling in gfs2_releasepage() This should clarify the logic in gfs2_releasepage() relating to error handling as well as making the response to errors a bit more graceful. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-24 15:59:40 -04:00
Steven Whitehouse	15d00c0b91	[GFS2] Fix leak of gfs2_bufdata This fixes a memory leak of struct gfs2_bufdata and also some problems in the ordered write handling code. It needs a bit more testing, but I believe that the reference counting of ordered write buffers should now be correct. This is aimed at fixing Red Hat bugzilla: #201028 and #201082 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-18 15:51:09 -04:00
Steven Whitehouse	f4387149ec	[GFS2] Fix lack of buffers in writepage bug In some cases we can enter write page without there being buffers attached to the page. In this case the function to add gfs2_bufdata to the buffers fails sliently causing further failures down the stack. This fix ensures that we always add buffers in writepage if they didn't already exist (mmap is one way to trigger this). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-08 13:23:19 -04:00
Steven Whitehouse	59a1cc6bda	[GFS2] Fix lock ordering bug in page fault path Mmapped files were able to trigger a lock ordering bug. Private maps do not need to take the glock so early on. Shared maps do unfortunately, however we can get around that by adding a flag into the flags for the struct gfs2_file. This only works because we are taking an exclusive lock at this point, so we know that nobody else can be racing with us. Fixes Red Hat bugzilla: #201196 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-04 15:41:22 -04:00

1 2

78 Commits