OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
NeilBrown	aa5cbd1038	md/bitmap: protect against bitmap removal while being updated. A write intent bitmap can be removed from an array while the array is active. When this happens, all IO is suspended and flushed before the bitmap is removed. However it is possible that bitmap_daemon_work is still running to clear old bits from the bitmap. If it is, it can dereference the bitmap after it has been freed. So introduce a new mutex to protect bitmap_daemon_work and get it before destroying a bitmap. This is suitable for any current -stable kernel. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org	2009-12-14 12:49:46 +11:00
NeilBrown	ae8fa2831b	md: remove clumsy usage of do_sync_mapping_range from bitmap code and replace with vfs_fsync which is much neater (but wasn't exported, or even in existence at the time the code was written). Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de>	2009-10-16 15:56:01 +11:00
NeilBrown	ee305acef5	md: remove sparse warnings about lock context. There was a real error here on a failure path where we incorrectly call rcu_read_unlock. Signed-off-by: NeilBrown <neilb@suse.de>	2009-09-23 18:06:44 +10:00
Linus Torvalds	c9059598ea	Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits) block: add request clone interface (v2) floppy: fix hibernation ramdisk: remove long-deprecated "ramdisk=" boot-time parameter fs/bio.c: add missing __user annotation block: prevent possible io_context->refcount overflow Add serial number support for virtio_blk, V4a block: Add missing bounce_pfn stacking and fix comments Revert "block: Fix bounce limit setting in DM" cciss: decode unit attention in SCSI error handling code cciss: Remove no longer needed sendcmd reject processing code cciss: change SCSI error handling routines to work with interrupts enabled. cciss: separate error processing and command retrying code in sendcmd_withirq_core() cciss: factor out fix target status processing code from sendcmd functions cciss: simplify interface of sendcmd() and sendcmd_withirq() cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code cciss: Use schedule_timeout_uninterruptible in SCSI error handling code block: needs to set the residual length of a bidi request Revert "block: implement blkdev_readpages" block: Fix bounce limit setting in DM Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt ... Manually fix conflicts with tracing updates in: block/blk-sysfs.c drivers/ide/ide-atapi.c drivers/ide/ide-cd.c drivers/ide/ide-floppy.c drivers/ide/ide-tape.c include/trace/events/block.h kernel/trace/blktrace.c	2009-06-11 11:10:35 -07:00
NeilBrown	be51269103	md: bitmap: improve bitmap maintenance code. The code for checking which bits in the bitmap can be cleared has 2 problems: 1/ it repeatedly takes and drops a spinlock, where it would make more sense to just hold on to it most of the time. 2/ it doesn't make use of some opportunities to skip large sections of the bitmap This patch fixes those. It will only affect CPU consumption, not correctness. Signed-off-by: NeilBrown <neilb@suse.de>	2009-05-26 09:41:17 +10:00
Martin K. Petersen	e1defc4ff0	block: Do away with the notion of hardsect_size. Until now we have had a 1:1 mapping between storage device physical block size and the logical block size used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512 bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
NeilBrown	db305e507d	md: fix some (more) errors with bitmaps on devices larger than 2TB. If a write intent bitmap covers more than 2TB, we sometimes work with values beyond 32bit, so these need to be sector_t. This patch adds the required casts to some unsigned longs that are being shifted up. This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with member devices that are larger than 2TB. Signed-off-by: NeilBrown <neilb@suse.de> Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE> Cc: stable@kernel.org	2009-05-07 12:49:06 +10:00
NeilBrown	b74fd2826c	md: fix loading of out-of-date bitmap. When md is loading a bitmap which it knows is out of date, it fills each page with 1s and writes it back out again. However the write_page call makes use of bitmap->file_pages and bitmap->last_page_size which haven't been set correctly yet. So this can sometimes fail. Move the setting of file_pages and last_page_size to before the call to write_page. This bug can cause the assembly of an array to fail, thus making the data inaccessible. Hence I think it is a suitable candidate for -stable. Cc: stable@kernel.org Reported-by: Vojtech Pavlik <vojtech@suse.cz> Signed-off-by: NeilBrown <neilb@suse.de>	2009-05-07 12:47:19 +10:00
NeilBrown	1f59390339	md: support bitmaps on RAID10 arrays larger than 2 terabytes ... and other arrays with components larger than 2 terabytes. We use a "long" rather than a "sector_t" in part of the bitmap size calculations, which is sad. Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE> Signed-off-by: NeilBrown <neilb@suse.de>	2009-04-20 11:50:24 +10:00
NeilBrown	acb180b0e3	md: improve usefulness and accuracy of sysfs file md/sync_completed. The sync_completed file reports how much of a resync (or recovery or reshape) has been completed. However due to the possibility of out-of-order completion of writes, it is not certain to be accurate. We have an internal value - mddev->curr_resync_completed - which is an accurate value (though it might not always be quite so up to date). So: - make curr_resync_completed be up to date a little more often, particularly when raid5 reshape updates status in the metadata - report curr_resync_completed in the sysfs file - allow poll/select to report all updates to md/sync_completed. This makes sync_completed usable by any external metadata handler that wants to record this status information in its metadata. Signed-off-by: NeilBrown <neilb@suse.de>	2009-04-14 16:28:34 +10:00
Andre Noll	58c0fed400	md: Make mddev->size sector-based. This patch renames the "size" field of struct mddev_s to "dev_sectors" and stores the number of 512-byte sectors instead of the number of 1K-blocks in it. All users of that field, including raid levels 1,4-6,10, are adjusted accordingly. This simplifies the code a bit because it allows to get rid of a couple of divisions/multiplications by two. In order to make checkpatch happy, some minor coding style issues have also been addressed. In particular, size_store() now uses strict_strtoull() instead of simple_strtoull(). Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:33:13 +11:00
NeilBrown	97e4f42d62	md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash. Version 1.x metadata has the ability to record the status of a partially completed drive recovery. However we only update that record on a clean shutdown. It would be nice to update it on unclean shutdowns too, particularly when using a bitmap that removes much of the 'sync' effort after an unclean shutdown. One complication with checkpointing recovery is that we only know where we are up to in terms of IO requests started, not which ones have completed. And we need to know what has completed to record how much is recovered. So occasionally pause the recovery until all submitted requests are completed, then update the record of where we are up to. When we have a bitmap, we already do that pause occasionally to keep the bitmap up-to-date. So enhance that code to record the recovery offset and schedule a superblock update. And when there is no bitmap, just pause 16 times during the resync to do a checkpoint. '16' is a fairly arbitrary number. But we don't really have any good way to judge how often is acceptable, and it seems like a reasonable number for now. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:33:13 +11:00
NeilBrown	43b2e5d86d	md: move md_k.h from include/linux/raid/ to drivers/md/ It really is nicer to keep related code together.. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:33:13 +11:00
NeilBrown	bff61975b3	md: move lots of #include lines out of .h files and into .c This makes the includes more explicit, and is preparation for moving md_k.h to drivers/md/md.h Remove include/raid/md.h as its only remaining use was to #include other files. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:33:13 +11:00
Christoph Hellwig	ef740c372d	md: move headers out of include/linux/raid/ Move the headers with the local structures for the disciplines and bitmap.h into drivers/md/ so that they are more easily grepable for hacking and not far away. md.h is left where it is for now as there are some uses from the outside. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:27:03 +11:00
NeilBrown	355a43e641	md: write bitmap information to devices that are undergoing recovery. When we add some spares to an array and start recovery, and we have a bitmap which is stored 'internally' on all devices, we call bitmap_write_all to make sure the bitmap is correct on the new device(s). However that doesn't work as write_sb_page only writes to 'In_sync' devices, and devices undergoing recovery are not 'In_sync' until recovery finishes. So extend write_sb_page (actually next_active_rdev) to include devices that are under recovery. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:27:02 +11:00
NeilBrown	d0a4bb4927	md: never clear bit from the write-intent bitmap when the array is degraded. It is safe to clear a bit from the write-intent bitmap for a raid1 if we know the data has been written to all devices, which is what the current test does. But it is not always safe to update the 'events_cleared' counter in that case. This is because one request could complete successfully after some other request has partially failed. So simply disable the clearing and updating of events_cleared whenever the array is degraded. This might end up not clearing some bits that could safely be cleared, but it is the safest approach. Note that the bug fixed here did not risk corrupting data by letting the array get out-of-sync. Rather it meant that when a device is removed and re-added to the array, it might incorrectly require a full recovery rather than just recovering based on the bitmap. Signed-off-by: NeilBrown <neilb@suse.de>	2009-03-31 14:27:02 +11:00
NeilBrown	1187cf0a3c	md: Allow write-intent bitmaps to have chunksize < PAGE_SIZE. md currently insists that the chunk size used for write-intent bitmaps (the amount of data that corresponds to one chunk) be at least one page. The reason for this restriction is lost in the mists of time, but a review of the code (and a vague memory) suggests that the only problem would be related to resync. Resync tries very hard to work in multiples of a page, but also needs to sync with units of a bitmap_chunk too. This connection comes out in the bitmap_start_sync call. So change bitmap_start_sync to always work in multiples of a page. If the bitmap chunk size is less than one page, we flag multiple chunks as 'syncing' and generally make them all appear to the resync routines like one chunk. All other code either already works with data ranges that could span multiple chunks, or explicitly only cares about a single chunk. Signed-off-by: Neil Brown <neilb@suse.de>	2009-03-31 14:27:02 +11:00
Cheng Renquan	159ec1fc06	md: use list_for_each_entry macro directly The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to list_for_each_entry_safe, from <linux/list.h>, it should be defined to use list_for_each_entry_safe, instead of reinventing the wheel. But some calls to each_entry_safe don't really need a safe version, just a direct list_for_each_entry is enough, this could save a temp variable (tmp) in every function that used rdev_for_each. In this patch, most rdev_for_each loops are replaced by list_for_each_entry, totally save many tmp vars; and only in the other situations that will call list_del to delete an entry, the safe version is used. Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>	2009-01-09 08:31:08 +11:00
NeilBrown	538452700d	md: fix bitmap-on-external-file bug. commit `a2ed9615e3` fixed a bug with 'internal' bitmaps, but in the process broke 'in a file' bitmaps. So they are broken in 2.6.28. This fixes it, and needs to go in 2.6.28-stable. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org	2009-01-09 08:31:05 +11:00
NeilBrown	a2ed9615e3	md: Don't read past end of bitmap when reading bitmap. When we read the write-intent-bitmap off the device, we currently read a whole number of pages. When PAGE_SIZE is 4K, this works due to the alignment we enforce on the superblock and bitmap. When PAGE_SIZE is 64K, this can read past the end of the device, which causes an error. When we write the superblock, we ensure to clip the last page to just be the required size. Copy that code into the read path to just read the required number of sectors. Signed-off-by: Neil Brown <neilb@suse.de> Cc: stable@kernel.org	2008-12-19 16:25:01 +11:00
NeilBrown	b2d2c4cead	Fix problem with waiting while holding rcu read lock in md/bitmap.c A recent patch to protect the rdev list with rcu locking leaves us with a problem because we can sleep on memalloc while holding the rcu lock. The rcu lock is only needed while walking the linked list as uninteresting devices (failed or spares) can be removed at any time. So only take the rcu lock while actually walking the linked list. Take a refcount on the rdev during the time when we drop the lock and do the memalloc to start IO. When we return to the locked code, all the interesting devices on the list will not have moved, so we can simply use list_for_each_continue_rcu to pick up where we left off. Signed-off-by: NeilBrown <neilb@suse.de>	2008-09-01 12:48:13 +10:00
Jens Axboe	93769f5807	md: the bitmap code needs to use blk_plug_device_unlocked(). It doesn't hold the queue lock, so it's both racy on the queue flags and spews a warning. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-01 20:32:31 +02:00
NeilBrown	4b80991c6c	md: Protect access to mddev->disks list using RCU. All modifications and most access to the mddev->disks list are made under the reconfig_mutex lock. However there are three places where the list is walked without any locking. If a reconfig happens at this time, havoc (and oops) can ensue. So use RCU to protect these accesses: - wrap them in rcu_read_{,un}lock() - use list_for_each_entry_rcu - add to the list with list_add_rcu - delete from the list with list_del_rcu - delay the 'free' with call_rcu rather than schedule_work Note that export_rdev did a list_del_init on this list. In almost all cases the entry was not in the list anymore so it was a no-op and so safe. It is no longer safe as after list_del_rcu we may not touch the list_head. An audit shows that export_rdev is called: - after unbind_rdev_from_array, in which case the delete has already been done, - after bind_rdev_to_array fails, in which case the delete isn't needed. - before the device has been put on a list at all (e.g. in add_new_disk where reading the superblock fails). - and in autorun devices after a failure when the device is on a different list. So remove the list_del_init call from export_rdev, and add it back immediately before the call to export_rdev for that last case. Note also that ->same_set is sometimes used for lists other than mddev->list (e.g. candidates). In these cases rcu is not needed. Signed-off-by: NeilBrown <neilb@suse.de>	2008-07-21 17:05:25 +10:00
Andre Noll	0f420358e3	md: Turn rdev->sb_offset into a sector-based quantity. Rename it to sb_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>	2008-07-11 22:02:23 +10:00
Neil Brown	a0da84f35b	Improve setting of "events_cleared" for write-intent bitmaps. When an array is degraded, bits in the write-intent bitmap are not cleared, so that if the missing device is re-added, it can be synced by only updating those parts of the device that have changed since it was removed. To enable this, an 'events_cleared' value is stored. It is the event counter for the array the last time that any bits were cleared. Sometimes - if a device disappears from an array while it is 'clean' - the events_cleared value gets updated incorrectly (there are subtle ordering issues between updating events in the main metadata and the bitmap metadata) resulting in the missing device appearing to require a full resync when it is re-added. With this patch, we update events_cleared precisely when we are about to clear a bit in the bitmap. We record events_cleared when we clear the bit internally, and copy that to the superblock which is written out before the bit on storage. This makes it more "obviously correct". We also need to update events_cleared when the event_count is going backwards (as happens on a dirty->clean transition of a non-degraded array). Thanks to Mike Snitzer for identifying this problem and testing early "fixes". Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de>	2008-06-28 08:31:22 +10:00
Christoph Hellwig	6bcfd60186	md: kill file_path wrapper Kill the trivial and rather pointless file_path wrapper around d_path. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:09 -07:00
NeilBrown	7be3dfec47	md: reduce CPU wastage on idle md array with a write-intent bitmap. The recent patch titled "Reduce CPU wastage on idle md array with a write-intent bitmap" would sometimes leave the array with dirty bitmap bits that stay dirty. A subsequent write would sort things out so it isn't a big problem, but should be fixed nonetheless. We need to make sure that when the bitmap becomes not "allclean", the daemon_sleep really does get set to a sensible value. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-10 18:01:19 -07:00
NeilBrown	8311c29d40	md: reduce CPU wastage on idle md array with a write-intent bitmap On an md array with a write-intent bitmap, a thread wakes up every few seconds and scans the bitmap looking for work to do. If the array is idle, there will be no work to do, but a lot of scanning is done to discover this. So cache the fact that the bitmap is completely clean, and avoid scanning the whole bitmap when the cache is known to be clean. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:17 -08:00
Jan Blunck	cf28b4863f	d_path: Make d_path() use a struct path d_path() is used on a <dentry,vfsmount> pair. Lets use a struct path to reflect this. [akpm@linux-foundation.org: fix build in mm/memory.c] Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: Bryan Wu <bryan.wu@analog.com> Acked-by: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:17:09 -08:00
NeilBrown	d089c6af10	md: change ITERATE_RDEV to rdev_for_each As this is more in line with common practice in the kernel. Also swap the args around to be more like list_for_each. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:19 -08:00
NeilBrown	b47490c9bc	md: Update md bitmap during resync. Currently an md array with a write-intent bitmap does not update that bitmap to reflect successful partial resync. Rather the entire bitmap is updated when the resync completes. This is because there is no guarantee that resync requests will complete in order, and tracking each request individually is unnecessarily burdensome. However there is value in regularly updating the bitmap, so add code to periodically pause while all pending sync requests complete, then update the bitmap. Doing this only every few seconds (the same as the bitmap update time) does not noticeably affect resync performance. [snitzer@gmail.com: export bitmap_cond_end_sync] Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:18 -08:00
Alan D. Brunelle	2ad8b1ef11	Add UNPLUG traces to all appropriate places Added blk_unplug interface, allowing all invocations of unplugs to result in a generated blktrace UNPLUG. Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-09 13:41:32 +01:00
NeilBrown	85bfb4da8c	md: fix an unsigned compare to allow creation of bitmaps with v1.0 metadata As page->index is unsigned, this all becomes an unsigned comparison, which almost always returns an error. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Stable <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-23 08:32:06 -07:00
NeilBrown	4ad1366376	md: change bitmap_unplug and others to void functions bitmap_unplug only ever returns 0, so it may as well be void. Two callers try to print a message if it returns non-zero, but that message is already printed by bitmap_file_kick. write_page returns an error which is not consistently checked. It always causes BITMAP_WRITE_ERROR to be set on an error, and that can more conveniently be checked. When the return of write_page is checked, an error causes bitmap_file_kick to be called - so move that call into write_page - and protect against recursive calls into bitmap_file_kick. bitmap_update_sb returns an error that is never checked. So make these 'void' and be consistent about checking the bit. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:15 -07:00
NeilBrown	f0d76d70bc	md: check that internal bitmap does not overlap other data. We currently completely trust user-space to set up metadata describing a consistent array. In particular, that the metadata, data, and bitmap do not overlap. But userspace can be buggy, and it is better to report an error than corrupt data. So put in some appropriate checks. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:15 -07:00
NeilBrown	ab6085c795	md: don't write more than is required of the last page of a bitmap It is possible that real data or metadata follows the bitmap without full page alignment. So limit the last write to be only the required number of bytes, rounded up to the hard sector size of the device. Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:14 -07:00
Mark Fasheh	ef51c97623	Remove do_sync_file_range() Remove do_sync_file_range() and convert callers to just use do_sync_mapping_range(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Neil Brown	505fa2c4a2	[PATCH] md: fix calculation for size of filemap_attr array in md/bitmap. If 'num_pages' were ever 1 more than a multiple of 8 (32bit platforms) or of 16 (64 bit platforms), filemap_attr would be allocated one 'unsigned long' shorter than required. We need a round-up in there. Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-12 15:31:42 -07:00
Andrew Morton	fc0ecff698	[PATCH] remove invalidate_inode_pages() Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:31 -08:00
Neil Brown	da6e1a32fb	[PATCH] md: avoid possible BUG_ON in md bitmap handling. md/bitmap tracks how many active write requests are pending on blocks associated with each bit in the bitmap, so that it knows when it can clear the bit (when count hits zero). The counter has 14 bits of space, so if there are ever more than 16383, we cannot cope. Currently the code just calls BUG_ON as "all" drivers have request queue limits much smaller than this. However it seems that some don't. Apparently some multipath configurations can allow more than 16383 concurrent write requests. So, in this unlikely situation, instead of calling BUG_ON we now wait for the count to drop down a bit. This requires a new wait_queue_head, some waiting code, and a wakeup call. Tested by limiting the counter to 20 instead of 16383 (writes go a lot slower in that case...). Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-09 09:25:47 -08:00
NeilBrown	f49d5e62d9	[PATCH] md: avoid reading past the end of a bitmap file In most cases we check the size of the bitmap file before reading data from it. However when reading the superblock, we always read the first PAGE_SIZE bytes, which might not always be appropriate. So limit that read to the size of the file if appropriate. Also, we get the count of available bytes wrong in one place, so that too can read past the end of the file. Cc: "yang yin" <yinyang801120@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:59 -08:00
Josef Sipek	c649bb9c55	[PATCH] struct path: convert md Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:47 -08:00
NeilBrown	4f2e639af4	[PATCH] md: endian annotations for the bitmap superblock And a couple of bug fixes found by sparse. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-21 13:35:05 -07:00
Alexey Dobriyan	5f6e3c8365	[PATCH] md: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-11 11:14:26 -07:00
Paul Clements	a638b2dc95	[PATCH] md: use ffz instead of find_first_set to convert multiplier to shift find_first_set doesn't find the least-significant bit on bigendian machines, so it is really wrong to use it. ffs is closer, but takes an 'int' and we have a 'unsigned long'. So use ffz(~X) to convert a chunksize into a chunkshift. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-03 08:04:18 -07:00
Paul Clements	9b1d1dac18	[PATCH] md: new sysfs interface for setting bits in the write-intent-bitmap Add a new sysfs interface that allows the bitmap of an array to be dirtied. The interface is write-only, and is used as follows: echo "1000" > /sys/block/md2/md/bitmap (dirty the bit for chunk 1000 [offset 0] in the in-memory and on-disk bitmaps of array md2) echo "1000-2000" > /sys/block/md1/md/bitmap (dirty the bits for chunks 1000-2000 in md1's bitmap) This is useful, for example, in cluster environments where you may need to combine two disjoint bitmaps into one (following a server failure, after a secondary server has taken over the array). By combining the bitmaps on the two servers, a full resync can be avoided (This was discussed on the list back on March 18, 2005, "[PATCH 1/2] md bitmap bug fixes" thread). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-03 08:04:17 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
NeilBrown	ce25c31bdd	[PATCH] md: Change md/bitmap file handling to use bmap to file blocks-fix. Fix problems with new bmap based access to bitmap files. 1/ When not using a file based bitmap, attach a NULL list of buffers to each page so the common free_buffer routine can cope. 2/ Use submit_bh to read as well as write, rather than vfs_read. This makes read and write more symmetric. 3/ sync the file before reading, to ensure that the page cache has no dirty pages that might get written out later. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	d785a06a0b	[PATCH] md/bitmap: change md/bitmap file handling to use bmap to file blocks. If md is asked to store a bitmap in a file, it tries to hold onto the page cache pages for that file, manipulate them directly, and call a cocktail of operations to write the file out. I don't believe this is a supportable approach. This patch changes the approach to use the same approach as swap files. i.e. bmap is used to enumerate all the block addresses of parts of the file and we write directly to those blocks of the device. swapfile only uses parts of the file that provide full pages at contiguous addresses. We don't have that luxury so we have to cope with pages that are non-contiguous in storage. To handle this we attach buffers to each page, and store the addresses in those buffers. With this approach the pagecache may contain data which is inconsistent with what is on disk. To alleviate the problems this can cause, md invalidates the pagecache when releasing the file. If the file is to be examined while the array is active (a non-critical but occasionally useful function), O_DIRECT io must be used. A new version of mdadm will have support for this. This approach simplifies a lot of code: - we no longer need to keep a list of pages which we need to wait for, as the b_endio function can keep track of how many outstanding writes there are. This saves a mempool. - -EAGAIN returns from write_page are no longer possible (not sure if they ever were actually). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	acc55e2201	[PATCH] md/bitmap: tidy up i_writecount handling in md/bitmap md/bitmap modifies i_writecount of a bitmap file to make sure that no-one else writes to it. The reverting of the change is sometimes done twice, and there is one error path where it is omitted. This patch tidies that up. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	0cdd02cabd	[PATCH] md/bitmap: remove dead code from md/bitmap bitmap_active is never called, and the BITMAP_ACTIVE flag is never used or tested, so discard them both. Also remove some out-of-date 'todo' comments. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	a647e4bc5c	[PATCH] md/bitmap: remove unnecessary page reference manipulations from md/bitmap code md/bitmap gets a collection of pages representing the bitmap when it initialises the bitmap, and puts all the references when discarding the bitmap. It also occasionally takes extra references without any good reason, and sometimes drops them ... though it doesn't always drop them, which can result in a memory leak. This patch removes the unnecessary 'get_page' calls, and the corresponding 'put_page' calls. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	e16b68b6e4	[PATCH] md/bitmap: use set_bit etc for bitmap page attributes In particular, this means that we use 4 bits per page instead of a whole unsigned long. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	ec7a3197f4	[PATCH] md/bitmap: cleaner separation of page attribute handlers in md/bitmap md/bitmap has some attributes per-page. Handling of these attributes is largely abstracted in set_page_attr and clear_page_attr. However get_page_attr exposes the format used to store them. So prior to changing that format, introduce test_page_attr instead of get_page_attr, and make appropriate usage changes. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
NeilBrown	0b79ccf0cd	[PATCH] md/bitmap: remove bitmap writeback daemon md/bitmap currently has a separate thread to wait for writes to the bitmap file to complete (as we cannot get a callback on that action). However this isn't needed as bitmap_unplug is called from process context and waits for the writeback thread to do its work. The same result can be achieved by doing the waiting directly in bitmap_unplug. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:38 -07:00
Matthew Dobson	0eaae62aba	[PATCH] mempool: use common mempool kmalloc allocator This patch changes several mempool users, all of which are basically just wrappers around kmalloc(), to use the common mempool_kmalloc/kfree, rather than their own wrapper function, removing a bunch of duplicated code. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:56:59 -08:00
Linus Torvalds	1e8c573933	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits) BUG_ON() Conversion in drivers/video/ BUG_ON() Conversion in drivers/parisc/ BUG_ON() Conversion in drivers/block/ BUG_ON() Conversion in sound/sparc/cs4231.c BUG_ON() Conversion in drivers/s390/block/dasd.c BUG_ON() Conversion in lib/swiotlb.c BUG_ON() Conversion in kernel/cpu.c BUG_ON() Conversion in ipc/msg.c BUG_ON() Conversion in block/elevator.c BUG_ON() Conversion in fs/coda/ BUG_ON() Conversion in fs/binfmt_elf_fdpic.c BUG_ON() Conversion in input/serio/hil_mlc.c BUG_ON() Conversion in md/dm-hw-handler.c BUG_ON() Conversion in md/bitmap.c The comment describing how MS_ASYNC works in msync.c is confusing rcu: undeclared variable used in documentation fix typos "wich" -> "which" typo patch for fs/ufs/super.c Fix simple typos tabify drivers/char/Makefile ...	2006-03-25 08:41:09 -08:00
Adrian Bunk	7e31765550	[PATCH] md/bitmap.c:bitmap_mask_state(): fix inconsequent NULL checking We dereference bitmap both one line above and one line below this check rendering this check quite useless. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-25 08:22:57 -08:00
Eric Sesterhenn	5daf2cf19a	BUG_ON() Conversion in md/bitmap.c this changes if() BUG(); constructs to BUG_ON() which is cleaner and can be better optimized away Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-03-24 18:35:26 +01:00
Arjan van de Ven	858119e159	[PATCH] Unlinline a bunch of other functions Remove the "inline" keyword from a bunch of big functions in the kernel with the goal of shrinking it by 30kb to 40kb Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-14 18:27:06 -08:00
NeilBrown	c708443c00	[PATCH] md: make sure bitmap updates are visible through filesystem When we update a page_cache page in the kernel, we need to flush_dcache_page or userspace might not see the change. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
NeilBrown	1345b1d8ad	[PATCH] md: define and use safe_put_page for md md sometimes calls put_page on NULL pointers (treating it like kfree). This is not safe, so define and use a 'safe_put_page' which checks for NULL. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	7dd5d34c6c	[PATCH] md: remove inappropriate limits in md/bitmap configuration. The kernel should not be imposing these policy limits: The time between bitmap updates should certainly be allowed to be more than 15 seconds, and if someone wants a bitmap chunk size in excess of 4MB, the kernel isn't the place to stop them. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	ea03aff93b	[PATCH] md: convert various kmap calls to kmap_atomic Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	9ffae0cf3e	[PATCH] md: convert md to use kzalloc throughout Replace multiple kmalloc/memset pairs with kzalloc calls. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	2d1f3b5d1b	[PATCH] md: clean up 'page' related names in md Substitute: page_cache_get -> get_page page_cache_release -> put_page PAGE_CACHE_SHIFT -> PAGE_SHIFT PAGE_CACHE_SIZE -> PAGE_SIZE PAGE_CACHE_MASK -> PAGE_MASK __free_page -> put_page because we aren't using the page cache, we are just using pages. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	b15c2e57f0	[PATCH] md: move bitmap_create to after md array has been initialised This is important because bitmap_create uses mddev->resync_max_sectors and that doesn't have a valid value until after the array has been initialised (with pers->run()). [It doesn't make a difference for current personalities that support bitmaps, but will make a difference for raid10] This has the added advantage of meaning we can move the thread->timeout manipulation inside the bitmap.c code instead of sprinkling identical code throughout all personalities. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
Neil Brown	34ef75f09f	[PATCH] md: don't pass a NULL file* into ->prepare_write() Some filesystems go oops. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-18 07:49:46 -08:00
NeilBrown	a9701a3047	[PATCH] md: support BIO_RW_BARRIER for md/raid1 We can only accept BARRIER requests if all slaves handle barriers, and that can, of course, change with time.... So we keep track of whether the whole array seems safe for barriers, and also whether each individual rdev handles barriers. We initially assume barriers are OK. When writing the superblock we try a barrier, and if that fails, we flag things for no-barriers. This will usually clear the flags fairly quickly. If writing the superblock finds that BIO_RW_BARRIER is -ENOTSUPP, we need to resubmit, so introduce function "md_super_wait" which waits for requests to finish, and retries ENOTSUPP requests without the barrier flag. When writing the real raid1, write requests which were BIO_RW_BARRIER but which aren't supported need to be retried. So raid1d is enhanced to do this, and when any bio write completes (i.e. no retry needed) we remove it from the r1bio, so that devices needing retry are easy to find. We should hardly ever get -ENOTSUPP errors when writing data to the raid. It should only happen if: 1/ the device used to support BARRIER, but now doesn't. Few devices change like this, though raid1 can! or 2/ the array has no persistent superblock, so there was no opportunity to pre-test for barriers when writing the superblock. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	bd926c63b7	[PATCH] md: make md on-disk bitmaps not host-endian Current bitmaps use set_bit et.al and so are host-endian, which means not-portable. Oops. Define a new version number (4) for which bitmaps are little-endian. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	b2d444d7ad	[PATCH] md: convert 'faulty' and 'in_sync' fields to bits in 'flags' field This has the advantage of removing the confusion caused by 'rdev_t' and 'mddev_t' both having 'in_sync' fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
Olaf Hering	733482e445	[PATCH] changing CONFIG_LOCALVERSION rebuilds too much, for no good reason This patch removes almost all inclusions of linux/version.h. The 3 #defines are unused in most of the touched files. A few drivers use the simple KERNEL_VERSION(a,b,c) macro, which is unfortunately in linux/version.h. There are also lots of #ifdef for long obsolete kernels, this was not touched. In a few places, the linux/version.h include was moved to where the LINUX_VERSION_CODE was used. quilt vi `find * -type f -name "*.[ch]"\|xargs grep -El '(UTS_RELEASE\|LINUX_VERSION_CODE\|KERNEL_VERSION\|linux/version.h)'\|grep -Ev '(/(boot\|coda\|drm)/\|~$)'` search pattern: /UTS_RELEASE\\|LINUX_VERSION_CODE\\|KERNEL_VERSION\\|linux\/$utsname\\|version$.h Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:55:57 -08:00
Al Viro	b4e3ca1ab1	[PATCH] gfp_t: remaining bits of drivers/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-28 08:16:51 -07:00
NeilBrown	500af87abb	[PATCH] md: tidy up daemon stop/start code in md/bitmap.c The bitmap code used to have two daemons, so there is some 'common' start/stop code. But now there is only one, so the common code is just noise. This patch tidies this up somewhat. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	9ba00538ad	[PATCH] md: ensure bitmap_writeback_daemon handles shutdown properly. mddev->bitmap gets cleared before the writeback daemon is stopped. So the write_back daemon needs to be careful not to dereference the 'bitmap' if it is NULL. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	ab904d6346	[PATCH] md: fix bitmap/read_sb_page so that it handles errors properly. read_sb_page() assumed that if sync_page_io fails, the device would be marked faulty. However it isn't. So in the face of error, read_sb_page would loop forever. Redo the logic so that this cannot happen. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	3178b0dbdf	[PATCH] md: do not set mddev->bitmap until bitmap is fully initialised When hot-adding a bitmap, bitmap_daemon_work could get called while the bitmap is being created, so don't set mddev->bitmap until the bitmap is ready. This requires freeing the bitmap inside bitmap_create if creation failed part-way through. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	585f0dd5a9	[PATCH] md: make sure bitmap_daemon_work actually does work. The 'lastrun' time wasn't being initialised, so it could be half a jiffie-cycle before it seemed to be time to do work again. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	4b6d287f62	[PATCH] md: add write-behind support for md/raid1 If a device is flagged 'WriteMostly' and the array has a bitmap, and the bitmap superblock indicates that write_behind is allowed, then write_behind is enabled for WriteMostly devices. Write requests will be acknowledged as complete to the caller (via b_end_io) when all non-WriteMostly devices have completed the write, but will not be cleared from the bitmap until all devices complete. This requires memory allocation to make a local copy of the data being written. If there is insufficient memory, then we fall-back on normal write semantics. Signed-Off-By: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	6a07997fc3	[PATCH] md: improve handling of bitmap initialisation. When we find a 'stale' bitmap, possibly because it is new, we should not just assume every bit needs to be set, but rather base the setting of bits on the current state of the array (degraded and recovery_cp). Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:09 -07:00
NeilBrown	6b8b3e8a8b	[PATCH] md: make sure md bitmap updates are flushed when array is stopped. The recent change to never ignore the bitmap revealed that the bitmap isn't being flushed properly when an array is stopped. We call bitmap_daemon_work three times as there is a three-stage pipeline for flushing updates to the bitmap file. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-04 13:00:54 -07:00
NeilBrown	193f1c9315	[PATCH] md: always honour md bitmap being read from disk The code currently will ignore the bitmap if the array seems to be in-sync. This is wrong if the array is degraded, and probably wrong anyway. If the bitmap says some chunks are not in-sync, and the superblock says everything IS in sync, then something is clearly wrong, and it is safer to trust the bitmap. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-08-04 13:00:54 -07:00
Olaf Hering	44456d37b5	[PATCH] turn many #if $undefined_string into #ifdef $undefined_string turn many #if $undefined_string into #ifdef $undefined_string to fix some warnings after -Wno-def was added to global CFLAGS Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:08 -07:00
NeilBrown	6a806c510d	[PATCH] md/raid1: clear bitmap when fullsync completes We need to be careful differentiating between a resync of a complete array, in which we can clear the bitmap, and a resync of a degraded array, in which we cannot. This patch cleans all that up. Cc: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-15 09:54:51 -07:00
NeilBrown	8a5e9cf1d6	[PATCH] md: make sure md/bitmap doesn't try to write a page with active writeback Due to the use of write-behind, it is possible for md to write a page to the bitmap file that is still completing writeback. This is not allowed. With this patch, we detect those cases and either force a sync write, or back off and try later, as appropriate. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:47 -07:00
NeilBrown	a654b9d8f8	[PATCH] md: allow md intent bitmap to be stored near the superblock. This provides an alternate to storing the bitmap in a separate file. The bitmap can be stored at a given offset from the superblock. Obviously the creator of the array must make sure this doesn't intersect with data.... After is good for version-0.90 superblocks. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:47 -07:00
NeilBrown	aa3163f816	[PATCH] md: don't skip bitmap pages due to lack of bit that we just cleared. When looking for pages that need cleaning we skip pages that don't have BITMAP_PAGE_CLEAN set. But if it is the 'current' page we will have cleared that bit ourselves, so skipping it is wrong. So: move the 'skip this page' inside 'if page != lastpage'. Also fold call of file_page_offset into the one place where the value (bit) is used. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:45 -07:00
NeilBrown	77ad4bc706	[PATCH] md: enable the bitmap write-back daemon and wait for it. Currently we don't wait for updates to the bitmap to be flushed to disk properly. The infrastructure is all there, but it isn't being used.... A separate kernel thread (bitmap_writeback_daemon) is needed to wait for each page as we cannot get callbacks when a page write completes. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:45 -07:00
NeilBrown	bfb39fba4e	[PATCH] md: check return value of write_page, rather than ignore it Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:45 -07:00
NeilBrown	a2cff26ad1	[PATCH] md: improve debug-printing of bitmap superblock. - report sync_size properly - need /2 to convert sectors to KB - move everything over 2 spaces to allow proper spelling of "events cleared". Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:45 -07:00
akpm@osdl.org	fc7ca163a4	[PATCH] md printk fix A u64 is not an unsigned long long. On power4 it is `long', and printk warns. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:45 -07:00
NeilBrown	cdbb4cc2e5	[PATCH] md: make sure md bitmap is cleared on a clean start. As the array-wide clean bit (in the superblock) is set more aggressively than the bits in the bitmap are cleared, it is possible to have an array which is clean despite there being bits set in the bitmap. These bits will currently never get cleared, as they can only be cleared by a resync pass, which never happens. Now, when reading bits from disk, be aware of whether the whole array is known to be in sync, and act accordingly. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:44 -07:00
NeilBrown	bc7f77de2c	[PATCH] md: minor code rearrangement in bitmap_init_from_disk Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:44 -07:00
NeilBrown	d80a138c01	[PATCH] md: print correct pid for newly created bitmap-writeback-daemon. The debugging message printed the wrong pid, which didn't help remove bugs.... Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:44 -07:00
NeilBrown	78d742d876	[PATCH] md: a couple of tidyups relating to the bitmap file. 1/ When init from disk, it is a BUG if there is nowhere to init from, 2/ use seq_path to print path in /proc/mdstat Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:43 -07:00
NeilBrown	32a7627cf3	[PATCH] md: optimised resync using Bitmap based intent logging With this patch, the intent to write to some block in the array can be logged to a bitmap file. Each bit represents some number of sectors and is set before any update happens, and only cleared when all writes relating to all sectors are complete. After an unclean shutdown, information in this bitmap can be used to optimise resync - only sectors which could be out-of-sync need to be updated. Also if a drive is removed and then added back into an array, the recovery can make use of the bitmap to optimise reconstruction. This is not implemented in this patch. Currently the bitmap is stored in a file which must (obviously) be stored on a separate device. The patch only provided infrastructure. It does not update any personalities to bitmap intent logging. Md arrays can still be used with no bitmap file. This patch has minimal impact on such arrays. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 19:07:43 -07:00
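
The intent-logging scheme described in 32a7627cf3 (set a bit before any update to a region, clear it only once every write to that region has completed, and resync only regions whose bit is still set after an unclean shutdown) can be illustrated with a small userspace sketch. This is not the code in drivers/md/bitmap.c: the names `intent_bitmap`, `ib_start_write`, `ib_end_write`, `ib_needs_resync` and the sizes `CHUNK_SECTORS` and `NCHUNKS` are hypothetical, the bitmap lives only in memory, and bits are cleared immediately rather than lazily by a daemon as the later commits describe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SECTORS 8u   /* hypothetical: sectors covered by one bit */
#define NCHUNKS       16u  /* hypothetical: chunks in the toy array    */

struct intent_bitmap {
    uint8_t  bits[NCHUNKS / 8];  /* one bit per chunk; a real bitmap is persisted */
    unsigned pending[NCHUNKS];   /* writes currently in flight per chunk          */
};

/* Map a sector number to the chunk (bit index) that covers it. */
static unsigned chunk_of(uint64_t sector)
{
    unsigned c = (unsigned)(sector / CHUNK_SECTORS);
    assert(c < NCHUNKS);
    return c;
}

static void ib_init(struct intent_bitmap *b)
{
    memset(b, 0, sizeof *b);
}

/* Record the intent to write: must happen before the data write is issued. */
static void ib_start_write(struct intent_bitmap *b, uint64_t sector)
{
    unsigned c = chunk_of(sector);
    b->pending[c]++;
    b->bits[c / 8] |= (uint8_t)(1u << (c % 8));
}

/* A write finished: clear the bit only when no write to the chunk remains. */
static void ib_end_write(struct intent_bitmap *b, uint64_t sector)
{
    unsigned c = chunk_of(sector);
    assert(b->pending[c] > 0);
    if (--b->pending[c] == 0)
        b->bits[c / 8] &= (uint8_t)~(1u << (c % 8));
}

/* After an unclean shutdown, only chunks with their bit set need resync. */
static int ib_needs_resync(const struct intent_bitmap *b, unsigned c)
{
    return (b->bits[c / 8] >> (c % 8)) & 1;
}

/* Count the chunks a resync pass would have to visit. */
static unsigned ib_count_dirty(const struct intent_bitmap *b)
{
    unsigned c, n = 0;
    for (c = 0; c < NCHUNKS; c++)
        n += (unsigned)ib_needs_resync(b, c);
    return n;
}
```

In this sketch two overlapping writes to the same chunk keep its bit set until the second one completes, which is the property that lets a resync skip every chunk whose bit is clear.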


147 Commits