linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Christoph Hellwig	a885c8c431	[PATCH] Add block_device_operations.getgeo block device method HDIO_GETGEO is implemented in most block drivers, and all of them have to duplicate the code to copy the structure to userspace, as well as getting the start sector. This patch moves that to common code [1] and adds a ->getgeo method to fill out the raw kernel hd_geometry structure. For many drivers this means ->ioctl can go away now. [1] the s390 block drivers are odd in this respect. xpram sets ->start to 4 always which seems more than odd, and the dasd driver shifts the start offset around, probably because of it's non-standard sector size. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@suse.de> Cc: <mike.miller@hp.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo Giarrusso <blaisorblade@yahoo.it> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:54 -08:00
NeilBrown	88202a0c84	[PATCH] md: allow sync-speed to be controlled per-device Also export current (average) speed and status in sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:10 -08:00
NeilBrown	6d7ff7380b	[PATCH] md: support adding new devices to md arrays via sysfs Writing major:minor to md/new_dev will bind that device to the array. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:10 -08:00
NeilBrown	83303b613d	[PATCH] md: allow available size of component devices to be set via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:10 -08:00
Andrew Morton	6961ece46c	[PATCH] md-export-rdev-data_offset-via-sysfs-fix drivers/md/md.c: In function `offset_show': drivers/md/md.c:1670: warning: long long unsigned int format, different type arg (arg 3) Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:10 -08:00
NeilBrown	93c8cad03f	[PATCH] md: export rdev->data_offset via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	014236d2b8	[PATCH] md: expose device slot information via sysfs This the role that a device has in an array can be viewed and set. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	2bf071bf50	[PATCH] md: keep better track of dev/array size when assembling md arrays Move the checks - that dev size is never less than array size - into bind_rdev_to_array to make sure it always happens properly (there is one place where currently it doesn't). Also reject any superblock which claims an array size smaller than the device in question can hold. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	da943b9912	[PATCH] md: allow md/raid_disks to be settable If array is active, try to reshape, else just set the value. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	4dbcdc751c	[PATCH] md: count corrected read errors per drive Store this total in superblock (As appropriate), and make it available to userspace via sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	d9d166c2a9	[PATCH] md: allow array level to be set textually via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:09 -08:00
NeilBrown	8bb93aaca2	[PATCH] md: expose md metadata format in sysfs Allow it to be set to a particular version, or 'none'. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
NeilBrown	a35b0d695d	[PATCH] md: allow md array component size to be accessed and set via sysfs Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
NeilBrown	3b34380ae8	[PATCH] md: allow chunk_size to be settable through sysfs ... only before array is started of course. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
NeilBrown	03c902e17f	[PATCH] md: fix rdev->pending counts in raid1 When we do a user-requested check/repair, we lose count of the outstanding requests... Also make sure that when anything is written to md/sync_action, the RECOVERY_NEEDED flag is set and the thread is woken up so any changes take effect. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
NeilBrown	c708443c00	[PATCH] md: make sure bitmap updates are visible through filesystem When we update a page_cache page in the kernel, we need to flush_dache_page or userspace might not see the change. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:08 -08:00
Adrian Bunk	07dbd37727	[PATCH] drivers/md/md.c: make md_new_event() static Make the needlessly global function md_new_event() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	2989ddbd6e	[PATCH] md: make a couple of names in md.c static .. because they aren't used outside md.c Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	f188593ee7	[PATCH] md: fix typo in comment Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	bce74dac08	[PATCH] md: helper function to match commands written to sysfs files Commands written to sysfs files may, or my not, be \n terminated. We want to accept with case. For this we use cmd_match. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	1345b1d8ad	[PATCH] md: define and use safe_put_page for md md sometimes call put_page on NULL pointers (treating it like kfree). This is not safe, so define and use a 'safe_put_page' which checks for NULL. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	7dd5d34c6c	[PATCH] md: remove inappropriate limits in md/bitmap configuration. The kernel should not be imposing these policy limits: The time between bitmap updates should certainly be allowed to be more than 15 seconds, and if someone wants a bitmap chunk size in excess of 4MB, the kernel isn't the place to stop them. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:07 -08:00
NeilBrown	097426f689	[PATCH] md: fix possible problem in raid1/raid10 error overwriting The code to overwrite/reread for addressing read errors in raid1/raid10 currently assumes that the read will not alter the buffer which could be used to write to the next device. This is not a safe assumption to make. So we split the loops into a overwrite loop and a separate re-read loop, so that the writing is complete before reading is attempted. Cc: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	2604b703b6	[PATCH] md: remove personality numbering from md md supports multiple different RAID level, each being implemented by a 'personality' (which is often in a separate module). These personalities have fairly artificial 'numbers'. The numbers are use to: 1- provide an index into an array where the various personalities are recorded 2- identify the module (via an alias) which implements are particular personality. Neither of these uses really justify the existence of personality numbers. The array can be replaced by a linked list which is searched (array lookup only happens very rarely). Module identification can be done using an alias based on level rather than 'personality' number. The current 'raid5' modules support two level (4 and 5) but only one personality. This slight awkwardness (which was handled in the mapping from level to personality) can be better handled by allowing raid5 to register 2 personalities. With this change in place, the core md module does not need to have an exhaustive list of all possible personalities, so other personalities can be added independently. This patch also moves the check for chunksize being non-zero into the ->run routines for the personalities that need it, rather than having it in core-md. This has a side effect of allowing 'faulty' and 'linear' not to have a chunk-size set. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	a24a8dd858	[PATCH] md: break out of a loop that doesn't need to run to completion Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	a8745db232	[PATCH] md: convert recently exported symbol to GPL ...because that seems to be the preferred practice these days. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	ea03aff93b	[PATCH] md: convert various kmap calls to kmap_atomic Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	fccddba060	[PATCH] md: tidy up raid5/6 hash table code - replace open-coded hash chain with hlist macros - Fix hash-table size at one page - it is already quite generous, so there will never be a need to use multiple pages, so no need for __get_free_pages No functional change. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:06 -08:00
NeilBrown	9ffae0cf3e	[PATCH] md: convert md to use kzalloc throughout Replace multiple kmalloc/memset pairs with kzalloc calls. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	2d1f3b5d1b	[PATCH] md: clean up 'page' related names in md Substitute: page_cache_get -> get_page page_cache_release -> put_page PAGE_CACHE_SHIFT -> PAGE_SHIFT PAGE_CACHE_SIZE -> PAGE_SIZE PAGE_CACHE_MASK -> PAGE_MASK __free_page -> put_page because we aren't using the page cache, we are just using pages. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	d7603b7e3a	[PATCH] md: make /proc/mdstat pollable With this patch it is possible to poll /proc/mdstat to detect arrays appearing or disappearing, to detect failures, recovery starting, recovery completing, and devices being added and removed. It is similar to the poll-ability of /proc/mounts, though different in that: We always report that the file is readable (because face it, it is, even if only for EOF). We report POLLPRI when there is a change so that select() can detect it as an exceptional event. Not only are these exceptional events, but that is the mechanism that the current 'mdadm' uses to watch for events (It also polls after a timeout). (We also report POLLERR like /proc/mounts). Finally, we only reset the per-file event counter when the start of the file is read, rather than when poll() returns an event. This is more robust as it means that an fd will continue to report activity to poll/select until the program clearly responds to that activity. md_new_event takes an 'mddev' which isn't currently used, but it will be soon. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	0eb3ff12aa	[PATCH] md: raid10 read-error handling - resync and read-only Add in correct read-error handling for resync and read-only situations. When read-only, we don't over-write, so we need to mark the failed drive in the r10_bio so we don't re-try it. During resync, we always read all blocks, so if there is a read error, we simply over-write it with the good block that we found (assuming we found one). Note that the recovery case still isn't handled in an interesting way. There is nothing useful to do for the 2-copies case. If there are 3 or more copies, then we could try reading from one of the non-missing copies, but this is a bit complicated and very rarely would be used, so I'm leaving it for now. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	4443ae10ca	[PATCH] md: auto-correct correctable read errors in raid10 Largely just a cross-port from raid1. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:05 -08:00
NeilBrown	220946c901	[PATCH] md: make sure read error on last working drive of raid1 actually returns failure We are inadvertently setting the R1BIO_Uptodate bit on read errors when we decide not to try correcting (because there are no other working devices). This means that the read error is reported to the client as success. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	d11c171e63	[PATCH] md: allow raid1 to check consistency Where performing a user-requested 'check' or 'repair', we read all readable devices, and compare the contents. We only write to blocks which had read errors, or blocks with content that differs from the first good device found. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	18f08819f4	[PATCH] md: support check-without-repair of raid10 arrays Also keep count on the number of errors found. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	9910f16af3	[PATCH] md: fix up some rdev rcu locking in raid5/6 There is this "FIXME" comment with a typo in it!! that been annoying me for days, so I just had to remove it. conf->disks[i].rdev should only be accessed if - we know we hold a reference or - the mddev->reconfig_sem is down or - we have a rcu_readlock handle_stripe was referencing rdev in three places without any of these. For the first two, get an rcu_readlock. For the last, the same access (md_sync_acct call) is made a little later after the rdev has been claimed under and rcu_readlock, if R5_Syncio is set. So just use that access... However R5_Syncio isn't really needed as the 'syncing' variable contains the same information. So use that instead. Issues, comment, and fix are identical in raid5 and raid6. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	cf30a473a0	[PATCH] md: handle errors when read-only Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	69382e8537	[PATCH] md: better handling for read error in raid1 during resync Handling of read errors during resync is separate from handling of read errors during normal IO in raid1. A previous patch added support for read errors during normal IO. This one adds support for read errors during resync or recovery. The key differences are that we don't need to freeze the array, because the normal handling of resync means that this part of the array will be idle except for resync, and the read/overwrite/re-read is needed in a separate piece of code. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:04 -08:00
NeilBrown	3e198f7826	[PATCH] md: tidyup some issues with raid1 resync and prepare for catching read errors We are dereferencing ->rdev without an rcu lock! Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	ddaf22abaa	[PATCH] md: attempt to auto-correct read errors in raid1 On a read-error we suspend the array, then synchronously read the block from other arrays until we find one where we can read it. Then we try writing the good data back everywhere and make sure it works. If any write or subsequent read fails, only then do we fail the device out of the array. To be able to suspend the array, we need to also keep track of how many requests are queued for handling by raid1d. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	d69762e984	[PATCH] md: improve handing of read errors with raid6 This is a simple port of match functionality across from raid5. If we get a read error, we don't kick the drive straight away, but try to over-write with good data first. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	ca65b73bd9	[PATCH] md: fix raid6 resync check/repair code raid6 currently does not check the P/Q syndromes when doing a resync, it just calculates the correct value and writes it. Doing the check can reduce writes (often to 0) for a resync, and it is needed to properly implement the echo check > sync_action operation. This patch implements the appropriate checks and tidies up some related code. It also allows raid6 user-requested resync to bypass the intent bitmap. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	6cce3b23f6	[PATCH] md: write intent bitmap support for raid10 Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	b15c2e57f0	[PATCH] md: move bitmap_create to after md array has been initialised This is important because bitmap_create uses mddev->resync_max_sectors and that doesn't have a valid value until after the array has been initialised (with pers->run()). [It doesn't make a difference for current personalities that support bitmaps, but will make a difference for raid10] This has the added advantage of meaning with can move the thread->timeout manipulation inside the bitmap.c code instead of sprinkling identical code throughout all personalities. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:03 -08:00
NeilBrown	6ff8d8ec06	[PATCH] md: allow dirty raid[456] arrays to be started at boot See patch to md.txt for more details Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:02 -08:00
NeilBrown	14f8d26b8e	[PATCH] md: small cleanups for raid5 Resync code: A test that isn't needed, a 'compute_block' that makes more sense elsewhere (And then doesn't need a test), a couple of BUG_ONs to confirm the change makes sense. Printks: A few were missing KERN_* Also fix a typo in a comment.. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:02 -08:00
NeilBrown	0a27ec96b6	[PATCH] md: improve raid10 "IO Barrier" concept raid10 needs to put up a barrier to new requests while it does resync or other background recovery. The code for this is currently open-coded, slighty obscure by its use of two waitqueues, and not documented. This patch gathers all the related code into 4 functions, and includes a comment which (hopefully) explains what is happening. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:02 -08:00
NeilBrown	17999be4aa	[PATCH] md: improve raid1 "IO Barrier" concept raid1 needs to put up a barrier to new requests while it does resync or other background recovery. The code for this is currently open-coded, slighty obscure by its use of two waitqueues, and not documented. This patch gathers all the related code into 4 functions, and includes a comment which (hopefully) explains what is happening. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Darrick J. Wong	ac81b2ee45	[PATCH] make dm-mirror not issue invalid resync requests I've been attempting to set up a (Host)RAID mirror with dm_mirror on 2.6.14.3, and I've been having a strange little problem. The configuration in question is a set of 9GB SCSI disks that have 17942584 sectors. I set up the dm_mirror table as such: 0 17942528 mirror core 2 2048 nosync 2 8:48 0 8:64 0 If I'm not mistaken, this sets up a 9GB RAID1 mriror with 1MB stripes across both SCSI disks. The sector count of the dm device is less than the size of the disks, so we shouldn't fall off the end. However, I always get the messages like this in dmesg when I set up the dm table: attempt to access beyond end of device sdd: rw=0, want=17958656, limit=17942584 Clearly, something is trying to read sectors past the end of the drive. I traced it down to the __rh_recovery_prepare function in dm-raid1.c, which gets called when we're putting the mirror set together. This function calls the dirty region log's get_resync_work function to see if there's any resync that needs to be done, and queues up any areas that are out of sync. The log's get_resync_work function is actually a pointer to the core_get_resync_work function in dm-log.c. The core_get_resync_work function queries a bitset lc->sync_bits to find out if there are any regions that are out of date (i.e. the bit is 0), which is where the problem occurs. If every bit in lc->sync_bits is 1 (which is the case when we've just configured a new RAID1 with the nosync option), the find_next_zero_bit does NOT return the size parameter (lc->region_count in this case), it returns the size parameter rounded up to the nearest multiple of 32! I don't know if this is intentional, but i386 and x86_64 both exhibit this behavior. In any case, the statement "if (*region == lc->region_count)" looks like it's supposed to catch the case where are no regions to resync and return 0. Since find_next_zero_bit apparently has a habit of returning a value that's larger than lc->region_count, the enclosed patch changes the equality test to a greater-than test so that we don't try to resync areas outside of the RAID1 region. Seeing as the HostRAID metadata lives just past the end of the RAID1 data, mucking around in that area is not a good idea. I suppose another way to fix this would be to amend find_next_zero_bit so that it doesn't return values larger than "size", but I don't know if there's a reason for the current behavior. Signed-Off-By: Darrick J. Wong <djwong@us.ibm.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Stefan Rompf	9d3520a339	[PATCH] dm-crypt: zero key before freeing it Zap the memory before freeing it so we don't leave crypto information around in memory. Signed-off-by: Stefan Rompf <stefan@loplof.de> Acked-by: Clemens Fruhwirth <clemens@endorphin.org> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Adrian Bunk	0b56306e56	[PATCH] drivers/md/kcopyd.c: #if 0 kcopyd_cancel() This patch #if 0's the not yet implemented global function kcopyd_cancel(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Alasdair G Kergon	6da487dcc0	[PATCH] device-mapper ioctl: add skip lock_fs flag Add ioctl DM_SKIP_LOCKFS_FLAG for userspace to request that lock_fs is bypassed when suspending a device. There's no change to the behaviour of existing code that doesn't know about the new flag. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Alasdair G Kergon	aa8d7c2fbe	[PATCH] device-mapper: make lock_fs optional Devices only needs syncing when creating snapshots, so make this optional when suspending a device. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:01 -08:00
Alasdair G Kergon	e39e2e95eb	[PATCH] device-mapper: rename frozen_bdev Rename frozen_bdev to suspended_bdev and move the bdget outside lockfs. (This prepares for making lockfs optional.) Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
Jonathan E Brassow	a1a1908070	[PATCH] device-mapper raid1: add default mirror This patch introduces a new field to the mirror_set (default_mirror) to store the default mirror. (A subsequent patch will allow us to change the default mirror in the event of a failure.) Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
Alasdair G Kergon	2d5fe68987	[PATCH] device-mapper: scanf sector format change Use %llu not %Lu in sscanf/printf format strings. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
Andrew Stribblehill	e6c276159c	[PATCH] device-mapper: remove unused definition This patch removes an unused #define. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
Alasdair G Kergon	2d38fe2044	[PATCH] device-mapper snapshot: metadata reading separation More snapshot metadata reading into separate function, to prepare for changing the place it gets called from. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
goggin, edward	81f1777a55	[PATCH] device-mapper ioctl: event on rename After changing the name of a mapped device, trigger a dm event. (For userspace multipath tools.) Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
David Teigland	d229a9589f	[PATCH] device-mapper: add dm_get_md Add dm_get_dev() to get a mapped device given its dev_t. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:34:00 -08:00
David Teigland	637842cfdb	[PATCH] device-mapper: add dm_find_md Abstract dm_find_md() from dm_get_mdptr() to allow use elsewhere. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:59 -08:00
Linus Torvalds	25c862cc9e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild	2006-01-04 16:36:52 -08:00
Linus Torvalds	f61ea1b0c8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6	2006-01-04 16:30:12 -08:00
Brian Gerst	352dd1df32	gitignore: misc files Ignore all files generated from *_shipped files, plus a few others. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2006-01-01 22:21:50 +01:00
Neil Brown	bcb97940f3	[PATCH] md: Change case of raid level reported in sys/mdX/md/level I had thought that keeping the reported tail level clearly different from the module name was a good idea, but I've changed my mind. 'raid5' is better and probably less confusing than 'RAID-5'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-19 16:47:50 -08:00
Mike Christie	defd94b754	[SCSI] seperate max_sectors from max_hw_sectors - export __blk_put_request and blk_execute_rq_nowait needed for async REQ_BLOCK_PC requests - seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was already testing against max_sectors and SCSI-ml was setting max_sectors and max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set a valid max_hw_sectors for all LLDs. Today if a LLD does not set it SCSI-ml sets it to a safe default and some LLDs set it to a artificial low value to overcome memory and feedback issues. Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024, drivers that used to call blk_queue_max_sectors with a large value of max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-15 15:11:40 -08:00
NeilBrown	5036805be7	[PATCH] md: use correct size of raid5 stripe cache when measuring how full it is The raid5 stripe cache was recently changed from fixed size (NR_STRIPES) to variable size (conf->max_nr_stripes). However there are two places that still use the constant and as a result, reducing the size of the stripe cache can result in a deadlock. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-12 09:06:04 -08:00
NeilBrown	3795bb0fc5	[PATCH] md: fix a use-after-free bug in raid1 Who would submit code with a FIXME like that in it !!!! Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-12 09:06:04 -08:00
NeilBrown	6aea114a72	[PATCH] md: fix --re-add for raid1 and raid6 If you have an array with a write-intent-bitmap, and you remove a device, then re-add it, a full recovery isn't needed. We detect a re-add by looking at saved_raid_disk. For raid1, it doesn't matter which disk it was, only whether or not it was an active device. The old code being removed set a value of 'mirror' which was then ignored, so it can go. The changed code performs the correct check. For raid6, if there are two missing devices, make sure we chose the right slot on --re-add rather than always the first slot. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-28 14:42:26 -08:00
NeilBrown	b2a2703c28	[PATCH] md: set default_bitmap_offset properly in set_array_info If an array is created using set_array_info, default_bitmap_offset isn't set properly meaning that an internal bitmap cannot be hot-added until the array is stopped and re-assembled. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-28 14:42:25 -08:00
NeilBrown	b5ab28a3b8	[PATCH] md: fix problem with raid6 intent bitmap When doing a recovery, we need to know whether the array will still be degraded after the recovery has finished, so we can know whether bits can be clearred yet or not. This patch performs the required check. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-28 14:42:25 -08:00
NeilBrown	700e432d83	[PATCH] md: fix locking problem in r5/r6 bitmap_unplug actually writes data (bits) to storage, so we shouldn't be holding a spinlock... Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-28 14:42:25 -08:00
NeilBrown	22dfdf5212	[PATCH] md: improve read speed to raid10 arrays using 'far copies' raid10 has two different layouts. One uses near-copies (so multiple copies of a block are at the same or similar offsets of different devices) and the other uses far-copies (so multiple copies of a block are stored a greatly different offsets on different devices). The point of far-copies is that it allows the first section (normally first half) to be layed out in normal raid0 style, and thus provide raid0 sequential read performance. Unfortunately, the read balancing in raid10 makes some poor decisions for far-copies arrays and you don't get the desired performance. So turn off that bad bit of read_balance for far-copies arrays. With this patch, read speed of an 'f2' array is comparable with a raid0 with the same number of devices, though write speed is ofcourse still very slow. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-28 14:42:25 -08:00
Jonathan E Brassow	7692c5dd48	[PATCH] device-mapper raid1: drop mark_region spinlock fix The spinlock region_lock is held while calling mark_region which can sleep. Drop the spinlock before calling that function. A region's state and inclusion in the clean list are altered by rh_inc and rh_dec. The state variable is set to RH_CLEAN in rh_dec, but only if 'pending' is zero. It is set to RH_DIRTY in rh_inc, but not if it is already so. The changes to 'pending', the state, and the region's inclusion in the clean list need to be atomicly. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:31 -08:00
jblunck@suse.de	233886dd32	[PATCH] device-mapper snapshot: bio_list fix bio_list_merge() should do nothing if the second list is empty - not oops. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:31 -08:00
Stefan Bader	640eb3b045	[PATCH] device-mapper dm-mpath: endio spinlock fix do_end_io() can be called without interrupts blocked. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:31 -08:00
Alasdair G Kergon	0e56822d30	[PATCH] device-mapper: mirror log bitset fix The linux bitset operators (test_bit, set_bit etc) work on arrays of "unsigned long". dm-log uses such bitsets but treats them as arrays of uint32_t, only allocating and zeroing a multiple of 4 bytes (as 'clean_bits' is a uint32_t). The patch below fixes this problem. The problem is specific to 64-bit big endian machines such as s390x or ppc-64 and can prevent pvmove terminating. In the simplest case, if "region_count" were (say) 30, then bitset_size (below) would be 4 and bitset_uint32_count would be 1. Thus the memory for this butset, after allocation and zeroing would be 0 0 0 0 X X X X On a bigendian 64bit machine, bit 0 for this bitset is in the 8th byte! (and every bit that dm-log would use would be in the X area). 0 0 0 0 X X X X ^ here which hasn't been cleared properly. As the dm-raid1 code only syncs and counts regions which have a 0 in the 'sync_bits' bitset, and only finishes when it has counted high enough, a large number of 1's among those 'X's will cause the sync to not complete. It is worth noting that the code uses the same bitsets for in-memory and on-disk logs. As these bitsets are host-endian and host-sized, this means that they cannot safely be moved between computers with Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:31 -08:00
Alasdair G Kergon	c4cc66351a	[PATCH] device-mapper: list_versions fix In some circumstances the LIST_VERSIONS output is truncated because the size calculation forgets about a 'uint32_t' in each structure - but the inclusion of the whole of ALIGN_MASK frequently compensates for the omission. This is a quick workaround to use an upper bound. (The code ought to be fixed to supply the actual size.) Running 'dmsetup targets' may demonstrate the problem: when I run it, the last line comes out as 'erro' instead of 'error'. Consequently, 'lvcreate --type error' doesn't work. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:31 -08:00
Kiyoshi Ueda	b6fcc80d03	[PATCH] device-mapper dm-ioctl: missing put in table load error case An error path in table_load() forgets to release a table that won't now be referenced. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-22 09:14:30 -08:00
NeilBrown	c0e485216d	[PATCH] md: fix is_mddev_idle calculation now that disk/sector accounting happens when request completes md needs to monitor the rate of requests to its devices when doing resync/recovery so that it can back-off when there is non-resync IO. It does this by comparing resync IO, which it counts, with total IO which is taken from disk_stats. disk_stats were recently changed to account sectors when a request completes instead of when it is queued. This upsets md's calculations. We could do the sync_io accounting at the end of requests too, but that has problems. If an underlying device is an md array, the accounting will still be done when the request is submitted. This could be changed for some raid levels, but it cannot be changed for raid0 or linear without substantial code changes. So instead, we increase the error that is_mddev_idle allows, up to the maximum amount of resync IO that can be in flight at any time. The calculation is current fragile as each personality as different limits for in-flight resync. This should be fixed up. For now, this simple patch fixes the problem. Increasing the error margin decreases the sensitivity to non-resync IO. To partially compensate for this, the time to wait when non-resync IO is detected is increased so that less steady IO is required to keep the resync at bay. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-18 07:49:46 -08:00
Neil Brown	34ef75f09f	[PATCH] md: don't pass a NULL file* into ->prepare_write() Some filesystems go oops. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-18 07:49:46 -08:00
NeilBrown	93588e2284	[PATCH] md: make md threads interruptible again Despite the fact that md threads don't need to be signalled, and won't respond to signals anyway, we need to have an 'interruptible' wait, else they stay in 'D' state and add to the load average. (akpm: the signal_pending() test is unneeded - we'll fix that up in the next round. For now, leave it there because that's how the code used to be). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-15 08:59:19 -08:00
NeilBrown	e8a0033451	[PATCH] md: mark START_ARRAY deprecated with a date This was marked deprecated "after 2.6" back in the 2.5 days. But now it seems there isn't going to be any "after 2.6", and we deprecate by date now. So set a date. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-15 08:59:19 -08:00
NeilBrown	bb636547b0	[PATCH] md: document sysfs usage of md, and make a couple of small refinements Document in Documentation/md.txt the files that now appear in sysfs, and make a couple of small refinements to exactly when 'level' and 'raid_disks' are empty, to make it match the documentation. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:40 -08:00
NeilBrown	7eec314d75	[PATCH] md: improve 'scan_mode' and rename it to 'sync_action' The current sync_action for an array can be one of idle - nothing happening resync - reduncancy being recalcualted recover - missing device being recoverred to spare check - user initiated check of redundancy repair - like resync but user-initiated and ignores bitmap optimisation. Each of these strings can also be written to the 'sync_action' file to cause that action to happen (if appropriate). While 'sync' is not technically correct, as a recovery is not a 'sync', I think it is the most servicable word here. Also 'action' is a strong word than 'mode'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:40 -08:00
NeilBrown	787453c239	[PATCH] md: complete conversion of md to use kthreads There are a few loose ends following the conversion of md to use kthreads: - Some fields in mdk_thread_t that aren't needed (kthreads does it's own completion and manages it's own name). - thread->run is now never NULL, so no need to check - Some tests for signal_pending that aren't needed (As we don't use signals to stop threads any more) - Some flush_signals are not needed - Some waits are interruptible and don't need to be. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:40 -08:00
NeilBrown	fd9d49cac4	[PATCH] md: ignore auto-readonly flag for arrays where it isn't meaningful The 'auto-readonly' flag (which suppresses resync and superblock updates until the first write) is not meaningful for personalities that don't support resync or superblock writes (raid0, linear, etc). So clear the setting early to avoid it confusing anything - e.g. appearing in /proc/mdstat Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	8e1b39d623	[PATCH] md: only try to print recovery/resync status for personalities that support recovery The introduction of 'resync=PENDING' (for read-only devices) caused that message to appear for non-syncable arrays like raid0 and linear. Simplest thing is to not try to print any resync info unless the personality clearly supports it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	411036fa19	[PATCH] md: split off some md attributes in sysfs to a separate group Some, but not all, md array support data redundancy and hence support checking and restoring that redundancy (resync, rebuild). Some attributes apply specifically to functions involving this redundancy, and so should only appear for md arrays for which they are meaningful. i.e. they should not appear for raid0, linear, multpath, faulty. This patch separates these into a distinct group and creates the group only if the personality supports sync_request. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	96de1e663c	[PATCH] md: fix some locking and module refcounting issues with md's use of sysfs 1/ I really should be using the __ATTR macros for defining attributes, so that the .owner field get set properly, otherwise modules can be removed while sysfs files are open. This also involves some name changes of _show routines. 2/ Always lock the mddev (against reconfiguration) for all sysfs attribute access. This easily avoid certain races and is completely consistant with other interfaces (ioctl and /proc/mdstat both always lock against reconfiguration). 3/ raid5 attributes must check that the 'conf' structure actually exists (the array could have been stopped while an attribute file was open). 4/ A missing 'kfree' from when the raid5_conf_t was converted to have a kobject embedded, and then converted back again. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	3855ad9f39	[PATCH] md: make sure a user-request sync of raid5 ignores intent bitmap A sync of raid5 usually ignore blocks which the bitmap says are in-sync. But a user-request check or repair should not ignore these. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	e5de485f00	[PATCH] md: make manual repair work for raid1 Raid1 currently optimises resync using the intent bitmap etc. This optimisation is not wanted when we explicitly request a repair through sysfs, so add appropriate checks. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	f637b9f9fc	[PATCH] md: make sure /block link in /sys/.../md/ goes to correct devices If a block_device is a partition, then it's kobject is bdev->bd_part->kobj otherwise (if it is a full device), the kobject is bdev->bd_disk->kobj As md wants back-links to the correct object (whether partition or not), we need to respect this difference... (Thus current code shows a link to the whole device, whether we are using a partition or not, which is wrong). Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:39 -08:00
NeilBrown	f91de92ed6	[PATCH] md: allow md arrays to be started read-only (module parameter). When an md array is started, the superblock will be written, and resync may commense. This is not good if you want to be completely read-only as, for example, when preparing to resume from a suspend-to-disk image. So introduce a module parameter "start_ro" which can be set to '1' at boot, at module load, or via /sys/module/md_mod/parameters/start_ro When this is set, new arrays get an 'auto-ro' mode, which disables all internal io (superblock updates, resync, recovery) and is automatically switched to 'rw' when the first write request arrives. The array can be set to true 'ro' mode using 'mdadm -r' before the first write request, or resync can be started without a write using 'mdadm -w'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	19133a4298	[PATCH] md: Remove attempt to use dynamic names in sysfs for component devices on an MD array. With version-0.90 superblock, component devices on an md device to not have any stable name related to the array -(version-1 assigns a fixed index when a device is added to an array, and this remains despit any hot-swap). The intial code for making these devices appear in sysfs used dynamic names, which would change whenever a hot-spare was swapped for a failed or missing device. This turns out not to be practical in sysfs for a number of reasons. This patch changes then naming of component devices to be based on the result of 'bdevname'. This is stable and should be unique. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	a9701a3047	[PATCH] md: support BIO_RW_BARRIER for md/raid1 We can only accept BARRIER requests if all slaves handle barriers, and that can, of course, change with time.... So we keep track of whether the whole array seems safe for barriers, and also whether each individual rdev handles barriers. We initially assumes barriers are OK. When writing the superblock we try a barrier, and if that fails, we flag things for no-barriers. This will usually clear the flags fairly quickly. If writing the superblock finds that BIO_RW_BARRIER is -ENOTSUPP, we need to resubmit, so introduce function "md_super_wait" which waits for requests to finish, and retries ENOTSUPP requests without the barrier flag. When writing the real raid1, write requests which were BIO_RW_BARRIER but which aresn't supported need to be retried. So raid1d is enhanced to do this, and when any bio write completes (i.e. no retry needed) we remove it from the r1bio, so that devices needing retry are easy to find. We should hardly ever get -ENOTSUPP errors when writing data to the raid. It should only happen if: 1/ the device used to support BARRIER, but now doesn't. Few devices change like this, though raid1 can! or 2/ the array has no persistent superblock, so there was no opportunity to pre-test for barriers when writing the superblock. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	bd926c63b7	[PATCH] md: make md on-disk bitmaps not host-endian Current bitmaps use set_bit et.al and so are host-endian, which means not-portable. Oops. Define a new version number (4) for which bitmaps are little-endian. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	b2d444d7ad	[PATCH] md: convert 'faulty' and 'in_sync' fields to bits in 'flags' field This has the advantage of removing the confusion caused by 'rdev_t' and 'mddev_t' both having 'in_sync' fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	ba22dcbf10	[PATCH] md: improvements to raid5 handling of read errors Two refinements to the 'attempt-overwrite-on-read-error' mechanism. 1/ If the array is read-only, don't attempt an over-write. 2/ If there are more than max_nr_stripes read errors on a device with no success, fail the drive. This will make sure a dead drive will be eventually kicked even when we aren't trying to rewrite (which would normally kick a dead drive more quickly. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	007583c925	[PATCH] md: change raid5 sysfs attribute to not create a new directory There isn't really a need for raid5 attributes to be an a subdirectory, so this patch moves them from /sys/block/mdX/md/raid5/attribute to /sys/block/mdX/md/attribute This suggests that all md personalities should co-operate about namespace usage, but that shouldn't be a problem. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:38 -08:00
NeilBrown	31399d9e56	[PATCH] md: minor MD fixes 1/ Use reduce stack usage, because 'gcc' apparently doesn't overlay different variables that are in separate scopes... 2/ Use test_bit instead of ( .. & 1<< ..) which in this case is buggy. Thanks to Andrew Morton Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	9c79197761	[PATCH] md: fix ref-counting problems with kobjects in md Thanks Greg. Cc: Greg KH <greg@kroah.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
Suzanne Wood	d6065f7bf8	[PATCH] md: provide proper rcu_dereference / rcu_assign_pointer annotations in md Acked-by: <paulmck@us.ibm.com> Signed-off-by: Suzanne Wood <suzannew@cs.pdx.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	9d88883e68	[PATCH] md: teach raid5 the difference between 'check' and 'repair'. With this, raid5 can be asked to check parity without repairing it. It also keeps a count of the number of incorrect parity blocks found (mismatches) and reports them through sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	24dd469d72	[PATCH] md: allow a manual resync with md You can trigger a 'check' with echo check > /sys/block/mdX/md/scan_mode or a check-and-repair errors with echo repair > /sys/block/mdX/md/scan_mode and read the current state from the same file. Note: personalities need to know the different between 'check' and 'repair', but don't yet. Until they do, 'check' will be the same as 'repair' and will just do a normal resync pass. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	3f294f4fb6	[PATCH] md: add kobject/sysfs support to raid5 /sys/block/mdX/md/raid5/ contains raid5-related attributes. Currently stripe_cache_size is number of entries in stripe cache, and is settable. stripe_cache_active is number of active entries, and in only readable. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	86e6ffdd24	[PATCH] md: extend md sysfs support to component devices. Each device in an md array how has a corresponding /sys/block/mdX/md/devNN/ directory which can contain attributes. Currently there is only 'state' which summarises the state, nd 'super' which has a copy of the superblock, and 'block' which is a symlink to the block device. Also, /sys/block/mdX/md/rdNN represents slot 'NN' in the array, and is a symlink to the relevant 'devNN'. Obviously spare devices do not have a slot in the array, and so don't have such a symlink. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:37 -08:00
NeilBrown	eae1701fbd	[PATCH] md: initial sysfs support for md Start using kobjects in mddevs, and provide a couple of simple attributes (level and disks). Attributes live in /sys/block/mdX/md/attr-name Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:36 -08:00
NeilBrown	4e5314b56a	[PATCH] md: better handling of readerrors with raid5. This patch changes the behaviour of raid5 when it gets a read error. Instead of just failing the device, it tried to find out what should have been there, and writes it over the bad block. For some media-errors, this has a reasonable chance of fixing the error. If the write succeeds, and a subsequent read succeeds as well, raid5 decided the address is OK and conitnues. Instead of failing a drive on read-error, we attempt to re-write the block, and then re-read. If that all works, we allow the device to remain in the array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:56:36 -08:00
Olaf Hering	733482e445	[PATCH] changing CONFIG_LOCALVERSION rebuilds too much, for no good reason This patch removes almost all inclusions of linux/version.h. The 3 #defines are unused in most of the touched files. A few drivers use the simple KERNEL_VERSION(a,b,c) macro, which is unfortunatly in linux/version.h. There are also lots of #ifdef for long obsolete kernels, this was not touched. In a few places, the linux/version.h include was move to where the LINUX_VERSION_CODE was used. quilt vi `find * -type f -name "*.[ch]"\|xargs grep -El '(UTS_RELEASE\|LINUX_VERSION_CODE\|KERNEL_VERSION\|linux/version.h)'\|grep -Ev '(/(boot\|coda\|drm)/\|~$)'` search pattern: /UTS_RELEASE\\|LINUX_VERSION_CODE\\|KERNEL_VERSION\\|linux\/$utsname\\|version$.h Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-09 07:55:57 -08:00
Nishanth Aravamudan	66c006a551	[PATCH] drivers/md: fix-up schedule_timeout() usage Use schedule_timeout_interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:53:57 -08:00
Jens Axboe	a362357b6c	[BLOCK] Unify the seperate read/write io stat fields into arrays Instead of having ->read_sectors and ->write_sectors, combine the two into ->sectors[2] and similar for the other fields. This saves a branch several places in the io path, since we don't have to care for what the actual io direction is. On my x86-64 box, that's 200 bytes less text in just the core (not counting the various drivers). Signed-off-by: Jens Axboe <axboe@suse.de>	2005-11-01 09:26:16 +01:00
David Hardeman	378f058cc4	[PATCH] Use sg_set_buf/sg_init_one where applicable This patch uses sg_set_buf/sg_init_one in some places where it was duplicated. Signed-off-by: David Hardeman <david@2gen.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-10-30 11:19:43 +11:00
Al Viro	b4e3ca1ab1	[PATCH] gfp_t: remaining bits of drivers/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-28 08:16:51 -07:00
NeilBrown	8712e55356	[PATCH] md: make sure mdthreads will always respond to kthread_stop There are still a couple of cases where md threads (the resync/recovery thread) is not interruptible since the change to use kthreads. All places there it tests "signal_pending", it should also test kthread_should_stop, as with this patch. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-26 10:39:42 -07:00
NeilBrown	6985c43f39	[PATCH] Three one-liners in md.c The main problem fixes is that in certain situations stopping md arrays may take longer than you expect, or may require multiple attempts. This would only happen when resync/recovery is happening. This patch fixes three vaguely related bugs. 1/ The recent change to use kthreads got the setting of the process name wrong. This fixes it. 2/ The recent change to use kthreads lost the ability for md threads to be signalled with SIG_KILL. This restores that. 3/ There is a long standing bug in that if: - An array needs recovery (onto a hot-spare) and - The recovery is being blocked because some other array being recovered shares a physical device and - The recovery thread is killed with SIG_KILL Then the recovery will appear to have completed with no IO being done, which can cause data corruption. This patch makes sure that incomplete recovery will be treated as incomplete. Note that any kernel affected by bug 2 will not suffer the problem of bug 3, as the signal can never be delivered. Thus the current 2.6.14-rc kernels are not susceptible to data corruption. Note also that if arrays are shutdown (with "mdadm -S" or "raidstop") then the problem doesn't occur. It only happens if a SIGKILL is independently delivered as done by 'init' when shutting down. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-19 23:04:30 -07:00
Al Viro	dd0fc66fb3	[PATCH] gfp flags annotations - part 1 - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-08 15:00:57 -07:00
Alasdair G Kergon	485ef69ede	[PATCH] device-mapper: Fix queue_if_no_path initialisation When creating a multipath device, if the queue_if_no_path parameter is specified it gets ignored. While the queue_if_no_path variable is correctly set to 1, the saved_queue_if_no_path gets set to 0. When the device is subsequently made live (resumed), the saved value (0) always overwrites the live value (1) so the option always gets turned off. The fix adds a parameter to the queue_if_no_path() function to indicate whether the previous value should be preserved or not - if not, as when the device is being set up, the saved value is set to the new value (1). Signed-Off-By: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-28 07:46:42 -07:00
goggin, edward	269fd2a6f8	[PATCH] device-mapper: Trigger an event when a table is deleted If anything is waiting on a device's table when the device is removed, we must first wake it up so it will release its reference. Otherwise the table's reference count will not drop to zero and the table will not get removed. Signed-Off-By: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-28 07:46:42 -07:00
H. Peter Anvin	d7e70ba45f	[PATCH] RAID6 Altivec fix This patch fixes a signedness bug with RAID6 for Altivec, and makes the Altivec code testable in userspace. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-17 11:49:58 -07:00
Adrian Bunk	338cec3253	[PATCH] merge some from Rusty's trivial patches This patch contains the most trivial from Rusty's trivial patches: - spelling fixes - remove duplicate includes Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:30 -07:00
Jesper Juhl	f9101210e7	[PATCH] vfree and kfree cleanup in drivers/ This patch does a full cleanup of 'NULL checks before vfree', and a partial cleanup of calls to kfree for all of drivers/ - the kfree bit is partial in that I only did the files that also had vfree calls in them. The patch also gets rid of some redundant (void *) casts of pointers being passed to [vk]free, and a some tiny whitespace corrections also crept in. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:30 -07:00
NeilBrown	87fc767b83	[PATCH] md: fix BUG when raid10 rebuilds without enough drives This shouldn't be a BUG. We should cope. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:15 -07:00
NeilBrown	6d508242b2	[PATCH] md: fix raid10 assembly when too many devices are missing If you try to assemble an array with too many missing devices, raid10 will now reject the attempt, instead of allowing it. Also check when hot-adding a drive and refuse the hot-add if the array is beyond hope. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:14 -07:00
NeilBrown	611815651b	[PATCH] md: really get sb_size setting right in all cases There was another case where sb_size wasn't being set, so instead do the sensible thing and set if when filling in the content of a superblock. That ensures that whenever we write a superblock, the sb_size MUST be set. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:14 -07:00
NeilBrown	188c18fd79	[PATCH] md: make sure the new 'sb_size' is set properly device added without pre-existing superblock. There are two ways to add devices to an md/raid array. It can have superblock written to it, and then given to the md driver, which will read the superblock (the new way) or md can be told (through SET_ARRAY_INFO) the shape of the array, and the told about individual drives, and md will create the required superblock (the old way). The newly introduced sb_size was only set for drives being added the new way, not the old ways. Oops :-( Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:14 -07:00
NeilBrown	b325a32e57	[PATCH] md: report spare drives in /proc/mdstat Just like failed drives have (F), so spare drives now have (S). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:14 -07:00
NeilBrown	1cd6bf19bb	[PATCH] md: add information about superblock version to /proc/mdstat Leave it unchanged if the original (0.90) is used, incase it might be a compatability problem. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:14 -07:00
NeilBrown	720a3dc39b	[PATCH] md: use queue_hardsect_size instead of block_size for md superblock size calc. Doh. I want the physical hard-sector-size, not the current block size... Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	53e87fbb5d	[PATCH] md: choose better default offset for bitmap. On reflection, a better default location for hot-adding bitmaps with version-1 superblocks is immediately after the superblock. There might not be much room there, but there is usually atleast 3k, and that is a good start. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	500af87abb	[PATCH] md: tidy up daemon stop/start code in md/bitmap.c The bitmap code used to have two daemons, so there is some 'common' start/stop code. But now there is only one, so the common code is just noise. This patch tidies this up somewhat. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	9ba00538ad	[PATCH] md: ensure bitmap_writeback_daemon handles shutdown properly. mddev->bitmap gets clearred before the writeback daemon is stopped. So the write_back daemon needs to be careful not to dereference the 'bitmap' if it is NULL. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	a6fb0934f9	[PATCH] md: use kthread infrastructure in md Switch MD to use the kthread infrastructure, to simplify the code and get rid of tasklist_lock abuse in md_unregister_thread. Also don't flush signals in md_thread, as the called thread will always do that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:13 -07:00
NeilBrown	934ce7c840	[PATCH] md: write-intent bitmap support for raid6 This is a direct port of the raid5 patch. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:12 -07:00
NeilBrown	72626685dc	[PATCH] md: add write-intent-bitmap support to raid5 Most awkward part of this is delaying write requests until bitmap updates have been flushed. To achieve this, we have a sequence number (seq_flush) which is incremented each time the raid5 is unplugged. If the raid thread notices that this has changed, it flushes bitmap changes, and assigned the value of seq_flush to seq_write. When a write request arrives, it is given the number from seq_write, and that write request may not complete until seq_flush is larger than the saved seq number. We have a new queue for storing stripes which are waiting for a bitmap flush and an extra flag for stripes to record if the write was 'degraded' and so should not clear the a bit in the bitmap. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:12 -07:00
NeilBrown	0002b2718d	[PATCH] md: limit size of sb read/written to appropriate amount version-1 superblocks are not (normally) 4K long, and can be of variable size. Writing the full 4K can cause corruption (but only in non-default configurations). With this patch the super-block-flavour can choose a size to read, and set a size to write based on what it finds. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:12 -07:00
NeilBrown	ab904d6346	[PATCH] md: fix bitmap/read_sb_page so that it handles errors properly. read_sb_page() assumed that if sync_page_io fails, the device would be marked faultly. However it isn't. So in the face of error, read_sb_page would loop forever. Redo the logic so that this cannot happen. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	71c0805cb4	[PATCH] md: allow md to load a superblock with feature-bit '1' set As this is used to flag an internal bitmap. Also, introduce symbolic names for feature bits. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	7b1e35f6d6	[PATCH] md: allow hot-adding devices to arrays with non-persistant superblocks. It is possibly (and occasionally useful) to have a raid1 without persistent superblocks. The code in add_new_disk for adding a device to such an array always tries to read a superblock. This will obviously fail. So do the appropriate test and call md_import_device with appropriate args. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	3178b0dbdf	[PATCH] md: do not set mddev->bitmap until bitmap is fully initialised When hot-adding a bitmap, bitmap_daemon_work could get called while the bitmap is being created, so don't set mddev->bitmap until the bitmap is ready. This requires freeing the bitmap inside bitmap_create if creation failed part-way through. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	585f0dd5a9	[PATCH] md: make sure bitmap_daemon_work actually does work. The 'lastrun' time wasn't being initialised, so it could be half a jiffie-cycle before it seemed to be time to do work again. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:11 -07:00
NeilBrown	9e6603da9b	[PATCH] md: raid1_quiesce is back to front, fix it. A state of 0 mean 'not quiesced' A state of 1 means 'is quiesced' The original code got this wrong. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	15945fee6f	[PATCH] md: support md/linear array with components greater than 2 terabytes. linear currently uses division by the size of the smallest componenet device to find which device a request goes to. If that smallest device is larger than 2 terabytes, then the division will not work on some systems. So we introduce a pre-shift, and take care not to make the hash table too large, much like the code in raid0. Also get rid of conf->nr_zones, which is not needed. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	4b6d287f62	[PATCH] md: add write-behind support for md/raid1 If a device is flagged 'WriteMostly' and the array has a bitmap, and the bitmap superblock indicates that write_behind is allowed, then write_behind is enabled for WriteMostly devices. Write requests will be acknowledges as complete to the caller (via b_end_io) when all non-WriteMostly devices have completed the write, but will not be cleared from the bitmap until all devices complete. This requires memory allocation to make a local copy of the data being written. If there is insufficient memory, then we fall-back on normal write semantics. Signed-Off-By: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	8ddf9efe67	[PATCH] md: support write-mostly device in raid1 This allows a device in a raid1 to be marked as "write mostly". Read requests will only be sent if there is no other option. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	36fa30636f	[PATCH] md: all hot-add and hot-remove of md intent logging bitmaps Both file-bitmaps and superblock bitmaps are supported. If you add a bitmap file on the array device, you lose. This introduces a 'default_bitmap_offset' field in mddev, as the ioctl used for adding a superblock bitmap doesn't have room for giving an offset. Later, this value will be setable via sysfs. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:10 -07:00
NeilBrown	6a07997fc3	[PATCH] md: improve handling of bitmap initialisation. When we find a 'stale' bitmap, possibly because it is new, we should just assume every bit needs to be set, but rather base the setting of bits on the current state of the array (degraded and recovery_cp). Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:09 -07:00
NeilBrown	1923b99a0f	[PATCH] md: don't allow new md/bitmap file to be set if one already exists ... otherwise we loose a reference and can never free the file. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:09 -07:00
Jun'ichi Nomura	844e8d904a	[PATCH] dm: fix rh_dec()/rh_inc() race in dm-raid1.c Fix another bug in dm-raid1.c that the dirty region may stay in or be moved to clean list and freed while in use. It happens as follows: CPU0 CPU1 ------------------------------------------------------------------------------ rh_dec() if (atomic_dec_and_test(pending)) <the region is still marked dirty> rh_inc() if the region is clean mark the region dirty and remove from clean list mark the region clean and move to clean list atomic_inc(pending) At this stage, the region is in clean list and will be mistakenly reclaimed by rh_update_states() later. Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 16:39:09 -07:00

1 2 3 4 5 ...

328 Commits