md: avoid endless recovery loop when waiting for failed device to complete.
If a device fails in a way that causes pending requests to take a while to complete, md will not be able to immediately remove it from the array in remove_and_add_spares. It will then incorrectly look like a spare device, and md will try to recover it even though it is failed. This leads to a recovery process starting and instantly aborting over and over again.

We should check if the device is faulty before considering it to be a spare. This will avoid trying to start a recovery that cannot proceed.

This bug was introduced in 2.6.26, so this patch is suitable for any kernel since then.

Cc: stable@kernel.org
Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
parent 2992c4bd57
commit 4274215d24
@@ -7088,6 +7088,7 @@ static int remove_and_add_spares(mddev_t *mddev)
 	list_for_each_entry(rdev, &mddev->disks, same_set) {
 		if (rdev->raid_disk >= 0 &&
 		    !test_bit(In_sync, &rdev->flags) &&
+		    !test_bit(Faulty, &rdev->flags) &&
 		    !test_bit(Blocked, &rdev->flags))
 			spares++;
 		if (rdev->raid_disk < 0
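The effect of the extra Faulty test on the spare count can be illustrated outside the kernel. The following is a minimal userspace sketch, not the md code itself: struct mock_rdev, the MOCK_* flag values, and the check_faulty switch are all hypothetical stand-ins invented for illustration, and the real code tests bits in rdev->flags with test_bit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's rdev flag bits. */
enum mock_flag {
    MOCK_IN_SYNC = 1 << 0,
    MOCK_FAULTY  = 1 << 1,
    MOCK_BLOCKED = 1 << 2,
};

/* Hypothetical, heavily simplified model of an md member device. */
struct mock_rdev {
    int raid_disk;       /* slot in the array, or -1 if unassigned */
    unsigned int flags;  /* bitwise OR of enum mock_flag values */
};

/*
 * Mirrors the condition in the patched loop: a device counts as a spare
 * to recover only if it holds a slot and is not in sync, not blocked,
 * and (when check_faulty is set, as after the patch) not faulty.
 */
bool counts_as_spare(const struct mock_rdev *rdev, bool check_faulty)
{
    if (rdev->raid_disk < 0)
        return false;
    if (rdev->flags & MOCK_IN_SYNC)
        return false;
    if (check_faulty && (rdev->flags & MOCK_FAULTY))
        return false;
    if (rdev->flags & MOCK_BLOCKED)
        return false;
    return true;
}

/* Counts the devices that would be treated as recoverable spares. */
int count_spares(const struct mock_rdev *devs, size_t n, bool check_faulty)
{
    int spares = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (counts_as_spare(&devs[i], check_faulty))
            spares++;
    return spares;
}
```

In this model, a failed device that still holds its slot (raid_disk >= 0, Faulty set, In_sync clear) is counted when check_faulty is false and skipped when it is true. A set containing only such a device therefore yields a spare count of zero with the check enabled, so no recovery would be attempted.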