raid1: prefer disk without bad blocks

If an array consists of two drives and the first drive has a bad block, a read request to a region overlapping the bad block chooses the same disk (the one with the bad block) as the device to read from over and over, and the request gets stuck.

If the request only partially overlaps the bad block on the first disk, that disk becomes a candidate ("best disk") for a shorter range of sectors. The second disk is capable of reading the entire requested range and the sector count (best_good_sectors) is updated accordingly, however the second disk is not recorded as the best device for the request. In the end the request is sent to the first disk to read the entire range of sectors. It fails and is retried a moment later with the same outcome.

This is actually quite a likely scenario, but it had little exposure in my tests until commit 715d40b93b10 ("md/raid1: add failfast handling for reads.") removed the preference for an idle disk. The scenario had been passing because the second disk was always chosen when idle.

Reset the candidate ("best disk") to read from if a disk can read the entire range. Do it only if another disk has already been chosen as a candidate for a smaller range. The head position / disk type logic will then select the best disk to read from. This is fine, as a disk with a bad block won't be considered by that logic.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
author: Tomasz Majchrzak <tomasz.majchrzak@intel.com> 2017-05-12 14:26:10 +0200
committer: Shaohua Li <shli@fb.com> 2017-05-12 14:41:15 -0700
commit: d82dd0e34d0347be201fd274dc84cd645dccc064 (patch)
tree: 4ddf83040aee745295f9ae92d628191b74e34bf8 /drivers/md
parent: 5ddf0440a1a28f00f69ed2e093476bab3b60c2c3 (diff)
1 files changed, 4 insertions, 1 deletions
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a17ed6218d51..af5056d56878 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -666,8 +666,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
 					break;
 			}
 			continue;
-		} else
+		} else {
+			if ((sectors > best_good_sectors) && (best_disk >= 0))
+				best_disk = -1;
 			best_good_sectors = sectors;
+		}
 
 		if (best_disk >= 0)
 			/* At least two disks to choose from so failfast is OK */
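
The effect of the added check can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the kernel code: struct mirror, good_prefix, pick_disk(), reset_candidate and fallback_disk are hypothetical names introduced here, and fallback_disk merely stands in for the head position / disk type logic of read_balance().

```c
#include <assert.h>

/*
 * Hypothetical, simplified model of one mirror: the number of sectors,
 * counted from the start of the request, that can be read before the
 * first bad block. A value >= the request length means the range is clean.
 */
struct mirror {
	int good_prefix;
};

/*
 * Pick a disk for a read of 'sectors' sectors.
 * reset_candidate == 0 models the behaviour before the patch,
 * reset_candidate != 0 models the behaviour with the added check.
 */
static int pick_disk(const struct mirror *m, int ndisks, int sectors,
		     int reset_candidate)
{
	int best_disk = -1;
	int best_good_sectors = 0;
	int fallback_disk = -1;	/* stand-in for head position / disk type choice */
	int disk;

	assert(ndisks > 0 && sectors > 0);

	for (disk = 0; disk < ndisks; disk++) {
		int good = m[disk].good_prefix;

		if (good < sectors) {
			/* Bad block inside the range: candidate for a shorter read only. */
			if (good > best_good_sectors) {
				best_good_sectors = good;
				best_disk = disk;
			}
			continue;
		} else {
			/* Clean disk: drop a candidate chosen for a smaller range. */
			if (reset_candidate && sectors > best_good_sectors &&
			    best_disk >= 0)
				best_disk = -1;
			best_good_sectors = sectors;
		}

		/* Only clean disks reach this point. */
		if (fallback_disk < 0)
			fallback_disk = disk;
	}

	if (best_disk < 0)
		best_disk = fallback_disk;
	return best_disk;
}
```

In this model, with mirrors { 4, 8 } and an 8-sector request, pick_disk(m, 2, 8, 0) returns disk 0, so the full-range read goes to the disk with the bad block, while pick_disk(m, 2, 8, 1) returns disk 1.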