summaryrefslogtreecommitdiff
path: root/drivers/nvme
diff options
context:
space:
mode:
authorSagi Grimberg <sagi@grimberg.me>2020-06-26 10:46:29 -0700
committerChristoph Hellwig <hch@lst.de>2020-07-02 10:38:00 +0200
commitea43d9709f727e728e933a8157a7a7ca1a868281 (patch)
tree30d71119a2dfbeafcb6a5f7dfae1400be5bd3894 /drivers/nvme
parente7eea44eefbdd5f0345a0a8b80a3ca1c21030d06 (diff)
nvme: fix identify error status silent ignore
Commit 59c7c3caaaf8 intended to only silently ignore non retry-able errors (DNR bit set) such that we can still identify misbehaving controllers, and in the other hand propagate retry-able errors (DNR bit cleared) so we don't wrongly abandon a namespace just because it happens to be temporarily inaccessible. The goal remains the same as the original commit where this was introduced but unfortunately had the logic backwards. Fixes: 59c7c3caaaf8 ("nvme: fix possible hang when ns scanning fails during error recovery") Reported-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
Diffstat (limited to 'drivers/nvme')
-rw-r--r--drivers/nvme/host/core.c12
1 files changed, 9 insertions, 3 deletions
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 28f4388c1337..8410d03b940d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1116,10 +1116,16 @@ static int nvme_identify_ns_descs(struct nvme_ctrl *ctrl, unsigned nsid,
dev_warn(ctrl->device,
"Identify Descriptors failed (%d)\n", status);
/*
- * Don't treat an error as fatal, as we potentially already
- * have a NGUID or EUI-64.
+ * Don't treat non-retryable errors as fatal, as we potentially
+ * already have a NGUID or EUI-64. If we failed with DNR set,
+ * we want to silently ignore the error as we can still
+ * identify the device, but if the status has DNR set, we want
+ * to propagate the error back specifically for the disk
+ * revalidation flow to make sure we don't abandon the
+ * device just because of a temporal retry-able error (such
+ * as path of transport errors).
*/
- if (status > 0 && !(status & NVME_SC_DNR))
+ if (status > 0 && (status & NVME_SC_DNR))
status = 0;
goto free_data;
}