summaryrefslogtreecommitdiff
path: root/arch/powerpc
diff options
context:
space:
mode:
authorMichael Neuling <mikey@neuling.org>2017-09-22 13:32:21 +1000
committerMichael Ellerman <mpe@ellerman.id.au>2017-09-26 21:01:59 +1000
commitd8bd9f3f0925d22726de159531bfe3774b5cacc6 (patch)
treee0f5525c741b273c011ab2968a21344266bcb71a /arch/powerpc
parente19b205be43d11bff638cad4487008c48d21c103 (diff)
powerpc: Handle MCE on POWER9 with only DSISR bit 30 set
On POWER9 DD2.1 and below, it's possible for a paste instruction to cause a Machine Check Exception (MCE) where only DSISR bit 30 (IBM 33) is set. This will result in the MCE handler seeing an unknown event, which triggers linux to crash. We change this by detecting unknown events caused by load/stores in the MCE handler and marking them as handled so that we no longer crash. An MCE that occurs like this is spurious, so we don't need to do anything in terms of servicing it. If there is something that needs to be serviced, the CPU will raise the MCE again with the correct DSISR so that it can be serviced properly. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com Acked-by: Balbir Singh <bsingharora@gmail.com> [mpe: Expand comment with details from change log, use normal bit #s] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Diffstat (limited to 'arch/powerpc')
-rw-r--r--arch/powerpc/kernel/mce_power.c13
1 files changed, 13 insertions, 0 deletions
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index b76ca198e09c..f523125b9d34 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -624,5 +624,18 @@ long __machine_check_early_realmode_p8(struct pt_regs *regs)
long __machine_check_early_realmode_p9(struct pt_regs *regs)
{
+ /*
+ * On POWER9 DD2.1 and below, it's possible to get a machine check
+ * caused by a paste instruction where only DSISR bit 30 is set. This
+ * will result in the MCE handler seeing an unknown event and the kernel
+ * crashing. An MCE that occurs like this is spurious, so we don't need
+ * to do anything in terms of servicing it. If there is something that
+ * needs to be serviced, the CPU will raise the MCE again with the
+ * correct DSISR so that it can be serviced properly. So detect this
+ * case and mark it as handled.
+ */
+ if (SRR1_MC_LOADSTORE(regs->msr) && regs->dsisr == 0x40000000)
+ return 1;
+
return mce_handle_error(regs, mce_p9_derror_table, mce_p9_ierror_table);
}