mm: suppress mm fault logging if fatal signal already pending

Commit eda0047296a1 ("mm: make the page fault mmap locking killable") intentionally made it much easier to trigger the "page fault fails because a fatal signal is pending" situation, by having the mmap locking fail early in that case. We have long aborted page faults in other fatal cases when the actual IO for a page is interrupted by SIGKILL - which is particularly useful for the traditional case of NFS hanging due to network issues, but local filesystems could cause it too if you happened to get the SIGKILL while waiting for a page to be faulted in (eg lock_folio_maybe_drop_mmap()). So aborting the page fault wasn't a new condition - but it now triggers earlier, before we even get to 'handle_mm_fault()'. And as a result the error doesn't go through our 'fault_signal_pending()' logic, and doesn't get filtered away there. Normally you'd never even notice, because if a fatal signal is pending, the new SIGSEGV we send ends up being ignored anyway. But it turns out that there is one very noticeable exception: if you enable 'show_unhandled_signals', the aborted page fault will be logged in the kernel messages, and you'll get a scary line looking something like this in your logs: pverados[2183248]: segfault at 55e5a00f9ae0 ip 000055e5a00f9ae0 sp 00007ffc0720bea8 error 14 in perl[55e5a00d4000+195000] likely on CPU 10 (core 4, socket 0) which is rather misleading. It's not really a segfault at all, it's just "the thread was killed before the page fault completed, so we aborted the page fault". Fix this by just making it clear that a pending fatal signal means that any new signal coming in after that is implicitly handled. This will avoid the misleading logging, since now the signal isn't 'unhandled' any more. Reported-and-tested-by: Fiona Ebner <f.ebner@proxmox.com> Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Link: https://lore.kernel.org/lkml/8d063a26-43f5-0bb7-3203-c6a04dc159f8@proxmox.com/ Acked-by: Oleg Nesterov <oleg@redhat.com> Fixes: eda0047296a1 ("mm: make the page fault mmap locking killable") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Linus Torvalds <torvalds@linux-foundation.org> 2023-07-25 09:38:32 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> 2023-07-26 10:51:59 -0700
commit: 5f0bc0b042fc77ff70e14c790abdec960cde4ec1 (patch)
tree: da39e29e21f80634f400cd71572b7a3d48e01b72 /kernel/signal.c
parent: 18b44bc5a67275641fb26f2c54ba7eef80ac5950 (diff)
1 files changed, 4 insertions, 0 deletions
diff --git a/kernel/signal.c b/kernel/signal.c
index b5370fe5c198..128e9bb3d1a2 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -562,6 +562,10 @@ bool unhandled_signal(struct task_struct *tsk, int sig)
 	if (handler != SIG_IGN && handler != SIG_DFL)
 		return false;
 
+	/* If dying, we handle all new signals by ignoring them */
+	if (fatal_signal_pending(tsk))
+		return false;
+
 	/* if ptraced, let the tracer determine */
 	return !tsk->ptrace;
 }
author	Linus Torvalds <torvalds@linux-foundation.org>	2023-07-25 09:38:32 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2023-07-26 10:51:59 -0700
commit	5f0bc0b042fc77ff70e14c790abdec960cde4ec1 (patch)
tree	da39e29e21f80634f400cd71572b7a3d48e01b72 /kernel/signal.c
parent	18b44bc5a67275641fb26f2c54ba7eef80ac5950 (diff)