summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-09drm/nouveau/imem: allow bar2 mapping of user allocationsBen Skeggs
Will be used to init client-allocated USERD to default values. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/flcn: show falcon user in debug outputBen Skeggs
Displays both owner/user of the falcon (when they differ), and takes both subdevs' debug levels into account when deciding whether to log the message. - runlist debugging will use one of the alternate macros added here Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/nvkm: add locking to subdev/engine init pathsBen Skeggs
This wasn't really needed before; the main place this could race is with channel recovery, but (through potentially fragile means) shouldn't have been possible. However, a number of upcoming patches benefit from having better control over subdev init, necessitating some improvements here. - allows subdev/engine oneinit() without init() (host/fifo patches) - merges engine use locking/tracking into subdev, and extends it to fix some issues that will arise with future usage patterns (acr patches) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/mc/ga100: switch to using NV_PMC_DEVICE_ENABLEBen Skeggs
- NV_PMC_ENABLE still exists, but we don't touch anything in it yet Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/mc: move NV_PMC_ENABLE bashing to chipset-specific codeBen Skeggs
Ampere needs different handling here, most of what we touch has moved. We probably want to refactor these interfaces in general, but I'm not yet sure how they should look, this will get the job done for now. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/mc: implement intr handling on top of nvkm_intrBen Skeggs
- new-style handlers can now be used here too - decent clean-up Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/fault/ga100: initial supportBen Skeggs
TU102 implementation should be OK for Ampere now. v2. fixup for ga103 early merge Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09drm/nouveau/fault/tu102: switch to explicit intr handlersBen Skeggs
- reads vectors from HW, rather than being hardcoded - removes hacks to support routing via old interfaces Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/vfn/tu102-: support new-style interrupt treeBen Skeggs
- switches ampere over now, and removes its hack mc implementation Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/vfn: move NV_USERMODE class from hostBen Skeggs
- uses proper class IDs for Turing/Ampere Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/vfn: add stub subdev for dev_funcBen Skeggs
Initially for NV_USERMODE class, and Turing/Ampere's new interrupt tree. v2. fixup for ga103 early merge Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09drm/nouveau/intr: add nvkm_subdev_intr() compatibilityBen Skeggs
It's quite a lot of tedious and error-prone work to switch over all the subdevs at once, so allow an nvkm_intr to request new-style handlers to be created that wrap the existing interfaces. This will allow a more gradual transition. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/intr: support multiple trees, and explicit interfacesBen Skeggs
Turing adds a second top-level interrupt tree in HW, in addition to the trees available via NV_PMC. Most of the interrupts we care about are exposed in both trees, but not all of them, and we have some rather nasty hacks to route the fault buffer interrupts. Ampere removes the NV_PMC trees entirely. Here we add some infrastructure to be able to handle all of this more cleanly, as well as providing more explicit control over handlers. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/intr: add shared interrupt plumbing between pci/tegraBen Skeggs
Unifies the handling between PCI-based and Tegra GPUs, and makes more explicit/obvious where device interrupts can be expected. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/top: parse device topology right after devinitBen Skeggs
We're going to want this information available earlier than it is now. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/nvkm: give each nvkm_event its own lockdep classBen Skeggs
The vblank and nonstall events have some annoying interactions with DRM locking, and aren't able to do certain things as a result. However, other uses of event notifications don't have such requirements, and upcoming patches take advantage of this for various improvements. Having separate classes for each nvkm_event's spinlocks allows lockdep to distinguish between them and avoid false-positives. v2: __always_inline + comment Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/kms: switch to drm fbdev helpersBen Skeggs
This removes support for accelerated fbcon rendering, and fixes a number of races/crashes/issues around suspend/resume/module unload etc. Losing HW accelerated rendering isn't ideal, but it's been significantly reduced in performance since the removal of accelerated scrolling in the kernel anyway - not to mention, can be racey (skips cpu<->gpu sync) from certain contexts. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2022-11-09drm/nouveau/nvkm: rip out old notifyBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/fifo: expose channel killed in host channel event classBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/fifo: expose non-stall intr in host channel event classBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: expose page flip event classBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: expose conn event classBen Skeggs
This removes some now-unnecessary nesting of workqueues. v2: - use ?: (lyude) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: expose head event classBen Skeggs
Also fixes vblank interrupts being left enabled when they're not meant to be as a result of races/bugs in previous event handling code. v2: - use ?: (lyude) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: switch vblank semaphore release to nvkm_event_ntfyBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/fault: expose replayable fault buffer event classBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/fault: switch non-replayable faults to nvkm_event_ntfyBen Skeggs
v2: fix flush_work() being called uninitialised during init Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/nvkm: add a replacement for nvkm_notifyBen Skeggs
This replaces the twisty, confusing, relationship between nvkm_event and nvkm_notify with something much simpler, and less racey. It also places events in the object tree hierarchy, which will allow a heap of the code tracking events across allocation/teardown/suspend to be removed. This commit just adds the new interfaces, and passes the owning subdev to the event constructor to enable debug-tracing in the new code. v2: - use ?: (lyude) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: move head scanoutpos methodBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: add head classBen Skeggs
v2: remove extra whitespace Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: move DP MST payload config methodBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/disp: add method to trigger DP link retrainBen Skeggs
This moves control of link retraining in response to HPD IRQ to the KMS driver's HPD IRQ handler. NVKM still handles checking link status for the moment, this can be moved to the KMS driver when it takes explicit control of link rate selection. v2: - skip source config on retrain (fixes some retrain failures) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/kms: pass event mask to hpd handlerBen Skeggs
Will be moving the DP link status check / re-train here so it's safe from racing with modeset routing changes. MST message handling etc. will remain where it is. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm/nouveau/kms: switch hpd_lock from mutex to spinlockBen Skeggs
There's no good reason for this to be a mutex, and once the layers of workqueues have been untangled, nouveau_connector_hpd() can be called from IRQ context and won't be able to take a mutex. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
2022-11-09drm: lcdif: Set and enable FIFO Panic thresholdMarek Vasut
In case the LCDIFv3 is used to drive a 4k panel via i.MX8MP HDMI bridge, the LCDIFv3 becomes susceptible to FIFO underflows, these lead to nasty flicker of the image on the panel, or image being shifted by half frame horizontally every second frame. The flicker can be easily triggered by running 3D application on top of weston compositor, like neverball or chromium. Surprisingly glmark2-es2-wayland or glmark2-es2-drm does not trigger this effect so easily. Configure the FIFO Panic threshold register and enable the FIFO Panic mode, which internally boosts the NoC interconnect priority for LCDIFv3 transactions in case of possible underflow. This mitigates the flicker effect on 4k panels as well. Fixes: 9db35bb349a0 ("drm: lcdif: Add support for i.MX8MP LCDIF variant") Signed-off-by: Marek Vasut <marex@denx.de> Tested-by: Liu Ying <victor.liu@nxp.com> # i.MX8mp EVK Reviewed-by: Liu Ying <victor.liu@nxp.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221101152629.21768-1-marex@denx.de
2022-11-08docs: kmsan: fix formatting of "Example report"Alexander Potapenko
Add a blank line to make the sentence before the list render as a separate paragraph, not a definition. Link: https://lkml.kernel.org/r/20221107142255.4038811-1-glider@google.com Fixes: 93858ae70cf4 ("kmsan: add ReST documentation") Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08mm/damon/dbgfs: check if rm_contexts input is for a real contextSeongJae Park
A user could write a name of a file under 'damon/' debugfs directory, which is not a user-created context, to 'rm_contexts' file. In the case, 'dbgfs_rm_context()' just assumes it's the valid DAMON context directory only if a file of the name exist. As a result, invalid memory access could happen as below. Fix the bug by checking if the given input is for a directory. This check can filter out non-context inputs because directories under 'damon/' debugfs directory can be created via only 'mk_contexts' file. This bug has found by syzbot[1]. [1] https://lore.kernel.org/damon/000000000000ede3ac05ec4abf8e@google.com/ Link: https://lkml.kernel.org/r/20221107165001.5717-2-sj@kernel.org Fixes: 75c1c2b53c78 ("mm/damon/dbgfs: support multiple contexts") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: syzbot+6087eafb76a94c4ac9eb@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> [5.15.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08maple_tree: don't set a new maximum on the node when not reusing nodesLiam Howlett
In RCU mode, the node limits were being updated to the last pivot which may not be correct and would cause the metadata to be set when it shouldn't. Fix this by not setting a new limit in this case. Link: https://lkml.kernel.org/r/20221107163857.867377-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08maple_tree: fix depth tracking in maple_stateLiam Howlett
It is possible to confuse the depth tracking in the maple state by searching the same node for values. Fix the depth tracking by moving where the depth is incremented closer to where the node changes level. Also change the initial depth setting when using the root node. Link: https://lkml.kernel.org/r/20221107163814.866612-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08arch/x86/mm/hugetlbpage.c: pud_huge() returns 0 when using 2-level pagingNaoya Horiguchi
The following bug is reported to be triggered when starting X on x86-32 system with i915: [ 225.777375] kernel BUG at mm/memory.c:2664! [ 225.777391] invalid opcode: 0000 [#1] PREEMPT SMP [ 225.777405] CPU: 0 PID: 2402 Comm: Xorg Not tainted 6.1.0-rc3-bdg+ #86 [ 225.777415] Hardware name: /8I865G775-G, BIOS F1 08/29/2006 [ 225.777421] EIP: __apply_to_page_range+0x24d/0x31c [ 225.777437] Code: ff ff 8b 55 e8 8b 45 cc e8 0a 11 ec ff 89 d8 83 c4 28 5b 5e 5f 5d c3 81 7d e0 a0 ef 96 c1 74 ad 8b 45 d0 e8 2d 83 49 00 eb a3 <0f> 0b 25 00 f0 ff ff 81 eb 00 00 00 40 01 c3 8b 45 ec 8b 00 e8 76 [ 225.777446] EAX: 00000001 EBX: c53a3b58 ECX: b5c00000 EDX: c258aa00 [ 225.777454] ESI: b5c00000 EDI: b5900000 EBP: c4b0fdb4 ESP: c4b0fd80 [ 225.777462] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010202 [ 225.777470] CR0: 80050033 CR2: b5900000 CR3: 053a3000 CR4: 000006d0 [ 225.777479] Call Trace: [ 225.777486] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.777684] apply_to_page_range+0x21/0x27 [ 225.777694] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.777870] remap_io_mapping+0x49/0x75 [i915] [ 225.778046] ? i915_memcpy_init_early+0x63/0x63 [i915] [ 225.778220] ? mutex_unlock+0xb/0xd [ 225.778231] ? i915_vma_pin_fence+0x6d/0xf7 [i915] [ 225.778420] vm_fault_gtt+0x2a9/0x8f1 [i915] [ 225.778644] ? lock_is_held_type+0x56/0xe7 [ 225.778655] ? lock_is_held_type+0x7a/0xe7 [ 225.778663] ? 0xc1000000 [ 225.778670] __do_fault+0x21/0x6a [ 225.778679] handle_mm_fault+0x708/0xb21 [ 225.778686] ? mt_find+0x21e/0x5ae [ 225.778696] exc_page_fault+0x185/0x705 [ 225.778704] ? doublefault_shim+0x127/0x127 [ 225.778715] handle_exception+0x130/0x130 [ 225.778723] EIP: 0xb700468a Recently pud_huge() got aware of non-present entry by commit 3a194f3f8ad0 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") to handle some special states of gigantic page. However, it's overlooked that pud_none() always returns false when running with 2-level paging, and as a result pud_huge() can return true pointlessly. Introduce "#if CONFIG_PGTABLE_LEVELS > 2" to pud_huge() to deal with this. Link: https://lkml.kernel.org/r/20221107021010.2449306-1-naoya.horiguchi@linux.dev Fixes: 3a194f3f8ad0 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry") Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08fs: fix leaked psi pressure stateJohannes Weiner
When psi annotations were added to to btrfs compression reads, the psi state tracking over add_ra_bio_pages and btrfs_submit_compressed_read was faulty. A pressure state, once entered, is never left. This results in incorrectly elevated pressure, which triggers OOM kills. pflags record the *previous* memstall state when we enter a new one. The code tried to initialize pflags to 1, and then optimize the leave call when we either didn't enter a memstall, or were already inside a nested stall. However, there can be multiple PageWorkingset pages in the bio, at which point it's that path itself that enters repeatedly and overwrites pflags. This causes us to miss the exit. Enter the stall only once if needed, then unwind correctly. erofs has the same problem, fix that up too. And move the memstall exit past submit_bio() to restore submit accounting originally added by b8e24a9300b0 ("block: annotate refault stalls from IO submission"). Link: https://lkml.kernel.org/r/Y2UHRqthNUwuIQGS@cmpxchg.org Fixes: 4088a47e78f9 ("btrfs: add manual PSI accounting for compressed reads") Fixes: 99486c511f68 ("erofs: add manual PSI accounting for the compressed address space") Fixes: 118f3663fbc6 ("block: remove PSI accounting from the bio layer") Link: https://lore.kernel.org/r/d20a0a85-e415-cf78-27f9-77dd7a94bc8d@leemhuis.info/ Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Sterba <dsterba@suse.com> Cc: Gao Xiang <xiang@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08nilfs2: fix use-after-free bug of ns_writer on remountRyusuke Konishi
If a nilfs2 filesystem is downgraded to read-only due to metadata corruption on disk and is remounted read/write, or if emergency read-only remount is performed, detaching a log writer and synchronizing the filesystem can be done at the same time. In these cases, use-after-free of the log writer (hereinafter nilfs->ns_writer) can happen as shown in the scenario below: Task1 Task2 -------------------------------- ------------------------------ nilfs_construct_segment nilfs_segctor_sync init_wait init_waitqueue_entry add_wait_queue schedule nilfs_remount (R/W remount case) nilfs_attach_log_writer nilfs_detach_log_writer nilfs_segctor_destroy kfree finish_wait _raw_spin_lock_irqsave __raw_spin_lock_irqsave do_raw_spin_lock debug_spin_lock_before <-- use-after-free While Task1 is sleeping, nilfs->ns_writer is freed by Task2. After Task1 waked up, Task1 accesses nilfs->ns_writer which is already freed. This scenario diagram is based on the Shigeru Yoshida's post [1]. This patch fixes the issue by not detaching nilfs->ns_writer on remount so that this UAF race doesn't happen. Along with this change, this patch also inserts a few necessary read-only checks with superblock instance where only the ns_writer pointer was used to check if the filesystem is read-only. Link: https://syzkaller.appspot.com/bug?id=79a4c002e960419ca173d55e863bd09e8112df8b Link: https://lkml.kernel.org/r/20221103141759.1836312-1-syoshida@redhat.com [1] Link: https://lkml.kernel.org/r/20221104142959.28296-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+f816fa82f8783f7a02bb@syzkaller.appspotmail.com Reported-by: Shigeru Yoshida <syoshida@redhat.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08x86/traps: avoid KMSAN bugs originating from handle_bug()Alexander Potapenko
There is a case in exc_invalid_op handler that is executed outside the irqentry_enter()/irqentry_exit() region when an UD2 instruction is used to encode a call to __warn(). In that case the `struct pt_regs` passed to the interrupt handler is never unpoisoned by KMSAN (this is normally done in irqentry_enter()), which leads to false positives inside handle_bug(). Use kmsan_unpoison_entry_regs() to explicitly unpoison those registers before using them. Link: https://lkml.kernel.org/r/20221102110611.1085175-5-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08kmsan: make sure PREEMPT_RT is offAlexander Potapenko
As pointed out by Peter Zijlstra, __msan_poison_alloca() does not play well with IRQ code when PREEMPT_RT is on, because in that mode even GFP_ATOMIC allocations cannot be performed. Fixing this would require making stackdepot completely lockless, which is quite challenging and may be excessive for the time being. Instead, make sure KMSAN is incompatible with PREEMPT_RT, like other debug configs are. Link: https://lkml.kernel.org/r/20221102110611.1085175-4-glider@google.com Link: https://lore.kernel.org/lkml/20221025221755.3810809-1-glider@google.com/ Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08Kconfig.debug: ensure early check for KMSAN in CONFIG_KMSAN_WARNAlexander Potapenko
As pointed out by Masahiro Yamada, Kconfig picks up the first default entry which has true 'if' condition. Hence, the previously added check for KMSAN was never used, because it followed the checks for 64BIT and !64BIT. Put KMSAN check before others to ensure it is always applied. Link: https://lkml.kernel.org/r/20221102110611.1085175-3-glider@google.com Link: https://github.com/google/kmsan/issues/89 Link: https://lore.kernel.org/linux-mm/20221024212144.2852069-3-glider@google.com/ Fixes: 921757bc9b61 ("Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default") Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08x86/uaccess: instrument copy_from_user_nmi()Alexander Potapenko
Make sure usercopy hooks from linux/instrumented.h are invoked for copy_from_user_nmi(). This fixes KMSAN false positives reported when dumping opcodes for a stack trace. Link: https://lkml.kernel.org/r/20221102110611.1085175-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08kmsan: core: kmsan_in_runtime() should return true in NMI contextAlexander Potapenko
Without that, every call to __msan_poison_alloca() in NMI may end up allocating memory, which is NMI-unsafe. Link: https://lkml.kernel.org/r/20221102110611.1085175-1-glider@google.com Link: https://lore.kernel.org/lkml/20221025221755.3810809-1-glider@google.com/ Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08mm: hugetlb_vmemmap: include missing linux/moduleparam.hVasily Gorbik
The kernel test robot reported build failures with a 'randconfig' on s390: >> mm/hugetlb_vmemmap.c:421:11: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0); ^ Link: https://lore.kernel.org/linux-mm/202210300751.rG3UDsuc-lkp@intel.com/ Link: https://lkml.kernel.org/r/patch.git-296b83ca939b.your-ad-here.call-01667411912-ext-5073@work.hours Fixes: 30152245c63b ("mm: hugetlb_vmemmap: replace early_param() with core_param()") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08mm/shmem: use page_mapping() to detect page cache for uffd continuePeter Xu
mfill_atomic_install_pte() checks page->mapping to detect whether one page is used in the page cache. However as pointed out by Matthew, the page can logically be a tail page rather than always the head in the case of uffd minor mode with UFFDIO_CONTINUE. It means we could wrongly install one pte with shmem thp tail page assuming it's an anonymous page. It's not that clear even for anonymous page, since normally anonymous pages also have page->mapping being setup with the anon vma. It's safe here only because the only such caller to mfill_atomic_install_pte() is always passing in a newly allocated page (mcopy_atomic_pte()), whose page->mapping is not yet setup. However that's not extremely obvious either. For either of above, use page_mapping() instead. Link: https://lkml.kernel.org/r/Y2K+y7wnhC4vbnP2@x1n Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08mm/memremap.c: map FS_DAX device memory as decryptedPankaj Gupta
virtio_pmem use devm_memremap_pages() to map the device memory. By default this memory is mapped as encrypted with SEV. Guest reboot changes the current encryption key and guest no longer properly decrypts the FSDAX device meta data. Mark the corresponding device memory region for FSDAX devices (mapped with memremap_pages) as decrypted to retain the persistent memory property. Link: https://lkml.kernel.org/r/20221102160728.3184016-1-pankaj.gupta@amd.com Fixes: b7b3c01b19159 ("mm/memremap_pages: support multiple ranges per invocation") Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08Partly revert "mm/thp: carry over dirty bit when thp splits on pmd"Peter Xu
Anatoly Pugachev reported sparc64 breakage on the patch: https://lore.kernel.org/r/20221021160603.GA23307@u164.east.ru The sparc64 impl of pte_mkdirty() is definitely slightly special in that it leverages a code patching mechanism for sun4u/sun4v on relevant pgtable entry operations. Before having a clue of why the sparc64 is special and caused the patch to SIGSEGV the processes, revert the patch for now. The swap path of dirty bit inheritage is kept because that's using the swap shared code so we assume it'll not be affected. Link: https://lkml.kernel.org/r/Y1Wbi4yyVvDtg4zN@x1n Fixes: 0ccf7f168e17 ("mm/thp: carry over dirty bit when thp splits on pmd") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Anatoly Pugachev <matorola@gmail.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Minchan Kim <minchan@kernel.org> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>