summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2022-11-30KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't validSean Christopherson
Skip the WRMSR fastpath in SVM's VM-Exit handler if the next RIP isn't valid, e.g. because KVM is running with nrips=false. SVM must decode and emulate to skip the WRMSR if the CPU doesn't provide the next RIP. Getting the instruction bytes to decode the WRMSR requires reading guest memory, which in turn means dereferencing memslots, and that isn't safe because KVM doesn't hold SRCU when the fastpath runs. Don't bother trying to enable the fastpath for this case, e.g. by doing only the WRMSR and leaving the "skip" until later. NRIPS is supported on all modern CPUs (KVM has considered making it mandatory), and the next RIP will be valid the vast, vast majority of the time. ============================= WARNING: suspicious RCU usage 6.0.0-smp--4e557fcd3d80-skip #13 Tainted: G O ----------------------------- include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by stable/206475: #0: ffff9d9dfebcc0f0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8b/0x620 [kvm] stack backtrace: CPU: 152 PID: 206475 Comm: stable Tainted: G O 6.0.0-smp--4e557fcd3d80-skip #13 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022 Call Trace: <TASK> dump_stack_lvl+0x69/0xaa dump_stack+0x10/0x12 lockdep_rcu_suspicious+0x11e/0x130 kvm_vcpu_gfn_to_memslot+0x155/0x190 [kvm] kvm_vcpu_gfn_to_hva_prot+0x18/0x80 [kvm] paging64_walk_addr_generic+0x183/0x450 [kvm] paging64_gva_to_gpa+0x63/0xd0 [kvm] kvm_fetch_guest_virt+0x53/0xc0 [kvm] __do_insn_fetch_bytes+0x18b/0x1c0 [kvm] x86_decode_insn+0xf0/0xef0 [kvm] x86_emulate_instruction+0xba/0x790 [kvm] kvm_emulate_instruction+0x17/0x20 [kvm] __svm_skip_emulated_instruction+0x85/0x100 [kvm_amd] svm_skip_emulated_instruction+0x13/0x20 [kvm_amd] handle_fastpath_set_msr_irqoff+0xae/0x180 [kvm] svm_vcpu_run+0x4b8/0x5a0 [kvm_amd] vcpu_enter_guest+0x16ca/0x22f0 [kvm] kvm_arch_vcpu_ioctl_run+0x39d/0x900 [kvm] kvm_vcpu_ioctl+0x538/0x620 [kvm] __se_sys_ioctl+0x77/0xc0 __x64_sys_ioctl+0x1d/0x20 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 404d5d7bff0d ("KVM: X86: Introduce more exit_fastpath_completion enum values") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930234031.1732249-1-seanjc@google.com
2022-11-30KVM: x86: Fail emulation during EMULTYPE_SKIP on any exceptionSean Christopherson
Treat any exception during instruction decode for EMULTYPE_SKIP as a "full" emulation failure, i.e. signal failure instead of queuing the exception. When decoding purely to skip an instruction, KVM and/or the CPU has already done some amount of emulation that cannot be unwound, e.g. on an EPT misconfig VM-Exit KVM has already processeed the emulated MMIO. KVM already does this if a #UD is encountered, but not for other exceptions, e.g. if a #PF is encountered during fetch. In SVM's soft-injection use case, queueing the exception is particularly problematic as queueing exceptions while injecting events can put KVM into an infinite loop due to bailing from VM-Enter to service the newly pending exception. E.g. multiple warnings to detect such behavior fire: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9873 kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Not tainted 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x1de5/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1017 at arch/x86/kvm/x86.c:9987 kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Modules linked in: kvm_amd ccp kvm irqbypass CPU: 3 PID: 1017 Comm: svm_nested_soft Tainted: G W 6.0.0-rc1+ #220 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x12a3/0x20a0 [kvm] Call Trace: kvm_vcpu_ioctl+0x223/0x6d0 [kvm] __x64_sys_ioctl+0x85/0xc0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 ---[ end trace 0000000000000000 ]--- Fixes: 6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220930233632.1725475-1-seanjc@google.com
2022-11-30KVM: x86: Keep the lock order consistent between SRCU and gpc spinlockPeng Hao
Acquire SRCU before taking the gpc spinlock in wait_pending_event() so as to be consistent with all other functions that acquire both locks. It's not illegal to acquire SRCU inside a spinlock, nor is there deadlock potential, but in general it's preferable to order locks from least restrictive to most restrictive, e.g. if wait_pending_event() needed to sleep for whatever reason, it could do so while holding SRCU, but would need to drop the spinlock. Signed-off-by: Peng Hao <flyingpeng@tencent.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/CAPm50a++Cb=QfnjMZ2EnCj-Sb9Y4UM-=uOEtHAcjnNLCAAf-dQ@mail.gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-30s390/mm: use pmd_pgtable_page() helper in __gmap_segment_gaddr()Anshuman Khandual
In __gmap_segment_gaddr() pmd level page table page is being extracted from the pmd pointer, similar to pmd_pgtable_page() implementation. This reduces some redundancy by directly using pmd_pgtable_page() instead, though first making it available. Link: https://lkml.kernel.org/r/20221125034502.1559986-1-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm/uffd: sanity check write bit for uffd-wp protected ptesPeter Xu
Let's add one sanity check for CONFIG_DEBUG_VM on the write bit in whatever chance we have when walking through the pgtables. It can bring the error earlier even before the app notices the data was corrupted on the snapshot. Also it helps us to identify this is a wrong pgtable setup, so hopefully a great information to have for debugging too. Link: https://lkml.kernel.org/r/20221114000447.1681003-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: delay page_remove_rmap() until after the TLB has been flushedLinus Torvalds
When we remove a page table entry, we are very careful to only free the page after we have flushed the TLB, because other CPUs could still be using the page through stale TLB entries until after the flush. However, we have removed the rmap entry for that page early, which means that functions like folio_mkclean() would end up not serializing with the page table lock because the page had already been made invisible to rmap. And that is a problem, because while the TLB entry exists, we could end up with the following situation: (a) one CPU could come in and clean it, never seeing our mapping of the page (b) another CPU could continue to use the stale and dirty TLB entry and continue to write to said page resulting in a page that has been dirtied, but then marked clean again, all while another CPU might have dirtied it some more. End result: possibly lost dirty data. This extends our current TLB gather infrastructure to optionally track a "should I do a delayed page_remove_rmap() for this page after flushing the TLB". It uses the newly introduced 'encoded page pointer' to do that without having to keep separate data around. Note, this is complicated by a couple of issues: - we want to delay the rmap removal, but not past the page table lock, because that simplifies the memcg accounting - only SMP configurations want to delay TLB flushing, since on UP there are obviously no remote TLBs to worry about, and the page table lock means there are no preemption issues either - s390 has its own mmu_gather model that doesn't delay TLB flushing, and as a result also does not want the delayed rmap. As such, we can treat S390 like the UP case and use a common fallback for the "no delays" case. - we can track an enormous number of pages in our mmu_gather structure, with MAX_GATHER_BATCH_COUNT batches of MAX_TABLE_BATCH pages each, all set up to be approximately 10k pending pages. We do not want to have a huge number of batched pages that we then need to check for delayed rmap handling inside the page table lock. Particularly that last point results in a noteworthy detail, where the normal page batch gathering is limited once we have delayed rmaps pending, in such a way that only the last batch (the so-called "active batch") in the mmu_gather structure can have any delayed entries. NOTE! While the "possibly lost dirty data" sounds catastrophic, for this all to happen you need to have a user thread doing either madvise() with MADV_DONTNEED or a full re-mmap() of the area concurrently with another thread continuing to use said mapping. So arguably this is about user space doing crazy things, but from a VM consistency standpoint it's better if we track the dirty bit properly even when user space goes off the rails. [akpm@linux-foundation.org: fix UP build, per Linus] Link: https://lore.kernel.org/all/B88D3073-440A-41C7-95F4-895D3F657EF2@gmail.com/ Link: https://lkml.kernel.org/r/20221109203051.1835763-4-torvalds@linux-foundation.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Hugh Dickins <hughd@google.com> Reported-by: Nadav Amit <nadav.amit@gmail.com> Tested-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: mmu_gather: prepare to gather encoded page pointers with flagsLinus Torvalds
This is purely a preparatory patch that makes all the data structures ready for encoding flags with the mmu_gather page pointers. The code currently always sets the flag to zero and doesn't use it yet, but now it's tracking the type state along. The next step will be to actually start using it. Link: https://lkml.kernel.org/r/20221109203051.1835763-3-torvalds@linux-foundation.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: remove unused savedwrite infrastructureDavid Hildenbrand
NUMA hinting no longer uses savedwrite, let's rip it out. ... and while at it, drop __pte_write() and __pmd_write() on ppc64. Link: https://lkml.kernel.org/r/20221108174652.198904-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nadav Amit <namit@vmware.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30KVM: VMX: Resume guest immediately when injecting #GP on ECREATESean Christopherson
Resume the guest immediately when injecting a #GP on ECREATE due to an invalid enclave size, i.e. don't attempt ECREATE in the host. The #GP is a terminal fault, e.g. skipping the instruction if ECREATE is successful would result in KVM injecting #GP on the instruction following ECREATE. Fixes: 70210c044b4e ("KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions") Cc: stable@vger.kernel.org Cc: Kai Huang <kai.huang@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20220930233132.1723330-1-seanjc@google.com
2022-11-30Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A set of clk driver fixes that resolve issues for various SoCs. Most of these are incorrect clk data, like bad parent descriptions. When the clk tree is improperly described things don't work, like USB and UFS controllers, because clk frequencies are wonky. Here are the extra details: - Fix the parent of UFS reference clks on Qualcomm SC8280XP so that UFS works properly - Fix the clk ID for USB on AT91 RM9200 so the USB driver continues to probe - Stop using of_device_get_match_data() on the wrong device for a Samsung Exynos driver so it gets the proper clk data - Fix ExynosAutov9 binding - Fix the parent of the div4 clk on Exynos7885 - Stop calling runtime PM APIs from the Qualcomm GDSC driver directly as it leads to a lockdep splat and is just plain wrong because it violates runtime PM semantics by calling runtime PM APIs when the device has been runtime PM disabled" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks ARM: at91: rm9200: fix usb device clock id clk: samsung: Revert "clk: samsung: exynos-clkout: Use of_device_get_match_data()" dt-bindings: clock: exynosautov9: fix reference to CMU_FSYS1 clk: qcom: gdsc: Remove direct runtime PM calls clk: samsung: exynos7885: Correct "div4" clock parents
2022-12-01mm, slob: rename CONFIG_SLOB to CONFIG_SLOB_DEPRECATEDVlastimil Babka
As explained in [1], we would like to remove SLOB if possible. - There are no known users that need its somewhat lower memory footprint so much that they cannot handle SLUB (after some modifications by the previous patches) instead. - It is an extra maintenance burden, and a number of features are incompatible with it. - It blocks the API improvement of allowing kfree() on objects allocated via kmem_cache_alloc(). As the first step, rename the CONFIG_SLOB option in the slab allocator configuration choice to CONFIG_SLOB_DEPRECATED. Add CONFIG_SLOB depending on CONFIG_SLOB_DEPRECATED as an internal option to avoid code churn. This will cause existing .config files and defconfigs with CONFIG_SLOB=y to silently switch to the default (and recommended replacement) SLUB, while still allowing SLOB to be configured by anyone that notices and needs it. But those should contact the slab maintainers and linux-mm@kvack.org as explained in the updated help. With no valid objections, the plan is to update the existing defconfigs to SLUB and remove SLOB in a few cycles. To make SLUB more suitable replacement for SLOB, a CONFIG_SLUB_TINY option was introduced to limit SLUB's memory overhead. There is a number of defconfigs specifying CONFIG_SLOB=y. As part of this patch, update them to select CONFIG_SLUB and CONFIG_SLUB_TINY. [1] https://lore.kernel.org/all/b35c3f82-f67b-2103-7d82-7a7ba7521439@suse.cz/ Cc: Russell King <linux@armlinux.org.uk> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Janusz Krzysztofik <jmkrzyszt@gmail.com> Cc: Tony Lindgren <tony@atomide.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Conor Dooley <conor@kernel.org> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi> # OMAP1 Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> # riscv k210 Acked-by: Arnd Bergmann <arnd@arndb.de> # arm Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com>
2022-11-30Merge branch 'mm-hotfixes-stable' into mm-stableAndrew Morton
2022-11-30mm: introduce arch_has_hw_nonleaf_pmd_young()Juergen Gross
When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in pmdp_test_and_clear_young(): BUG: unable to handle page fault for address: ffff8880083374d0 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1 RIP: e030:pmdp_test_and_clear_young+0x25/0x40 This happens because the Xen hypervisor can't emulate direct writes to page table entries other than PTEs. This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young() similar to arch_has_hw_pte_young() and test that instead of CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG. Link: https://lkml.kernel.org/r/20221123064510.16225-1-jgross@suse.com Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") Signed-off-by: Juergen Gross <jgross@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: David Hildenbrand <david@redhat.com> [core changes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: add dummy pmd_young() for architectures not having itJuergen Gross
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30KVM: Drop @gpa from exported gfn=>pfn cache check() and refresh() helpersSean Christopherson
Drop the @gpa param from the exported check()+refresh() helpers and limit changing the cache's GPA to the activate path. All external users just feed in gpc->gpa, i.e. this is a fancy nop. Allowing users to change the GPA at check()+refresh() is dangerous as those helpers explicitly allow concurrent calls, e.g. KVM could get into a livelock scenario. It's also unclear as to what the expected behavior should be if multiple tasks attempt to refresh with different GPAs. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2022-11-30KVM: Use gfn_to_pfn_cache's immutable "kvm" in kvm_gpc_refresh()Michal Luczaj
Make kvm_gpc_refresh() use kvm instance cached in gfn_to_pfn_cache. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> [sean: leave kvm_gpc_unmap() as-is] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2022-11-30KVM: Use gfn_to_pfn_cache's immutable "kvm" in kvm_gpc_check()Michal Luczaj
Make kvm_gpc_check() use kvm instance cached in gfn_to_pfn_cache. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2022-11-30KVM: Store immutable gfn_to_pfn_cache propertiesMichal Luczaj
Move the assignment of immutable properties @kvm, @vcpu, and @usage to the initializer. Make _activate() and _deactivate() use stored values. Note, @len is also effectively immutable for most cases, but not in the case of the Xen runstate cache, which may be split across two pages and the length of the first segment will depend on its address. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> [sean: handle @len in a separate patch] Signed-off-by: Sean Christopherson <seanjc@google.com> [dwmw2: acknowledge that @len can actually change for some use cases] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2022-11-30KVM: x86/xen: add support for 32-bit guests in SCHEDOP_pollMetin Kaya
This patch introduces compat version of struct sched_poll for SCHEDOP_poll sub-operation of sched_op hypercall, reads correct amount of data (16 bytes in 32-bit case, 24 bytes otherwise) by using new compat_sched_poll struct, copies it to sched_poll properly, and lets rest of the code run as is. Signed-off-by: Metin Kaya <metikaya@amazon.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org>
2022-11-30Merge tag 'at91-defconfig-6.2-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/defconfig AT91 defconfig for 6.2 #2 It contains: - updates for defconfigs to use the new CONFIG_VIDEO_MICROCHIP_ISC, CONFIG_VIDEO_MICROCHIP_XISC config flags that replaced the CONFIG_VIDEO_ATMEL_ISC, CONFIG_VIDEO_ATMEL_XISC. Drivers under CONFIG_VIDEO_ATMEL_* were moved to staging and considered deprecated. * tag 'at91-defconfig-6.2-2' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: configs: multi_v7: switch to new MICROCHIP_ISC driver ARM: configs: sama5/7: switch to new MICROCHIP_ISC driver Link: https://lore.kernel.org/r/20221125142736.385654-1-claudiu.beznea@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30KVM: x86: fix uninitialized variable use on KVM_REQ_TRIPLE_FAULTPaolo Bonzini
If a triple fault was fixed by kvm_x86_ops.nested_ops->triple_fault (by turning it into a vmexit), there is no need to leave vcpu_enter_guest(). Any vcpu->requests will be caught later before the actual vmentry, and in fact vcpu_enter_guest() was not initializing the "r" variable. Depending on the compiler's whims, this could cause the x86_64/triple_fault_event_test test to fail. Cc: Maxim Levitsky <mlevitsk@redhat.com> Fixes: 92e7d5c83aff ("KVM: x86: allow L1 to not intercept triple fault") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30arm64: defconfig: Enable Qualcomm SM6115 / SM4250 GCC and PinctrlBhupesh Sharma
Enable the Qualcomm SM6115 / SM4250 TLMM pinctrl and GCC clock drivers. They need to be builtin to ensure that the UART is allowed to probe before user space needs a console. Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Acked-by: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221128200834.1776868-1-bhupesh.sharma@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'arm-soc/for-6.2/devicetree-arm64' of ↵Arnd Bergmann
https://github.com/Broadcom/stblinux into soc/dt This pull request contains Broadcom ARM64-based SoCs Device Tree updates for 6.2, please pull the following: - Rafal describes the timer/watchdog block for the BCM4908 and BCM6858 SoCs - Krzysztof corrects invalid "reg" properties for the memory nodes that were off by one digit - Pierre updates a number of cache Device Tree node properties to be schema compliant * tag 'arm-soc/for-6.2/devicetree-arm64' of https://github.com/Broadcom/stblinux: arm64: dts: Update cache properties for broadcom arm64: dts: broadcom: trim addresses to 8 digits arm64: dts: broadcom: bcmbca: bcm6858: add TWD block arm64: dts: broadcom: bcmbca: bcm4908: add TWD block timer Link: https://lore.kernel.org/r/20221129191755.542584-2-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'arm-soc/for-6.2/devicetree' of ↵Arnd Bergmann
https://github.com/Broadcom/stblinux into soc/dt This pull request contains Broadcom ARM-based SoCs Device Tree changes for 6.2, please pull the following: - Linus adds support for the D-Link DWL-8610AP which is based upon the BCM53016 SoC and the D-Link DIR-890L routers - Maxime resolves a long standing issue affecting Raspberry Pi devices by switching entirely over to the VPU firmware clock provider rather than mixing the "bare metal" clock driver and VPU - Rafal corrects the description of the TP-Link router partitions to use the "safeloader" partition parser - Stefan fixes a number of invalid underscores in the bcm283x DTS files and also moves the ACT LED into a separate DTS include file for better re-use - Krzysztof aligns the LEDs DT nodes to the proper schema format - Pierre adds missing cache properties to various SoCs * tag 'arm-soc/for-6.2/devicetree' of https://github.com/Broadcom/stblinux: arm: dts: Update cache properties for broadcom ARM: dts: broadcom: align LED node names with dtschema ARM: dts: bcm283x: Move ACT LED into separate dtsi ARM: dts: bcm283x: Fix underscores in node names ARM: dts: BCM5301X: Correct description of TP-Link partitions ARM: dts: bcm47094: Add devicetree for D-Link DIR-890L dt-bindings: ARM: add bindings for the D-Link DIR-890L ARM: dts: bcm2835-rpi: Use firmware clocks for display ARM: dts: bcm283x: Remove bcm2835-rpi-common.dtsi from SoC DTSI ARM: dts: bcm53016: Add devicetree for D-Link DWL-8610AP dt-bindings: ARM: add bindings for the D-Link DWL-8610AP Link: https://lore.kernel.org/r/20221129191755.542584-1-f.fainelli@gmail.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'juno-updates-6.2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/dt Armv8 Juno/FVP updates for v6.2 Just few addtions including updates to cache information on various platforms to align well with the bindings, addition of cache information on FVP Rev C model, addition of SPE to Foundation model and updates to LED node names. * tag 'juno-updates-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: ARM: dts: vexpress: align LED node names with dtschema arm64: dts: fvp: Add information about L1 and L2 caches arm64: dts: fvp: Add SPE to Foundation FVP arm64: dts: Update cache properties for Arm Ltd platforms arm64: dts: juno: Add thermal critical trip points Link: https://lore.kernel.org/r/20221129115111.2464233-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Revert "ARM: dts: stm32: add fake interrupt propoerty for ASync notif - ↵Alexandre Torgue
TEMP/TO REMOVE" This reverts commit fb4ce97d9c5daafe100a83670c697b92c9d1bb45. Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Link: https://lore.kernel.org/r/20221128133339.25055-1-alexandre.torgue@foss.st.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30KVM: x86: fix uninitialized variable use on KVM_REQ_TRIPLE_FAULTPaolo Bonzini
If a triple fault was fixed by kvm_x86_ops.nested_ops->triple_fault (by turning it into a vmexit), there is no need to leave vcpu_enter_guest(). Any vcpu->requests will be caught later before the actual vmentry, and in fact vcpu_enter_guest() was not initializing the "r" variable. Depending on the compiler's whims, this could cause the x86_64/triple_fault_event_test test to fail. Cc: Maxim Levitsky <mlevitsk@redhat.com> Fixes: 92e7d5c83aff ("KVM: x86: allow L1 to not intercept triple fault") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: Shorten gfn_to_pfn_cache function namesMichal Luczaj
Formalize "gpc" as the acronym and use it in function names. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: x86/xen: Add runstate tests for 32-bit mode and crossing page boundaryDavid Woodhouse
Torture test the cases where the runstate crosses a page boundary, and and especially the case where it's configured in 32-bit mode and doesn't, but then switching to 64-bit mode makes it go onto the second page. To simplify this, make the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST ioctl also update the guest runstate area. It already did so if the actual runstate changed, as a side-effect of kvm_xen_update_runstate(). So doing it in the plain adjustment case is making it more consistent, as well as giving us a nice way to trigger the update without actually running the vCPU again and changing the values. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configuredDavid Woodhouse
Closer inspection of the Xen code shows that we aren't supposed to be using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall. If we randomly set the top bit of ->state_entry_time for a guest that hasn't asked for it and doesn't expect it, that could make the runtimes fail to add up and confuse the guest. Without the flag it's perfectly safe for a vCPU to read its own vcpu_runstate_info; just not for one vCPU to read *another's*. I briefly pondered adding a word for the whole set of VMASST_TYPE_* flags but the only one we care about for HVM guests is this, so it seemed a bit pointless. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20221127122210.248427-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: x86/xen: Compatibility fixes for shared runstate areaDavid Woodhouse
The guest runstate area can be arbitrarily byte-aligned. In fact, even when a sane 32-bit guest aligns the overall structure nicely, the 64-bit fields in the structure end up being unaligned due to the fact that the 32-bit ABI only aligns them to 32 bits. So setting the ->state_entry_time field to something|XEN_RUNSTATE_UPDATE is buggy, because if it's unaligned then we can't update the whole field atomically; the low bytes might be observable before the _UPDATE bit is. Xen actually updates the *byte* containing that top bit, on its own. KVM should do the same. In addition, we cannot assume that the runstate area fits within a single page. One option might be to make the gfn_to_pfn cache cope with regions that cross a page — but getting a contiguous virtual kernel mapping of a discontiguous set of IOMEM pages is a distinctly non-trivial exercise, and it seems this is the *only* current use case for the GPC which would benefit from it. An earlier version of the runstate code did use a gfn_to_hva cache for this purpose, but it still had the single-page restriction because it used the uhva directly — because it needs to be able to do so atomically when the vCPU is being scheduled out, so it used pagefault_disable() around the accesses and didn't just use kvm_write_guest_cached() which has a fallback path. So... use a pair of GPCs for the first and potential second page covering the runstate area. We can get away with locking both at once because nothing else takes more than one GPC lock at a time so we can invent a trivial ordering rule. The common case where it's all in the same page is kept as a fast path, but in both cases, the actual guest structure (compat or not) is built up from the fields in @vx, following preset pointers to the state and times fields. The only difference is whether those pointers point to the kernel stack (in the split case) or to guest memory directly via the GPC. The fast path is also fixed to use a byte access for the XEN_RUNSTATE_UPDATE bit, then the only real difference is the dual memcpy. Finally, Xen also does write the runstate area immediately when it's configured. Flip the kvm_xen_update_runstate() and …_guest() functions and call the latter directly when the runstate area is set. This means that other ioctls which modify the runstate also write it immediately to the guest when they do so, which is also intended. Update the xen_shinfo_test to exercise the pathological case where the XEN_RUNSTATE_UPDATE flag in the top byte of the state_entry_time is actually in a different page to the rest of the 64-bit word. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30Merge tag 'mvebu-dt-6.2-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt for 6.2 (part 1) Fix assigned-addresses for every PCIe Root Port Align LED node names with dtschema Add a new kirkwood based board: Zyxel NSA310S Fix compatible string for gpios for Armada 38x and 39x Add interrupts for watchdog on Armada XP Turris Omnia (Armada 385 based): - Add switch port 6 node - Add ethernet aliases Switch to using gpiod API in pm-board code * tag 'mvebu-dt-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: ARM: dts: armada-xp: add interrupts for watchdog ARM: dts: armada: align LED node names with dtschema ARM: mvebu: switch to using gpiod API in pm-board code ARM: dts: armada-39x: Fix compatible string for gpios ARM: dts: armada-38x: Fix compatible string for gpios ARM: dts: turris-omnia: Add switch port 6 node ARM: dts: turris-omnia: Add ethernet aliases ARM: dts: armada-39x: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-38x: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-375: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-xp: Fix assigned-addresses for every PCIe Root Port ARM: dts: armada-370: Fix assigned-addresses for every PCIe Root Port ARM: dts: dove: Fix assigned-addresses for every PCIe Root Port ARM: dts: kirkwood: Add Zyxel NSA310S board Link: https://lore.kernel.org/r/87cz979adf.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'mvebu-dt64-6.2-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/dt mvebu dt64 for 6.2 (part 1) Update cache properties for various Marvell SoCs Reserved memory for optee firmware Turris Mox (Armada 3720 based Socs) - Define slot-power-limit-milliwatt for PCIe - Add missing interrupt for RTC * tag 'mvebu-dt64-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: arm64: dts: marvell: add optee FW definitions arm64: dts: Update cache properties for marvell arm64: dts: armada-3720-turris-mox: Add missing interrupt for RTC arm64: dts: armada-3720-turris-mox: Define slot-power-limit-milliwatt for PCIe Link: https://lore.kernel.org/r/87fse39aer.fsf@BL-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'at91-dt-6.2-3' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into soc/dt AT91 DT for 6.2 #3 It contains: - proper power rail description for SDMMC devices available on SAMA7G5-EK - OTP controller has been added for LAN966X devices * tag 'at91-dt-6.2-3' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: lan966x: Add otp support ARM: dts: at91: sama7g5ek: align power rails for sdmmc0/1 Link: https://lore.kernel.org/r/20221125140525.384928-1-claudiu.beznea@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'musb-for-v6.2-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/dt Devicetree related musb changes for omap3 for v6.2 Recent musb driver regressions eposed two issues for musb legacy probing. The changes to use device_set_of_node_from_dev() confuse the legacy interconnect code. And we now have to manually populate the musb core irq resources. The musb driver has a fix for these, but it's not a good long term solution. To fix the issue properly, let's just update musb to probe with ti-sysc interconnect driver with proper devicetree data. This allows dropping most of the musb driver workaround later on. And with these changes we have the omap2430 musb glue layer behaving the same way for all the SoCs using it. We need to patch the ti-sysc driver quirks, and add devicetree data to make things work. And we want to drop the legacy data too to avoid pointless warnings. As we have a musb driver workaround, these changes are not needed as fixes and can wait for the merge window. * tag 'musb-for-v6.2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Drop legacy hwmod data for omap3 otg ARM: dts: Update omap3 musb to probe with ti-sysc bus: ti-sysc: Add otg quirk flags for omap3 musb Link: https://lore.kernel.org/r/pull-1669364566-84575@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'omap-for-v6.2/dt-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into soc/dt Devicetree fixes for omaps for v6.2 Two devicetree fixes for omaps. These fixes are not urgent and can wait for the merge window: - Fix up the node names and missing #pwm-cells property for ti,omap-dmtimer-pwm to avoid warnings when the the related yaml binding gets merged - Fix TDA998x port addressing * tag 'omap-for-v6.2/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Unify pwm-omap-dmtimer node names ARM: dts: am335x: Fix TDA998x ports addressing ARM: dts: am335x-pcm-953: Define fixed regulators in root node Link: https://lore.kernel.org/r/pull-1669363695-856423@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-30Merge tag 'qcom-arm64-for-6.2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt Qualcomm ARM64 DTS updates for 6.2 This introduces support for SM4250, SM6115, SM6375 and SDM670 platforms and Sony Xperia 10 IV, Google Pixel 3a, OnePlus 3, OnePlus 3T, Google Pazquel and OnePlus Nord N100. A wide variety of updates to align with DeviceTree bindings across many/most platforms is introduced, and incorrectly styled comments are adjusted across the tree. Apps RSC is added to the cluster-idle power-domain across SM8150, SM8250, SM8350 and SM8450, to ensure sleep and wake votes are flushed as the last core is being powered down. Remoteproc firmware patches are aligned with agreed upon structure used in linux-firmware across Inforce 6560, Lenovo Miix 630, various Sony Xperia devices and Samsung Galaxy Book2 (although these are not available in linux-firmware today). On IPQ8074 CPU clocks are added, thermal zones are introduced and vqmmc supply is specified for the HK01 board. Alcatel OneTouch Idol 3 gains LED nodes and Samsung Galaxy A3U gained vibrator support. The application subsystem's IOMMU and the display subsystem is enabled for MSM8953. A new CPU frequency table is introduced for MSM8996Pro, to properly describe it separate of MSM8996. The GPU opp-table is extended as well. On SC7180 USB is marked as a wakeup source, USB gains required-opps to ensure that the core voltage rail is voted for as needed. The description of the fingerprint sensor in Trogdor is corrected. On SC7280 Wake-on-WLAN is introduced, and PHY parameters for the SNPS USB PHY is defined across SC7280. The memory map across Google Herobrine is adjusted, to regain unused memory on the WiFi SKUs. A LTE SKU of the Evoker board is introduced and the bard gains touchscreen. NVME support is disabled on Villager boards, as it's not used. PCIe support is introduced on SC8280XP, with NVMe, SDX55 (5G) and WiFi enabled on the Lenovo Thinkpad X13s and Compute Reference Device. ADCs and thermal zones are intrduced for the same. Lenovo Thinkpad X13s gains LID switch support. Fairphone FP3 gains touchscreen support. Support for Xiaomi Poco F1 variant with EBBG panel. The round-robin ADC is enabled across DB845c, OnePlus devices and Pocophone F1 devices. The displayport controller on SDM845 is introduced. SM6350 gains SDHCI support and on Sony Xperia 10 III sd-card, touchscreen and GPI DMA is enabled. Fairphone FP4 got SD-card support. UFS PHY register ranges are corrected across SM8150, SM8250, SM8350 and SM8450. Sony Xperia 1 II got NFC support and Sony Xperia 5 III got PMIC regulators defined and USB definition corrected, to enable USB3. The SDHCI controller is described for SM8450 and microSD support is enabled for the HDK and QRD devices. SM8450 also gains camera CCI interface and display clock controller. * tag 'qcom-arm64-for-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (261 commits) arm64: dts: qcom: sdm845-polaris: Don't duplicate DMA assignment arm64: dts: qcom: sm8350-sagami: Wire up USB regulators and fix USB3 arm64: dts: qcom: sm8350-sagami: Add most RPMh regulators arm64: dts: qcom: sc7280: Make herobrine-audio-rt5682 mic dtsi's match more arm64: dts: qcom: trim addresses to 8 digits arm64: dts: msm8998: unify PCIe clock order withMSM8996 arm64: dts: msm8998: add MSM8998 specific compatible arm64: dts: qcom: sc8280xp-x13s: enable WiFi controller arm64: dts: qcom: sc8280xp-x13s: enable modem arm64: dts: qcom: sc8280xp-x13s: enable NVMe SSD arm64: dts: qcom: sc8280xp-crd: enable WiFi controller arm64: dts: qcom: sc8280xp-crd: enable SDX55 modem arm64: dts: qcom: sc8280xp-crd: enable NVMe SSD arm64: dts: qcom: sc8280xp-crd: rename backlight and misc regulators arm64: dts: qcom: sa8295p-adp: enable PCIe arm64: dts: qcom: sc8280xp/sa8540p: add PCIe2-4 nodes arm64: dts: qcom: add sdm670 and pixel 3a device trees arm64: dts: qcom: sc7280: Add Google Herobrine WIFI SKU dts fragment arm64: dts: qcom: sc7280: Mark all Qualcomm reference boards as LTE arm64: dts: qcom: sm7225-fairphone-fp4: Enable SD card ... Link: https://lore.kernel.org/r/20221124100650.1982448-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-29Merge patch series "riscv: kexec: Fxiup crash_save percpu and ↵Palmer Dabbelt
machine_kexec_mask_interrupts" guoren@kernel.org <guoren@kernel.org> says: From: Guo Ren <guoren@linux.alibaba.com> Current riscv kexec can't crash_save percpu states and disable interrupts properly. The patch series fix them, make kexec work correct. * b4-shazam-merge: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path Link: https://lore.kernel.org/r/20221020141603.2856206-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: kexec: Fixup crash_smp_send_stop without multi coresGuo Ren
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv. Before this patch, test result: crash> help -r CPU 0: [OFFLINE] CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8 CPU 2: [OFFLINE] CPU 3: [OFFLINE] After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8 Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Nick Kossifidis <mick@ics.forth.gr> Link: https://lore.kernel.org/r/20221020141603.2856206-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: kexec: Fixup irq controller broken in kexec crash pathGuo Ren
If a crash happens on cpu3 and all interrupts are binding on cpu0, the bad irq routing will cause a crash kernel which can't receive any irq. Because crash kernel won't clean up all harts' PLIC enable bits in enable registers. This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()"), and PowerPC also has the same mechanism. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Cc: Nick Kossifidis <mick@ics.forth.gr> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221020141603.2856206-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: mm: Proper page permissions after initmem freeBjörn Töpel
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1]. At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC. When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission. This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map. In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception. Restore the freed initmem properly, by setting both the kernel image map to the correct permissions. [1] Documentation/riscv/vm-layout.rst Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20221115090641.258476-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: vdso: fix section overlapping under some conditionsJisheng Zhang
lkp reported a build error, I tried the config and can reproduce build error as below: VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg ld.lld: error: section .note file range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] ld.lld: error: section .text file range overlaps with .dynamic >>> .text range is [0x800, 0x1993] >>> .dynamic range is [0x808, 0x937] ld.lld: error: section .note virtual address range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] Fix it by setting DISABLE_BRANCH_PROFILING which will disable branch tracing for vdso, thus avoid useless _ftrace_annotated_branch section and _ftrace_branch section. Although we can also fix it by removing the hardcoded .text begin address, but I think that's another story and should be put into another patch. Link: https://lore.kernel.org/lkml/202210122123.Cc4FPShJ-lkp@intel.com/#r Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221102170254.1925-1-jszhang@kernel.org Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: fix race when vmap stack overflowJisheng Zhang
Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time. To solve this race, we introduce spin_shadow_stack atomic var, which will be swap between its own address and 0 in atomic way, when the var is set, it means the shadow_stack is being used; when the var is cleared, it means the shadow_stack isn't being used. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Suggested-by: Guo Ren <guoren@kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org [Palmer: Add AQ to the swap, and also some comments.] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge tag 'v6.2-rockchip-dts64-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into asahi-wip New boards: - Model A and blade baseboards for the SOQuartz (rk3568) SoM, - Anberic RG351M, RG353V, RG353VS; Odroid Go Super, Advance gaming devices - Odroid M1 - Theobroma px30 SoM with baseboard - Rockchip's own rk3566 demo board Some core support for per SoC specifics: - crypto support for rk3399 and rk3328 - second I2S controller for rk3568 - Cache properties for follow the binding for rk3308 and rk3328 Bigger device support updates for: - SOQuartz: PCIe2, video output, gpu, HDMI sound - Rock 3A: eth regulator, eth clock input, Wifi+Bt, I2S, PCIe3 As well as some minor extensions for Rock960 (hdmi supplies), rk3566-roc-pc (PCIe2), Rock 4C+ (thermal support), Pinephone Pro (Wifi+Bt) * tag 'v6.2-rockchip-dts64-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: (51 commits) arm64: dts: rockchip: update cache properties for rk3308 and rk3328 arm64: dts: rockchip: Add SOQuartz Model A baseboard dt-bindings: arm: rockchip: Add SOQuartz Model A arm64: dts: rockchip: Add SOQuartz blade board dt-bindings: arm: rockchip: Add SOQuartz Blade arm64: dts: rockchip: Add Anbernic RG351M arm64: dts: rockchip: Add Odroid Go Super arm64: dts: rockchip: Add Odroid Go Advance Black Edition dt-bindings: arm: rockchip: Add more RK3326 devices arm64: dts: rockchip: Move most of Odroid Go Advance DTS into a DTSI arm64: dts: rockchip: Add support of regulator for ethernet node on Rock 3A SBC arm64: dts: rockchip: Add support of external clock to ethernet node on Rock 3A SBC arm64: dts: rockchip: Add HDMI supplies on Rock960 arm64: dts: rockchip: Add dts for rockchip rk3566 box demo board dt-bindings: rockchip: Add Rockchip rk3566 box demo board arm64: dts: rockchip: Enable PCIe 2 on SOQuartz CM4IO arm64: dts: rockchip: Enable HDMI sound on SOQuartz arm64: dts: rockchip: Enable video output and HDMI on SOQuartz arm64: dts: rockchip: Enable GPU on SOQuartz CM4 arm64: dts: rockchip: enable pcie2 on rk3566-roc-pc ... Link: https://lore.kernel.org/r/4716610.aeNJFYEL58@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-29Merge tag 'renesas-arm-dt-for-v6.2-tag3' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt Renesas ARM DT updates for v6.2 (take three) - Rename Renesas DTB overlay source files from .dts to .dtso. * tag 'renesas-arm-dt-for-v6.2-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: arm64: dts: renesas: Rename DTB overlay source files from .dts to .dtso Link: https://lore.kernel.org/r/cover.1669283381.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-29RISC-V: enable sparsemem by default for defconfigConor Dooley
on an arch level, RISC-V defaults to FLATMEM. On PolarFire SoC, the memory layout is almost always sparse, with a maximum of 1 GiB at 0x8000_0000 & a possible 16 GiB range at 0x10_0000_0000. The Icicle kit, for example, has 2 GiB of DDR - so there's a big hole in the memory map between the two gigs. Prior to v6.1-rc1, boot times from defconfig builds were pretty bad on Icicle but enabling sparsemem would fix those issues. As of v6.1-rc1, the Icicle kit no longer boots from defconfig builds with the in-kernel devicetree. A change to the memory map resulted in a futher "sparse-ification", producing a splat on boot: OF: fdt: Ignoring memory range 0x80000000 - 0x80200000 Machine model: Microchip PolarFire-SoC Icicle Kit earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8') printk: bootconsole [ns16550a0] enabled printk: debug: skip boot console de-registration. efi: UEFI not found. Zone ranges: DMA32 [mem 0x0000000080200000-0x00000000ffffffff] Normal [mem 0x0000000100000000-0x000000107fffffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x0000000080200000-0x00000000bfbfffff] node 0: [mem 0x00000000bfc00000-0x00000000bfffffff] node 0: [mem 0x0000001040000000-0x000000107fffffff] Initmem setup node 0 [mem 0x0000000080200000-0x000000107fffffff] Kernel panic - not syncing: Failed to allocate 1073741824 bytes for node 0 memory map CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-dirty #1 Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) Call Trace: [<ffffffff800057f0>] show_stack+0x30/0x3c [<ffffffff807d5802>] dump_stack_lvl+0x4a/0x66 [<ffffffff807d5836>] dump_stack+0x18/0x20 [<ffffffff807d1ae8>] panic+0x124/0x2c6 [<ffffffff80814064>] free_area_init_core+0x0/0x11e [<ffffffff80813720>] free_area_init_node+0xc2/0xf6 [<ffffffff8081331e>] free_area_init+0x222/0x260 [<ffffffff808064d6>] misc_mem_init+0x62/0x9a [<ffffffff80803cb2>] setup_arch+0xb0/0xea [<ffffffff8080039a>] start_kernel+0x88/0x4ee ---[ end Kernel panic - not syncing: Failed to allocate 1073741824 bytes for node 0 memory map ]--- With the aim of keeping defconfig builds booting on icicle, enable SPARSEMEM_MANUAL. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221021160028.4042304-1-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29x86/cpuid: Carve out all CPUID functionalityBorislav Petkov
Carve it out into a special header, where it belongs. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221124164150.3040-1-bp@alien8.de
2022-11-29x86/hyperv: Remove unregister syscore call from Hyper-V cleanupGaurav Kohli
Hyper-V cleanup code comes under panic path where preemption and irq is already disabled. So calling of unregister_syscore_ops might schedule out the thread even for the case where mutex lock is free. hyperv_cleanup unregister_syscore_ops mutex_lock(&syscore_ops_lock) might_sleep Here might_sleep might schedule out this thread, where voluntary preemption config is on and this thread will never comes back. And also this was added earlier to maintain the symmetry which is not required as this can comes during crash shutdown path only. To prevent the same, removing unregister_syscore_ops function call. Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1669443291-2575-1-git-send-email-gauravkohli@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-29ARM: dts: socfpga: align LED node names with dtschemaKrzysztof Kozlowski
The node names should be generic and DT schema expects certain pattern: socfpga_arria5_socdk.dtb: leds: 'hps0', 'hps1', 'hps2', 'hps3' do not match any of the regexes: '(^led-[0-9a-f]$|led)', 'pinctrl-[0-9]+' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>