summaryrefslogtreecommitdiff
path: root/arch/um/kernel/um_arch.c
AgeCommit message (Collapse)Author
2025-04-14x86/alternatives, um: Rename UML's text_poke_sync() wrapper to ↵Ingo Molnar
smp_text_poke_sync_each_cpu() Missed this UML wrapper in the rename. Fixes: 6e4955a9d73e ("x86/alternatives: Rename 'text_poke_sync()' to 'smp_text_poke_sync_each_cpu()'") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/202504141003.kc69fVoj-lkp@intel.com
2025-04-02Merge tag 'uml-for-linux-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: - proper nofault accesses and read-only rodata - hostfs fix for host inode number reuse - fixes for host errno handling - various cleanups/small fixes * tag 'uml-for-linux-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: Rewrite the sigio workaround based on epoll and tgkill um: Prohibit the VM_CLONE flag in run_helper_thread() um: Switch to the pthread-based helper in sigio workaround um: ubd: Switch to the pthread-based helper um: Add pthread-based helper support um: x86: clean up elf specific definitions um: Store full CSGSFS and SS register from mcontext um: virt-pci: Refactor virtio_pcidev into its own module um: work around sched_yield not yielding in time-travel mode um/locking: Remove semicolon from "lock" prefix um: Update min_low_pfn to match changes in uml_reserved um: use str_yes_no() to remove hardcoded "yes" and "no" um: hostfs: avoid issues on inode number reuse by host um: Allocate vdso page pointer statically um: remove copy_from_kernel_nofault_allowed um: mark rodata read-only and implement _nofault accesses um: Pass the correct Rust target and options with gcc
2025-04-01Merge tag 'mm-stable-2025-03-30-16-52' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - The series "Enable strict percpu address space checks" from Uros Bizjak uses x86 named address space qualifiers to provide compile-time checking of percpu area accesses. This has caused a small amount of fallout - two or three issues were reported. In all cases the calling code was found to be incorrect. - The series "Some cleanup for memcg" from Chen Ridong implements some relatively monir cleanups for the memcontrol code. - The series "mm: fixes for device-exclusive entries (hmm)" from David Hildenbrand fixes a boatload of issues which David found then using device-exclusive PTE entries when THP is enabled. More work is needed, but this makes thins better - our own HMM selftests now succeed. - The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed remove the z3fold and zbud implementations. They have been deprecated for half a year and nobody has complained. - The series "mm: further simplify VMA merge operation" from Lorenzo Stoakes implements numerous simplifications in this area. No runtime effects are anticipated. - The series "mm/madvise: remove redundant mmap_lock operations from process_madvise()" from SeongJae Park rationalizes the locking in the madvise() implementation. Performance gains of 20-25% were observed in one MADV_DONTNEED microbenchmark. - The series "Tiny cleanup and improvements about SWAP code" from Baoquan He contains a number of touchups to issues which Baoquan noticed when working on the swap code. - The series "mm: kmemleak: Usability improvements" from Catalin Marinas implements a couple of improvements to the kmemleak user-visible output. - The series "mm/damon/paddr: fix large folios access and schemes handling" from Usama Arif provides a couple of fixes for DAMON's handling of large folios. - The series "mm/damon/core: fix wrong and/or useless damos_walk() behaviors" from SeongJae Park fixes a few issues with the accuracy of kdamond's walking of DAMON regions. - The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo Stoakes changes the interaction between framebuffer deferred-io and core MM. No functional changes are anticipated - this is preparatory work for the future removal of page structure fields. - The series "mm/damon: add support for hugepage_size DAMOS filter" from Usama Arif adds a DAMOS filter which permits the filtering by huge page sizes. - The series "mm: permit guard regions for file-backed/shmem mappings" from Lorenzo Stoakes extends the guard region feature from its present "anon mappings only" state. The feature now covers shmem and file-backed mappings. - The series "mm: batched unmap lazyfree large folios during reclamation" from Barry Song cleans up and speeds up the unmapping for pte-mapped large folios. - The series "reimplement per-vma lock as a refcount" from Suren Baghdasaryan puts the vm_lock back into the vma. Our reasons for pulling it out were largely bogus and that change made the code more messy. This patchset provides small (0-10%) improvements on one microbenchmark. - The series "Docs/mm/damon: misc DAMOS filters documentation fixes and improves" from SeongJae Park does some maintenance work on the DAMON docs. - The series "hugetlb/CMA improvements for large systems" from Frank van der Linden addresses a pile of issues which have been observed when using CMA on large machines. - The series "mm/damon: introduce DAMOS filter type for unmapped pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the page's mapped/unmapped status. - The series "zsmalloc/zram: there be preemption" from Sergey Senozhatsky teaches zram to run its compression and decompression operations preemptibly. - The series "selftests/mm: Some cleanups from trying to run them" from Brendan Jackman fixes a pile of unrelated issues which Brendan encountered while runnimg our selftests. - The series "fs/proc/task_mmu: add guard region bit to pagemap" from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to determine whether a particular page is a guard page. - The series "mm, swap: remove swap slot cache" from Kairui Song removes the swap slot cache from the allocation path - it simply wasn't being effective. - The series "mm: cleanups for device-exclusive entries (hmm)" from David Hildenbrand implements a number of unrelated cleanups in this code. - The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual implements a number of preparatoty cleanups to the GENERIC_PTDUMP Kconfig logic. - The series "mm/damon: auto-tune aggregation interval" from SeongJae Park implements a feedback-driven automatic tuning feature for DAMON's aggregation interval tuning. - The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in powerpc, sparc and x86 lazy MMU implementations. Ryan did this in preparation for implementing lazy mmu mode for arm64 to optimize vmalloc. - The series "mm/page_alloc: Some clarifications for migratetype fallback" from Brendan Jackman reworks some commentary to make the code easier to follow. - The series "page_counter cleanup and size reduction" from Shakeel Butt cleans up the page_counter code and fixes a size increase which we accidentally added late last year. - The series "Add a command line option that enables control of how many threads should be used to allocate huge pages" from Thomas Prescher does that. It allows the careful operator to significantly reduce boot time by tuning the parallalization of huge page initialization. - The series "Fix calculations in trace_balance_dirty_pages() for cgwb" from Tang Yizhou fixes the tracing output from the dirty page balancing code. - The series "mm/damon: make allow filters after reject filters useful and intuitive" from SeongJae Park improves the handling of allow and reject filters. Behaviour is made more consistent and the documention is updated accordingly. - The series "Switch zswap to object read/write APIs" from Yosry Ahmed updates zswap to the new object read/write APIs and thus permits the removal of some legacy code from zpool and zsmalloc. - The series "Some trivial cleanups for shmem" from Baolin Wang does as it claims. - The series "fs/dax: Fix ZONE_DEVICE page reference counts" from Alistair Popple regularizes the weird ZONE_DEVICE page refcount handling in DAX, permittig the removal of a number of special-case checks. - The series "refactor mremap and fix bug" from Lorenzo Stoakes is a preparatoty refactoring and cleanup of the mremap() code. - The series "mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in which we determine whether a large folio is known to be mapped exclusively into a single MM. - The series "mm/damon: add sysfs dirs for managing DAMOS filters based on handling layers" from SeongJae Park adds a couple of new sysfs directories to ease the management of DAMON/DAMOS filters. - The series "arch, mm: reduce code duplication in mem_init()" from Mike Rapoport consolidates many per-arch implementations of mem_init() into code generic code, where that is practical. - The series "mm/damon/sysfs: commit parameters online via damon_call()" from SeongJae Park continues the cleaning up of sysfs access to DAMON internal data. - The series "mm: page_ext: Introduce new iteration API" from Luiz Capitulino reworks the page_ext initialization to fix a boot-time crash which was observed with an unusual combination of compile and cmdline options. - The series "Buddy allocator like (or non-uniform) folio split" from Zi Yan reworks the code to split a folio into smaller folios. The main benefit is lessened memory consumption: fewer post-split folios are generated. - The series "Minimize xa_node allocation during xarry split" from Zi Yan reduces the number of xarray xa_nodes which are generated during an xarray split. - The series "drivers/base/memory: Two cleanups" from Gavin Shan performs some maintenance work on the drivers/base/memory code. - The series "Add tracepoints for lowmem reserves, watermarks and totalreserve_pages" from Martin Liu adds some more tracepoints to the page allocator code. - The series "mm/madvise: cleanup requests validations and classifications" from SeongJae Park cleans up some warts which SeongJae observed during his earlier madvise work. - The series "mm/hwpoison: Fix regressions in memory failure handling" from Shuai Xue addresses two quite serious regressions which Shuai has observed in the memory-failure implementation. - The series "mm: reliable huge page allocator" from Johannes Weiner makes huge page allocations cheaper and more reliable by reducing fragmentation. - The series "Minor memcg cleanups & prep for memdescs" from Matthew Wilcox is preparatory work for the future implementation of memdescs. - The series "track memory used by balloon drivers" from Nico Pache introduces a way to track memory used by our various balloon drivers. - The series "mm/damon: introduce DAMOS filter type for active pages" from Nhat Pham permits users to filter for active/inactive pages, separately for file and anon pages. - The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia separates the proactive reclaim statistics from the direct reclaim statistics. - The series "mm/vmscan: don't try to reclaim hwpoison folio" from Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim code. * tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits) mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex() x86/mm: restore early initialization of high_memory for 32-bits mm/vmscan: don't try to reclaim hwpoison folio mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper cgroup: docs: add pswpin and pswpout items in cgroup v2 doc mm: vmscan: split proactive reclaim statistics from direct reclaim statistics selftests/mm: speed up split_huge_page_test selftests/mm: uffd-unit-tests support for hugepages > 2M docs/mm/damon/design: document active DAMOS filter type mm/damon: implement a new DAMOS filter type for active pages fs/dax: don't disassociate zero page entries MM documentation: add "Unaccepted" meminfo entry selftests/mm: add commentary about 9pfs bugs fork: use __vmalloc_node() for stack allocation docs/mm: Physical Memory: Populate the "Zones" section xen: balloon: update the NR_BALLOON_PAGES state hv_balloon: update the NR_BALLOON_PAGES state balloon_compaction: update the NR_BALLOON_PAGES state meminfo: add a per node counter for balloon drivers mm: remove references to folio in __memcg_kmem_uncharge_page() ...
2025-03-18um: use str_yes_no() to remove hardcoded "yes" and "no"Ethan Carter Edwards
Remove hard-coded strings by using the str_yes_no() helper function provided by <linux/string_choices.h>. Signed-off-by: Ethan Carter Edwards <ethan@ethancedwards.com> Link: https://patch.msgid.link/20250220-um_yes_no-v1-1-2a355ed2d225@ethancedwards.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-17arch, mm: set high_memory in free_area_init()Mike Rapoport (Microsoft)
high_memory defines upper bound on the directly mapped memory. This bound is defined by the beginning of ZONE_HIGHMEM when a system has high memory and by the end of memory otherwise. All this is known to generic memory management initialization code that can set high_memory while initializing core mm structures. Add a generic calculation of high_memory to free_area_init() and remove per-architecture calculation except for the architectures that set and use high_memory earlier than that. Link: https://lkml.kernel.org/r/20250313135003.836600-11-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86] Tested-by: Mark Brown <broonie@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Guo Ren (csky) <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russel King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17arch, mm: set max_mapnr when allocating memory map for FLATMEMMike Rapoport (Microsoft)
max_mapnr is essentially the size of the memory map for systems that use FLATMEM. There is no reason to calculate it in each and every architecture when it's anyway calculated in alloc_node_mem_map(). Drop setting of max_mapnr from architecture code and set it once in alloc_node_mem_map(). While on it, move definition of mem_map and max_mapnr to mm/mm_init.c so there won't be two copies for MMU and !MMU variants. Link: https://lkml.kernel.org/r/20250313135003.836600-10-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86] Tested-by: Mark Brown <broonie@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Guo Ren (csky) <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russel King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-02-03Revert "x86/module: prepare module loading for ROX allocations of text"Mike Rapoport (Microsoft)
The module code does not create a writable copy of the executable memory anymore so there is no need to handle it in module relocation and alternatives patching. This reverts commit 9bfc4824fd4836c16bb44f922bfaffba5da3e4f3. Signed-off-by: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250126074733.1384926-8-rppt@kernel.org
2025-01-10um: Mark get_top_address as __initTiwei Bie
It's only invoked during boot from linux_main(). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20241128083137.2219830-4-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-10um: Mark parse_cache_line as __initTiwei Bie
It's only invoked during boot from get_host_cpu_features(). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20241128083137.2219830-3-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-10um: Mark parse_host_cpu_flags as __initTiwei Bie
It's only invoked during boot from get_host_cpu_features(). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20241128083137.2219830-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-01-10um: Count iomem_size only once in physmem calculationTiwei Bie
When calculating max_physmem, we've already factored in the space used by iomem. We don't need to subtract it again. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20241128081939.2216246-4-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-30Merge tag 'uml-for-linus-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Lots of cleanups, mostly from Benjamin Berg and Tiwei Bie - Removal of unused code - Fix for sparse warnings - Cleanup around stub_exe() * tag 'uml-for-linus-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (68 commits) hostfs: Fix the NULL vs IS_ERR() bug for __filemap_get_folio() um: move thread info into task um: Always dump trace for specified task in show_stack um: vector: Do not use drvdata in release um: net: Do not use drvdata in release um: ubd: Do not use drvdata in release um: ubd: Initialize ubd's disk pointer in ubd_add um: virtio_uml: query the number of vqs if supported um: virtio_uml: fix call_fd IRQ allocation um: virtio_uml: send SET_MEM_TABLE message with the exact size um: remove broken double fault detection um: remove duplicate UM_NSEC_PER_SEC definition um: remove file sync for stub data um: always include kconfig.h and compiler-version.h um: set DONTDUMP and DONTFORK flags on KASAN shadow memory um: fix sparse warnings in signal code um: fix sparse warnings from regset refactor um: Remove double zero check um: fix stub exe build with CONFIG_GCOV um: Use os_set_pdeathsig helper in winch thread/process ...
2024-11-12um: move thread info into taskBenjamin Berg
This selects the THREAD_INFO_IN_TASK option for UM and changes the way that the current task is discovered. This is trivial though, as UML already tracks the current task in cpu_tasks[] and this can be used to retrieve it. Also remove the signal handler code that copies the thread information into the IRQ stack. It is obsolete now, which also means that the mentioned race condition cannot happen anymore. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Reviewed-by: Hajime Tazaki <thehajime@gmail.com> Link: https://patch.msgid.link/20241111102910.46512-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-07x86/module: prepare module loading for ROX allocations of textMike Rapoport (Microsoft)
When module text memory will be allocated with ROX permissions, the memory at the actual address where the module will live will contain invalid instructions and there will be a writable copy that contains the actual module code. Update relocations and alternatives patching to deal with it. [rppt@kernel.org: fix writable address in cfi_rewrite_endbr()] Link: https://lkml.kernel.org/r/ZysRwR29Ji8CcbXc@kernel.org Link: https://lkml.kernel.org/r/20241023162711.2579610-7-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: kdevops <kdevops@lists.linux.dev> Tested-by: Nathan Chancellor <nathan@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07asm-generic: introduce text-patching.hMike Rapoport (Microsoft)
Several architectures support text patching, but they name the header files that declare patching functions differently. Make all such headers consistently named text-patching.h and add an empty header in asm-generic for architectures that do not support text patching. Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: kdevops <kdevops@lists.linux.dev> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-23um: switch to regset API and depend on XSTATEBenjamin Berg
The PTRACE_GETREGSET API has now existed since Linux 2.6.33. The XSAVE CPU feature should also be sufficiently common to be able to rely on it. With this, define our internal FP state to be the hosts XSAVE data. Add discovery for the hosts XSAVE size and place the FP registers at the end of task_struct so that we can adjust the size at runtime. Next we can implement the regset API on top and update the signal handling as well as ptrace APIs to use them. Also switch coredump creation to use the regset API and finally set HAVE_ARCH_TRACEHOOK. This considerably improves the signal frames. Previously they might not have contained all the registers (i386) and also did not have the sizes and magic values set to the correct values to permit userspace to decode the frame. As a side effect, this will permit UML to run on hosts with newer CPU extensions (such as AMX) that need even more register state. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20241023094120.4083426-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-23um: Remove UML specific debug parameterTiwei Bie
It does nothing but emit a warning when 'debug' is provided in the kernel command line. It can be a bit annoying, as 'debug' is also a valid kernel parameter to enable kernel debugging. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20241011040441.1586345-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-10um: Discover host_task_size from envpBenjamin Berg
When loading the UML binary, the host kernel will place the stack at the highest possible address. It will then map the program name and environment variables onto the start of the stack. As such, an easy way to figure out the host_task_size is to use the highest pointer to an environment variable as a reference. Ensure that this works by disabling address layout randomization and re-executing UML in case it was enabled. This increases the available TASK_SIZE for 64 bit UML considerably. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240919124511.282088-9-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-10um: Limit TASK_SIZE to the addressable rangeBenjamin Berg
We may have a TASK_SIZE from the host that is bigger than UML is able to address with a three-level pagetable on 64-bit. Guard against that by clipping the maximum TASK_SIZE to the maximum addressable area. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240919124511.282088-8-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-10um: Calculate stub data address relative to stub codeBenjamin Berg
Instead of using the current stack pointer, we can also use the current instruction to calculate where the stub data is. With this the stub data only needs to be aligned to a full page boundary. Changing this has the advantage that we do not have a hole in the memory space above the stub data (which would need to be explicitly cleared). Another motivation to do this is that with the planned addition of a SECCOMP based userspace the stack pointer may not be fully trustworthy. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240919124511.282088-7-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-10um: Fix the definition for physmem_sizeTiwei Bie
Currently physmem_size is defined as long long but declared locally as unsigned long long before using it in separate .c files. Make them match by defining physmem_size as unsigned long long and also move the declaration to a common header to allow the compiler to check it. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20240916045950.508910-5-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-10um: Remove highmem leftoversTiwei Bie
Highmem was only supported on UML/i386. And the support has been removed by commit a98a6d864d3b ("um: Remove broken highmem support"). Remove the leftovers and stop UML from trying to setup highmem when the sum of physmem_size and iomem_size exceeds max_physmem. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20240916045950.508910-4-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-07-03um: Remove unused ncpus variableTiwei Bie
It's no longer used. And uml_ncpus_setup doesn't exist anymore. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20240527134024.1539848-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-22um: Fix -Wmissing-prototypes warnings for text_poke*Tiwei Bie
The prototypes for text_poke* are declared in asm/text-patching.h under arch/x86/include/. It's safe to include this header, as it's UML-aware (by checking CONFIG_UML_X86). This will address below -Wmissing-prototypes warnings: arch/um/kernel/um_arch.c:461:7: warning: no previous prototype for ‘text_poke’ [-Wmissing-prototypes] arch/um/kernel/um_arch.c:473:6: warning: no previous prototype for ‘text_poke_sync’ [-Wmissing-prototypes] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-10x86/alternative: Rename apply_ibt_endbr()Peter Zijlstra
The current name doesn't reflect what it does very well. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lkml.kernel.org/r/20230622144321.427441595%40infradead.org
2023-06-16um/cpu: Switch to arch_cpu_finalize_init()Thomas Gleixner
check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230613224545.493148694@linutronix.de
2023-04-20um: make stub data pages size tweakableJohannes Berg
There's a lot of code here that hard-codes that the data is a single page, and right now that seems to be sufficient, but to make it easier to change this in the future, add a new STUB_DATA_PAGES constant and use it throughout the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2023-02-10um: Make the definition of cpu_data more compatiblePeter Foley
Match the x86 implementation to improve build errors. Noticed when building allyesconfig. e.g. ../arch/um/include/asm/processor-generic.h:94:19: error: called object is not a function or function pointer 94 | #define cpu_data (&boot_cpu_data) | ~^~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:2157:16: note: in expansion of macro ‘cpu_data’ 2157 | return cpu_data(first_cpu_of_numa_node).apicid; Signed-off-by: Peter Foley <pefoley2@pefoley.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-11-01x86/ibt: Implement FineIBTPeter Zijlstra
Implement an alternative CFI scheme that merges both the fine-grained nature of kCFI but also takes full advantage of the coarse grained hardware CFI as provided by IBT. To contrast: kCFI is a pure software CFI scheme and relies on being able to read text -- specifically the instruction *before* the target symbol, and does the hash validation *before* doing the call (otherwise control flow is compromised already). FineIBT is a software and hardware hybrid scheme; by ensuring every branch target starts with a hash validation it is possible to place the hash validation after the branch. This has several advantages: o the (hash) load is avoided; no memop; no RX requirement. o IBT WAIT-FOR-ENDBR state is a speculation stop; by placing the hash validation in the immediate instruction after the branch target there is a minimal speculation window and the whole is a viable defence against SpectreBHB. o Kees feels obliged to mention it is slightly more vulnerable when the attacker can write code. Obviously this patch relies on kCFI, but additionally it also relies on the padding from the call-depth-tracking patches. It uses this padding to place the hash-validation while the call-sites are re-written to modify the indirect target to be 16 bytes in front of the original target, thus hitting this new preamble. Notably, there is no hardware that needs call-depth-tracking (Skylake) and supports IBT (Tigerlake and onwards). Suggested-by: Joao Moreira (Intel) <joao@overdrivepizza.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221027092842.634714496@infradead.org
2022-10-14Merge tag 'for-linus-6.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Move to strscpy() - Improve panic notifiers - Fix NR_CPUS usage - Fixes for various comments - Fixes for virtio driver * tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: uml: Remove the initialization of statics to 0 um: Do not initialise statics to 0. um: Fix comment typo um: Improve panic notifiers consistency and ordering um: remove unused reactivate_chan() declaration um: mmaper: add __exit annotations to module exit funcs um: virt-pci: add __init/__exit annotations to module init/exit funcs hostfs: move from strlcpy with unused retval to strscpy um: move from strlcpy with unused retval to strscpy um: increase default virtual physical memory to 64 MiB UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK um: read multiple msg from virtio slave request fd
2022-09-19um: Improve panic notifiers consistency and orderingGuilherme G. Piccoli
Currently the panic notifiers from user mode linux don't follow the convention for most of the other notifiers present in the kernel (indentation, priority setting, numeric return). More important, the priorities could be improved, since it's a special case (userspace), hence we could run the notifiers earlier; user mode linux shouldn't care much with other panic notifiers but the ordering among the mconsole and arch notifier is important, given that the arch one effectively triggers a core dump. Fix that by running the mconsole notifier as the first panic notifier, followed by the architecture one (that coredumps). Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> V3: - No changes. V2: - Kept the notifier header to avoid implicit usage - thanks Johannes for the suggestion! Signed-off-by: Richard Weinberger <richard@nod.at>
2022-09-19um: move from strlcpy with unused retval to strscpyWolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-09-19um: increase default virtual physical memory to 64 MiBChristian Lamparter
The current 32 MiB of RAM causes OOMs to appear shortly after booting in a minimal OpenWrt 22.03 configuration with a 5.10.134 kernel. Of course, passing a "mem=64M" (from the --help text) parameter works too, but it produces the following (info) message: | [ 0.000000] Unknown kernel command line parameters "mem=64M", will be passed to user space. That's why, I think it would be nicer, if this is working out of the box again :). Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-09-19um: fix default console kernel parameterChristian Lamparter
OpenWrt's UML with 5.15 was producing odd errors/warnings during preinit part of the early userspace portion: |[ 0.000000] Kernel command line: ubd0=root.img root=98:0 console=tty |[...] |[ 0.440000] random: jshn: uninitialized urandom read (4 bytes read) |[ 0.460000] random: jshn: uninitialized urandom read (4 bytes read) |/etc/preinit: line 47: can't create /dev/tty: No such device or address |/etc/preinit: line 48: can't create /dev/tty: No such device or address |/etc/preinit: line 58: can't open /dev/tty: No such device or address |[...] repeated many times That "/dev/tty" came from the command line (which is automatically added if no console= parameter was specified for the uml binary). The TLDP project tells the following about the /dev/tty: <https://tldp.org/HOWTO/Text-Terminal-HOWTO-7.html#ss7.3> | /dev/tty stands for the controlling terminal (if any) for the current | process.[...] | /dev/tty is something like a link to the actually terminal device[..] The "(if any)" is important here, since it's possible for processes to not have a controlling terminal. I think this was a simple typo and the author wanted tty0 there. CC: Thomas Meyer <thomas@m3y3r.de> Fixes: d7ffac33631b ("um: stdio_console: Make preferred console") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-09-19UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACKHuacai Chen
When CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS is selected, cpu_max_bits_warn() generates a runtime warning similar as below while we show /proc/cpuinfo. Fix this by using nr_cpu_ids (the runtime limit) instead of NR_CPUS to iterate CPUs. [ 3.052463] ------------[ cut here ]------------ [ 3.059679] WARNING: CPU: 3 PID: 1 at include/linux/cpumask.h:108 show_cpuinfo+0x5e8/0x5f0 [ 3.070072] Modules linked in: efivarfs autofs4 [ 3.076257] CPU: 0 PID: 1 Comm: systemd Not tainted 5.19-rc5+ #1052 [ 3.099465] Stack : 9000000100157b08 9000000000f18530 9000000000cf846c 9000000100154000 [ 3.109127] 9000000100157a50 0000000000000000 9000000100157a58 9000000000ef7430 [ 3.118774] 90000001001578e8 0000000000000040 0000000000000020 ffffffffffffffff [ 3.128412] 0000000000aaaaaa 1ab25f00eec96a37 900000010021de80 900000000101c890 [ 3.138056] 0000000000000000 0000000000000000 0000000000000000 0000000000aaaaaa [ 3.147711] ffff8000339dc220 0000000000000001 0000000006ab4000 0000000000000000 [ 3.157364] 900000000101c998 0000000000000004 9000000000ef7430 0000000000000000 [ 3.167012] 0000000000000009 000000000000006c 0000000000000000 0000000000000000 [ 3.176641] 9000000000d3de08 9000000001639390 90000000002086d8 00007ffff0080286 [ 3.186260] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c [ 3.195868] ... [ 3.199917] Call Trace: [ 3.203941] [<90000000002086d8>] show_stack+0x38/0x14c [ 3.210666] [<9000000000cf846c>] dump_stack_lvl+0x60/0x88 [ 3.217625] [<900000000023d268>] __warn+0xd0/0x100 [ 3.223958] [<9000000000cf3c90>] warn_slowpath_fmt+0x7c/0xcc [ 3.231150] [<9000000000210220>] show_cpuinfo+0x5e8/0x5f0 [ 3.238080] [<90000000004f578c>] seq_read_iter+0x354/0x4b4 [ 3.245098] [<90000000004c2e90>] new_sync_read+0x17c/0x1c4 [ 3.252114] [<90000000004c5174>] vfs_read+0x138/0x1d0 [ 3.258694] [<90000000004c55f8>] ksys_read+0x70/0x100 [ 3.265265] [<9000000000cfde9c>] do_syscall+0x7c/0x94 [ 3.271820] [<9000000000202fe4>] handle_syscall+0xc4/0x160 [ 3.281824] ---[ end trace 8b484262b4b8c24c ]--- Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-08-02Merge tag 'random-6.0-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Though there's been a decent amount of RNG-related development during this last cycle, not all of it is coming through this tree, as this cycle saw a shift toward tackling early boot time seeding issues, which took place in other trees as well. Here's a summary of the various patches: - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot option have been removed, as they overlapped with the more widely supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". This change allowed simplifying a bit of arch code. - x86's RDRAND boot time test has been made a bit more robust, with RDRAND disabled if it's clearly producing bogus results. This would be a tip.git commit, technically, but I took it through random.git to avoid a large merge conflict. - The RNG has long since mixed in a timestamp very early in boot, on the premise that a computer that does the same things, but does so starting at different points in wall time, could be made to still produce a different RNG state. Unfortunately, the clock isn't set early in boot on all systems, so now we mix in that timestamp when the time is actually set. - User Mode Linux now uses the host OS's getrandom() syscall to generate a bootloader RNG seed and later on treats getrandom() as the platform's RDRAND-like faculty. - The arch_get_random_{seed_,}_long() family of functions is now arch_get_random_{seed_,}_longs(), which enables certain platforms, such as s390, to exploit considerable performance advantages from requesting multiple CPU random numbers at once, while at the same time compiling down to the same code as before on platforms like x86. - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from Uros. - A comment spelling fix" More info about other random number changes that come in through various architecture trees in the full commentary in the pull request: https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/ * tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: correct spelling of "overwrites" random: handle archrandom with multiple longs um: seed rng using host OS rng random: use try_cmpxchg in _credit_init_bits timekeeping: contribute wall clock to rng on time change x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" random: remove CONFIG_ARCH_RANDOM
2022-07-18um: seed rng using host OS rngJason A. Donenfeld
UML generally does not provide access to special CPU instructions like RDRAND, and execution tends to be rather deterministic, with no real hardware interrupts, making good randomness really very hard, if not all together impossible. Not only is this a security eyebrow raiser, but it's also quite annoying when trying to do various pieces of UML-based automation that takes a long time to boot, if ever. Fix this by trivially calling getrandom() in the host and using that seed as "bootloader randomness", which initializes the rng immediately at UML boot. The old behavior can be restored the same way as on any other arch, by way of CONFIG_TRUST_BOOTLOADER_RANDOMNESS=n or random.trust_bootloader=0. So seen from that perspective, this just makes UML act like other archs, which is positive in its own right. Additionally, wire up arch_get_random_{int,long}() in the same way, so that reseeds can also make use of the host RNG, controllable by CONFIG_TRUST_CPU_RANDOMNESS and random.trust_cpu, per usual. Cc: stable@vger.kernel.org Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-14um: Add missing apply_returns()Peter Zijlstra
Implement apply_returns() stub for UM, just like all the other patching routines. Fixes: 15e67227c49a ("x86: Undo return-thunk damage") Reported-by: Randy Dunlap <rdunlap@infradead.org) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/Ys%2Ft45l%2FgarIrD0u@worktop.programming.kicks-ass.net
2022-03-15x86/alternative: Use .ibt_endbr_seal to seal indirect callsPeter Zijlstra
Objtool's --ibt option generates .ibt_endbr_seal which lists superfluous ENDBR instructions. That is those instructions for which the function is never indirectly called. Overwrite these ENDBR instructions with a NOP4 such that these function can never be indirect called, reducing the number of viable ENDBR targets in the kernel. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
2021-12-22um: Add devicetree supportVincent Whitchurch
Add a dtb=<filename> option to boot UML with a devicetree blob. This can be used for testing driver code using UML. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> [rw: Add dependency on CONFIG_OF] Signed-off-by: Richard Weinberger <richard@nod.at>
2021-10-28x86/alternative: Implement .retpoline_sites supportPeter Zijlstra
Rewrite retpoline thunk call sites to be indirect calls for spectre_v2=off. This ensures spectre_v2=off is as near to a RETPOLINE=n build as possible. This is the replacement for objtool writing alternative entries to ensure the same and achieves feature-parity with the previous approach. One noteworthy feature is that it relies on the thunks to be in machine order to compute the register index. Specifically, this does not yet address the Jcc __x86_indirect_thunk_* calls generated by clang, a future patch will add this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.232495794@infradead.org
2021-07-09Merge tag 'for-linus-5.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - Support for optimized routines based on the host CPU - Support for PCI via virtio - Various fixes * tag 'for-linus-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: remove unneeded semicolon in um_arch.c um: Remove the repeated declaration um: fix error return code in winch_tramp() um: fix error return code in slip_open() um: Fix stack pointer alignment um: implement flush_cache_vmap/flush_cache_vunmap um: add a UML specific futex implementation um: enable the use of optimized xor routines in UML um: Add support for host CPU flags and alignment um: allow not setting extra rpaths in the linux binary um: virtio/pci: enable suspend/resume um: add PCI over virtio emulation driver um: irqs: allow invoking time-travel handler multiple times um: time-travel/signals: fix ndelay() in interrupt um: expose time-travel mode to userspace side um: export signals_enabled directly um: remove unused smp_sigio_handler() declaration lib: add iomem emulation (logic_iomem) um: allow disabling NO_IOMEM
2021-07-01kernel.h: split out panic and oops helpersAndy Shevchenko
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sebastian Reichel <sre@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-19um: remove unneeded semicolon in um_arch.cWan Jiabing
Fix following coccicheck warning: ./arch/um/kernel/um_arch.c:284:34-35: Unneeded semicolon Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2021-06-17um: Add support for host CPU flags and alignmentAnton Ivanov
1. Reflect host cpu flags into the UML instance so they can be used to select the correct implementations for xor, crypto, etc. 2. Reflect host cache alignment into UML instance. This is important when running 32 bit on a 64 bit host as 32 bit by default aligns to 32 while the actual alignment should be 64. Ditto for some Xeons which align at 128. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2021-02-12um: remove process stub VMAJohannes Berg
This mostly reverts the old commit 3963333fe676 ("uml: cover stubs with a VMA") which had added a VMA to the existing PTEs. However, there's no real reason to have the PTEs in the first place and the VMA cannot be 'fixed' in place, which leads to bugs that userspace could try to unmap them and be forcefully killed, or such. Also, there's a bit of an ugly hole in userspace's address space. Simplify all this: just install the stub code/page at the top of the (inner) address space, i.e. put it just above TASK_SIZE. The pages are simply hard-coded to be mapped in the userspace process we use to implement an mm context, and they're out of reach of the inner mmap/munmap/mprotect etc. since they're above TASK_SIZE. Getting rid of the VMA also makes vma_merge() no longer hit one of the VM_WARN_ON()s there because we installed a VMA while the code assumes the stack VMA is the first one. It also removes a lockdep warning about mmap_sem usage since we no longer have uml_setup_stubs() and thus no longer need to do any manipulation that would require mmap_sem in activate_mm(). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2021-01-26um: stdio_console: Make preferred consoleThomas Meyer
The addition of the "ttynull" console driver did break the ordering of the UML stdio console driver. The UML stdio console driver is added in late_initcall (7), whereby the ttynull driver is added in device_initcall (6), which always does make the ttynull driver the default console. Fix it by explicitly adding the UML stdio console as the preferred console, in case no 'console=' command line option was specified. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-12-14um: Fix build w/o CONFIG_PM_SLEEPJohannes Berg
uml_pm_wake() is unconditionally called from the SIGUSR1 wakeup handler since that's in the userspace portion of UML, and thus a bit tricky to ifdef out. Since pm_system_wakeup() can always be called (but may be an empty inline), also simply always have uml_pm_wake() to fix the build. Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-12-13um: Support suspend to RAMJohannes Berg
With all the previous bits in place, we can now also support suspend to RAM, in the sense that everything is suspended, not just most, including userspace, processes like in s2idle. Since um_idle_sleep() now waits forever, we can simply call that to "suspend" the system. As before, you can wake it up using SIGUSR1 since we're just in a pause() call that only needs to return. In order to implement selective resume from certain devices, and not have any arbitrary device interrupt wake up, suspend interrupts by removing SIGIO notification (O_ASYNC) from all the FDs that are not supposed to wake up the system. However, swap out the handler so we don't actually handle the SIGIO as an interrupt. Since we're in pause(), the mere act of receiving SIGIO wakes us up, and then after things have been restored enough, re-set O_ASYNC for all previously suspended FDs, reinstall the proper SIGIO handler, and send SIGIO to self to process anything that might now be pending. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>