summaryrefslogtreecommitdiff
path: root/arch/s390/Kconfig
AgeCommit message (Collapse)Author
2023-04-18mm/hugetlb_vmemmap: rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAPAneesh Kumar K.V
Now we use ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP config option to indicate devdax and hugetlb vmemmap optimization support. Hence rename that to a generic ARCH_WANT_OPTIMIZE_VMEMMAP Link: https://lkml.kernel.org/r/20230412050025.84346-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Tarun Sahu <tsahu@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-13s390/checksum: always use cksm instructionHeiko Carstens
Commit dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for KASAN") switched s390 to use the generic checksum functions, so that KASAN instrumentation also works checksum functions by avoiding architecture specific inline assemblies. There is however the problem that the generic csum_partial() function returns a 32 bit value with a 16 bit folded checksum, while the original s390 variant does not fold to 16 bit. This in turn causes that the ipib_checksum in lowcore contains different values depending on kernel config options. The ipib_checksum is used by system dumpers to verify if pointers in lowcore point to valid data. Verification is done by comparing checksum values. The system dumpers still use 32 bit checksum values which are not folded, and therefore the checksum verification fails (incorrectly). Symptom is that reboot after dump does not work anymore when a KASAN instrumented kernel is dumped. Fix this by not using the generic checksum implementation. Instead add an explicit kasan_check_read() so that KASAN knows about the read access from within the inline assembly. Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Fixes: dfe843dce775 ("s390/checksum: support GENERIC_CSUM, enable it for KASAN") Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-05s390/mm: try VMA lock-based page fault handling firstHeiko Carstens
Attempt VMA lock-based page fault handling first, and fall back to the existing mmap_lock-based handling if that fails. This is the s390 variant of "x86/mm: try VMA lock-based page fault handling first". Link: https://lkml.kernel.org/r/20230314132808.1266335-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-04s390: enable HAVE_ARCH_STACKLEAKHeiko Carstens
Add support for the stackleak feature. Whenever the kernel returns to user space the kernel stack is filled with a poison value. Enabling this feature is quite expensive: e.g. after instrumenting the getpid() system call function to have a 4kb stack the result is an increased runtime of the system call by a factor of 3. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-27s390: enable ARCH_HAS_MEMBARRIER_SYNC_COREHeiko Carstens
s390 trivially supports the ARCH_HAS_MEMBARRIER_SYNC_CORE requirements since the used lpswe(y) instruction to return from any kernel context to user space performs CPU serialization. This is very similar to arm, arm64 and powerpc. See commit 70216e18e519 ("membarrier: Provide core serializing command, *_SYNC_CORE") for further details. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-20s390: make use of CONFIG_FUNCTION_ALIGNMENTHeiko Carstens
Make use of CONFIG_FUNCTION_ALIGNMENT which was introduced with commit d49a0626216b ("arch: Introduce CONFIG_FUNCTION_ALIGNMENT"). Select FUNCTION_ALIGNMENT_8B for gcc in order to reflect gcc's default function alignment. For all other compilers, which is only clang, select a function alignment of 16 bytes which reflects the default function alignment for clang. Also change the __ALIGN define to follow whatever the value of CONFIG_FUNCTION_ALIGNMENT is. This makes sure that the alignment of C and assembler functions is the same. In result everything still uses the default function alignment for both compilers. However in addition this is now also true for all assembly functions, so that all functions have a consistent alignment. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-03Merge tag 's390-6.3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Add empty command line parameter handling stubs to kernel for all command line parameters which are handled in the decompressor. This avoids invalid "Unknown kernel command line parameters" messages from the kernel, and also avoids that these will be incorrectly passed to user space. This caused already confusion, therefore add the empty stubs - Add missing phys_to_virt() handling to machine check handler - Introduce and use a union to be used for zcrypt inline assemblies. This makes sure that only a register wide member of the union is passed as input and output parameter to inline assemblies, while usual C code uses other members of the union to access bit fields of it - Add and use a READ_ONCE_ALIGNED_128() macro, which can be used to atomically read a 128-bit value from memory. This replaces the (mis-)use of the 128-bit cmpxchg operation to do the same in cpum_sf code. Currently gcc does not generate the used lpq instruction if __READ_ONCE() is used for aligned 128-bit accesses, therefore use this s390 specific helper - Simplify machine check handler code if a task needs to be killed because of e.g. register corruption due to a machine malfunction - Perform CPU reset to clear pending interrupts and TLB entries on an already stopped target CPU before delegating work to it - Generate arch/s390/boot/vmlinux.map link map for the decompressor, when CONFIG_VMLINUX_MAP is enabled for debugging purposes - Fix segment type handling for dcssblk devices. It incorrectly always returned type "READ/WRITE" even for read-only segements, which can result in a kernel panic if somebody tries to write to a read-only device - Sort config S390 select list again - Fix two kprobe reenter bugs revealed by a recently added kprobe kunit test * tag 's390-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/kprobes: fix current_kprobe never cleared after kprobes reenter s390/kprobes: fix irq mask clobbering on kprobe reenter from post_handler s390/Kconfig: sort config S390 select list again s390/extmem: return correct segment type in __segment_load() s390/decompressor: add link map saving s390/smp: perform cpu reset before delegating work to target cpu s390/mcck: cleanup user process termination path s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchg s390/rwonce: add READ_ONCE_ALIGNED_128() macro s390/ap,zcrypt,vfio: introduce and use ap_queue_status_reg union s390/nmi: fix virtual-physical address confusion s390/setup: do not complain about parameters handled in decompressor
2023-03-01s390/Kconfig: sort config S390 select list againHeiko Carstens
Keep the config S390 select list sorted. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-25Merge tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Remove redundant resource check in vfio-platform (Angus Chen) - Use GFP_KERNEL_ACCOUNT for persistent userspace allocations, allowing removal of arbitrary kernel limits in favor of cgroup control (Yishai Hadas) - mdev tidy-ups, including removing the module-only build restriction for sample drivers, Kconfig changes to select mdev support, documentation movement to keep sample driver usage instructions with sample drivers rather than with API docs, remove references to out-of-tree drivers in docs (Christoph Hellwig) - Fix collateral breakages from mdev Kconfig changes (Arnd Bergmann) - Make mlx5 migration support match device support, improve source and target flows to improve pre-copy support and reduce downtime (Yishai Hadas) - Convert additional mdev sysfs case to use sysfs_emit() (Bo Liu) - Resolve copy-paste error in mdev mbochs sample driver Kconfig (Ye Xingchen) - Avoid propagating missing reset error in vfio-platform if reset requirement is relaxed by module option (Tomasz Duszynski) - Range size fixes in mlx5 variant driver for missed last byte and stricter range calculation (Yishai Hadas) - Fixes to suspended vaddr support and locked_vm accounting, excluding mdev configurations from the former due to potential to indefinitely block kernel threads, fix underflow and restore locked_vm on new mm (Steve Sistare) - Update outdated vfio documentation due to new IOMMUFD interfaces in recent kernels (Yi Liu) - Resolve deadlock between group_lock and kvm_lock, finally (Matthew Rosato) - Fix NULL pointer in group initialization error path with IOMMUFD (Yan Zhao) * tag 'vfio-v6.3-rc1' of https://github.com/awilliam/linux-vfio: (32 commits) vfio: Fix NULL pointer dereference caused by uninitialized group->iommufd docs: vfio: Update vfio.rst per latest interfaces vfio: Update the kdoc for vfio_device_ops vfio/mlx5: Fix range size calculation upon tracker creation vfio: no need to pass kvm pointer during device open vfio: fix deadlock between group lock and kvm lock vfio: revert "iommu driver notify callback" vfio/type1: revert "implement notify callback" vfio/type1: revert "block on invalid vaddr" vfio/type1: restore locked_vm vfio/type1: track locked_vm per dma vfio/type1: prevent underflow of locked_vm via exec() vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR vfio: platform: ignore missing reset if disabled at module init vfio/mlx5: Improve the target side flow to reduce downtime vfio/mlx5: Improve the source side flow upon pre_copy vfio/mlx5: Check whether VF is migratable samples: fix the prompt about SAMPLE_VFIO_MDEV_MBOCHS vfio/mdev: Use sysfs_emit() to instead of sprintf() vfio-mdev: add back CONFIG_VFIO dependency ...
2023-01-30vfio-mdev: add back CONFIG_VFIO dependencyArnd Bergmann
CONFIG_VFIO_MDEV cannot be selected when VFIO itself is disabled, otherwise we get a link failure: WARNING: unmet direct dependencies detected for VFIO_MDEV Depends on [n]: VFIO [=n] Selected by [y]: - SAMPLE_VFIO_MDEV_MTTY [=y] && SAMPLES [=y] - SAMPLE_VFIO_MDEV_MDPY [=y] && SAMPLES [=y] - SAMPLE_VFIO_MDEV_MBOCHS [=y] && SAMPLES [=y] /home/arnd/cross/arm64/gcc-13.0.1-nolibc/x86_64-linux/bin/x86_64-linux-ld: samples/vfio-mdev/mdpy.o: in function `mdpy_remove': mdpy.c:(.text+0x1e1): undefined reference to `vfio_unregister_group_dev' /home/arnd/cross/arm64/gcc-13.0.1-nolibc/x86_64-linux/bin/x86_64-linux-ld: samples/vfio-mdev/mdpy.o: in function `mdpy_probe': mdpy.c:(.text+0x149e): undefined reference to `_vfio_alloc_device' Fixes: 8bf8c5ee1f38 ("vfio-mdev: turn VFIO_MDEV into a selectable symbol") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230126211211.1762319-1-arnd@kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-01-23vfio-mdev: turn VFIO_MDEV into a selectable symbolChristoph Hellwig
VFIO_MDEV is just a library with helpers for the drivers. Stop making it a user choice and just select it by the drivers that use the helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/r/20230110091009.474427-3-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-01-22s390/kprobes: replace kretprobe with rethookVasily Gorbik
That's an adaptation of commit f3a112c0c40d ("x86,rethook,kprobes: Replace kretprobe with rethook on x86") to s390. Replaces the kretprobe code with rethook on s390. With this patch, kretprobe on s390 uses the rethook instead of kretprobe specific trampoline code. Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-12-12Merge tag 's390-6.2-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Factor out handle_write() function and simplify 3215 console write operation - When 3170 terminal emulator is connected to the 3215 console driver the boot time could be very long due to limited buffer space or missing operator input. Add con3215_drop command line parameter and con3215_drop sysfs attribute file to instruct the kernel drop console data when such conditions are met - Fix white space errors in 3215 console driver - Move enum paiext_mode definition to a header file and rename it to paievt_mode to indicate this is now used for several events. Rename PAI_MODE_COUNTER to PAI_MODE_COUNTING to make consistent with PAI_MODE_SAMPLING - Simplify the logic of PMU pai_crypto mapped buffer reference counter and make it consistent with PMU pai_ext - Rename PMU pai_crypto mapped buffer structure member users to active_events to make it consistent with PMU pai_ext - Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP configuration option. This results in saving of 12K per 1M hugetlb page (~1.2%) and 32764K per 2G hugetlb page (~1.6%) - Use generic serial.h, bugs.h, shmparam.h and vga.h header files and scrap s390-specific versions - The generic percpu setup code does not expect the s390-like implementation and emits a warning. To get rid of that warning and provide sane CPU-to-node and CPU-to-CPU distance mappings implementat a minimal version of setup_per_cpu_areas() - Use kstrtobool() instead of strtobool() for re-IPL sysfs device attributes - Avoid unnecessary lookup of a pointer to MSI descriptor when setting IRQ affinity for a PCI device - Get rid of "an incompatible function type cast" warning by changing debug_sprintf_format_fn() function prototype so it matches the debug_format_proc_t function type - Remove unused info_blk_hdr__pcpus() and get_page_state() functions - Get rid of clang "unused unused insn cache ops function" warning by moving s390_insn definition to a private header - Get rid of clang "unused function" warning by making function raw3270_state_final() only available if CONFIG_TN3270_CONSOLE is enabled - Use kstrobool() to parse sclp_con_drop parameter to make it identical to the con3215_drop parameter and allow passing values like "yes" and "true" - Use sysfs_emit() for all SCLP sysfs show functions, which is the current standard way to generate output strings - Make SCLP con_drop sysfs attribute also writable and allow to change its value during runtime. This makes SCLP console drop handling consistent with the 3215 device driver - Virtual and physical addresses are indentical on s390. However, there is still a confusion when pointers are directly casted to physical addresses or vice versa. Use correct address converters virt_to_phys() and phys_to_virt() for s390 channel IO drivers - Support for power managemant has been removed from s390 since quite some time. Remove unused power managemant code from the appldata device driver - Allow memory tools like KASAN see memory accesses from the checksum code. Switch to GENERIC_CSUM if KASAN is enabled, just like x86 does - Add support of ECKD DASDs disks so it could be used as boot and dump devices - Follow checkpatch recommendations and use octal values instead of S_IRUGO and S_IWUSR for dump device attributes in sysfs - Changes to vx-insn.h do not cause a recompile of C files that use asm(".include \"asm/vx-insn.h\"\n") magic to access vector instruction macros from inline assemblies. Add wrapper include header file to avoid this problem - Use vector instruction macros instead of byte patterns to increase register validation routine readability - The current machine check register validation handling does not take into account various scenarios and might lead to killing a wrong user process or potentially ignore corrupted FPU registers. Simplify logic of the machine check handler and stop the whole machine if the previous context was kerenel mode. If the previous context was user mode, kill the current task - Introduce sclp_emergency_printk() function which can be used to emit a message in emergency cases. It is supposed to be used in cases where regular console device drivers may not work anymore, e.g. unrecoverable machine checks Keep the early Service-Call Control Block so it can also be used after initdata has been freed to allow sclp_emergency_printk() implementation - In case a system will be stopped because of an unrecoverable machine check error print the machine check interruption code to give a hint of what went wrong - Move storage error checking from the assembly entry code to C in order to simplify machine check handling. Enter the handler with DAT turned on, which simplifies the entry code even more - The machine check extended save areas are allocated using a private "nmi_save_areas" slab cache which guarantees a required power-of-two alignment. Get rid of that cache in favour of kmalloc() * tag 's390-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (38 commits) s390/nmi: get rid of private slab cache s390/nmi: move storage error checking back to C, enter with DAT on s390/nmi: print machine check interruption code before stopping system s390/sclp: introduce sclp_emergency_printk() s390/sclp: keep sclp_early_sccb s390/nmi: rework register validation handling s390/nmi: use vector instruction macros instead of byte patterns s390/vx: add vx-insn.h wrapper include file s390/ipl: use octal values instead of S_* macros s390/ipl: add eckd dump support s390/ipl: add eckd support vfio/ccw: identify CCW data addresses as physical vfio/ccw: sort out physical vs virtual pointers usage s390/checksum: support GENERIC_CSUM, enable it for KASAN s390/appldata: remove power management callbacks s390/cio: sort out physical vs virtual pointers usage s390/sclp: allow to change sclp_console_drop during runtime s390/sclp: convert to use sysfs_emit() s390/sclp: use kstrobool() to parse sclp_con_drop parameter s390/3270: make raw3270_state_final() depend on CONFIG_TN3270_CONSOLE ...
2022-12-12Merge tag 'rcu.2022.12.02a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates. This is the second in a series from an ongoing review of the RCU documentation. - Miscellaneous fixes. - Introduce a default-off Kconfig option that depends on RCU_NOCB_CPU that, on CPUs mentioned in the nohz_full or rcu_nocbs boot-argument CPU lists, causes call_rcu() to introduce delays. These delays result in significant power savings on nearly idle Android and ChromeOS systems. These savings range from a few percent to more than ten percent. This series also includes several commits that change call_rcu() to a new call_rcu_hurry() function that avoids these delays in a few cases, for example, where timely wakeups are required. Several of these are outside of RCU and thus have acks and reviews from the relevant maintainers. - Create an srcu_read_lock_nmisafe() and an srcu_read_unlock_nmisafe() for architectures that support NMIs, but which do not provide NMI-safe this_cpu_inc(). These NMI-safe SRCU functions are required by the upcoming lockless printk() work by John Ogness et al. - Changes providing minor but important increases in torture test coverage for the new RCU polled-grace-period APIs. - Changes to torturescript that avoid redundant kernel builds, thus providing about a 30% speedup for the torture.sh acceptance test. * tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (49 commits) net: devinet: Reduce refcount before grace period net: Use call_rcu_hurry() for dst_release() workqueue: Make queue_rcu_work() use call_rcu_hurry() percpu-refcount: Use call_rcu_hurry() for atomic switch scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu() rcu/rcutorture: Use call_rcu_hurry() where needed rcu/rcuscale: Use call_rcu_hurry() for async reader test rcu/sync: Use call_rcu_hurry() instead of call_rcu rcuscale: Add laziness and kfree tests rcu: Shrinker for lazy rcu rcu: Refactor code a bit in rcu_nocb_do_flush_bypass() rcu: Make call_rcu() lazy to save power rcu: Implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC srcu: Debug NMI safety even on archs that don't require it srcu: Explain the reason behind the read side critical section on GP start srcu: Warn when NMI-unsafe API is used in NMI arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option arch/loongarch: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state() rcu-tasks: Make grace-period-age message human-readable ...
2022-12-02s390/checksum: support GENERIC_CSUM, enable it for KASANHeiko Carstens
This is the s390 variant of commit d911c67e10b4 ("x86: kasan: kmsan: support CONFIG_GENERIC_CSUM on x86, enable it for KASAN/KMSAN"). Even though most of the s390 specific checksum code is written in C there is still the csum_partial() inline assembly which could prevent KASAN and KMSAN from seeing all memory accesses. Therefore switch to GENERIC_CSUM if KASAN is enabled just like x86. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-23s390/mm: provide minimal setup_per_cpu_areas() implementationHeiko Carstens
s390 allows to enable CONFIG_NUMA, mainly to enable a couple of system calls which are only present if NUMA is enabled. The NUMA specific system calls are required by a couple of applications, which wouldn't work if the system calls wouldn't be present. The NUMA implementation itself maps all CPUs and memory to node 0. A special case is the generic percpu setup code, which doesn't expect an s390 like implementation and therefore emits a message/warning: "percpu: cpu 0 has no node -1 or node-local memory". In order to get rid of this message, and also to provide sane CPU to node and CPU distance mappings implement a minimal setup_per_cpu_areas() function, which is very close to the generic variant. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-10s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAPGerald Schaefer
Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390. With this, vmemmap pages used to back struct pages for compound tail pages of hugetlb pages are freed and remapped to compound head page frame as RO, see also Documentation/vm/vmemmap_dedup.rst. For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages, saving 12K of memory for each 1M hugetlb page (~1.2%). /sys/kernel/debug/kernel_page_tables will show the impact: ---[ vmemmap Area Start ]--- [...] 0x0000037202d84000-0x0000037202d85000 4K PTE RW NX 0x0000037202d85000-0x0000037202d88000 12K PTE RO NX For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap pages, saving 32764K of memory for each 2G hugetlb page (~1.6%) /sys/kernel/debug/kernel_page_tables will show the impact: ---[ vmemmap Area Start ]--- [...] 0x000003720a000000-0x000003720a001000 4K PTE RW NX 0x000003720a001000-0x000003720c000000 32764K PTE RO NX The memory savings come with some costs: - vmemmap mapping for compound hugetlb pages is not a PMD mapping any more, but split to 4K PTE mappings, and it will not be coalesced back to PMD mapping after freeing hugetlb pages from the pool. Apart from theoretical performance impact, this will also (slightly) relativize the memory savings because of additional 2K PTE pagetable allocations. - Workload using "on the fly" hugetlb allocations via "nr_overcommit_hugepages" instead of using the hugetlb pool via "nr_hugepages" will suffer from considerably increased fault handling time, see also description from commit 78f39084b41d ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl"). - Freeing hugetlb pages from the pool will require re-allocation of the freed struct pages, and therefore needs some memory available to the kernel. This might fail in memory constrained scenarios. - For the same reason, memory offline might fail even for ZONE_MOVABLE when hugetlb pages are present (but not for s390, since we do not support ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have hugetlb pages in ZONE_MOVABLE). - General increased complexity and overhead in kernel handling of compound (head) pages. Therefore, this feature is disabled by default, and has to be enabled explicitly either by adding "hugetlb_free_vmemmap=on" kernel parameter, or during run-time via "/proc/sys/vm/hugetlb_optimize_vmemmap" sysctl. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-08s390: always build relocatable kernelHeiko Carstens
Nathan Chancellor reported several link errors on s390 with CONFIG_RELOCATABLE disabled, after binutils commit 906f69cf65da ("IBM zSystems: Issue error for *DBL relocs on misaligned symbols"). The binutils commit reveals potential miscompiles that might have happened already before with linker script defined symbols at odd addresses. A similar bug was recently fixed in the kernel with commit c9305b6c1f52 ("s390: fix nospec table alignments"). See https://github.com/ClangBuiltLinux/linux/issues/1747 for an analysis from Ulich Weigand. Therefore always build a relocatable kernel to avoid this problem. There is hardly any use-case for non-relocatable kernels, so this shouldn't be controversial. Link: https://github.com/ClangBuiltLinux/linux/issues/1747 Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reported-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221030182202.2062705-1-hca@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-10-21arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig optionPaul E. McKenney
The s390 architecture uses either a cmpxchg loop (old systems) or the laa add-to-memory instruction (new systems) to implement this_cpu_add(), both of which are NMI safe. This means that the old and more-efficient srcu_read_lock() may be used in NMI context, without the need for srcu_read_lock_nmisafe(). Therefore, add the new Kconfig option ARCH_HAS_NMI_SAFE_THIS_CPU_OPS to arch/s390/Kconfig, which will cause NEED_SRCU_NMI_SAFE to be deselected, thus preserving the current srcu_read_lock() behavior. [ paulmck: Apply Christian Borntraeger feedback. ] Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Suggested-by: Frederic Weisbecker <frederic@kernel.org> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Cc: <linux-s390@vger.kernel.org>
2022-08-02Merge tag 'random-6.0-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Though there's been a decent amount of RNG-related development during this last cycle, not all of it is coming through this tree, as this cycle saw a shift toward tackling early boot time seeding issues, which took place in other trees as well. Here's a summary of the various patches: - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot option have been removed, as they overlapped with the more widely supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". This change allowed simplifying a bit of arch code. - x86's RDRAND boot time test has been made a bit more robust, with RDRAND disabled if it's clearly producing bogus results. This would be a tip.git commit, technically, but I took it through random.git to avoid a large merge conflict. - The RNG has long since mixed in a timestamp very early in boot, on the premise that a computer that does the same things, but does so starting at different points in wall time, could be made to still produce a different RNG state. Unfortunately, the clock isn't set early in boot on all systems, so now we mix in that timestamp when the time is actually set. - User Mode Linux now uses the host OS's getrandom() syscall to generate a bootloader RNG seed and later on treats getrandom() as the platform's RDRAND-like faculty. - The arch_get_random_{seed_,}_long() family of functions is now arch_get_random_{seed_,}_longs(), which enables certain platforms, such as s390, to exploit considerable performance advantages from requesting multiple CPU random numbers at once, while at the same time compiling down to the same code as before on platforms like x86. - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from Uros. - A comment spelling fix" More info about other random number changes that come in through various architecture trees in the full commentary in the pull request: https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/ * tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: correct spelling of "overwrites" random: handle archrandom with multiple longs um: seed rng using host OS rng random: use try_cmpxchg in _credit_init_bits timekeeping: contribute wall clock to rng on time change x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu" random: remove CONFIG_ARCH_RANDOM
2022-07-21mmu_gather: Remove per arch tlb_{start,end}_vma()Peter Zijlstra
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma(). Provide two new MMU_GATHER_knobs to enumerate them and remove the per arch tlb_{start,end}_vma() implementations. - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range() but does *NOT* want to call it for each VMA. - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the invalidate across multiple VMAs if possible. With these it is possible to capture the three forms: 1) empty stubs; select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS 2) start: flush_cache_range(), end: empty; select MMU_GATHER_MERGE_VMAS 3) start: flush_cache_range(), end: flush_tlb_range(); default Obviously, if the architecture does not have flush_cache_range() then it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-18random: remove CONFIG_ARCH_RANDOMJason A. Donenfeld
When RDRAND was introduced, there was much discussion on whether it should be trusted and how the kernel should handle that. Initially, two mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and "nordrand", a boot-time switch. Later the thinking evolved. With a properly designed RNG, using RDRAND values alone won't harm anything, even if the outputs are malicious. Rather, the issue is whether those values are being *trusted* to be good or not. And so a new set of options were introduced as the real ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". With these options, RDRAND is used, but it's not always credited. So in the worst case, it does nothing, and in the best case, maybe it helps. Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the center and became something certain platforms force-select. The old options don't really help with much, and it's a bit odd to have special handling for these instructions when the kernel can deal fine with the existence or untrusted existence or broken existence or non-existence of that CPU capability. Simplify the situation by removing CONFIG_ARCH_RANDOM and using the ordinary asm-generic fallback pattern instead, keeping the two options that are actually used. For now it leaves "nordrand" for now, as the removal of that will take a different route. Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-30s390: remove unneeded 'select BUILD_BIN2C'Masahiro Yamada
Since commit 4c0f032d4963 ("s390/purgatory: Omit use of bin2c"), s390 builds the purgatory without using bin2c. Remove 'select BUILD_BIN2C' to avoid the unneeded build of bin2c. Fixes: 4c0f032d4963 ("s390/purgatory: Omit use of bin2c") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20220613170902.1775211-1-masahiroy@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-06-10Merge tag 'for-linus-5.19a-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - a small cleanup removing "export" of an __init function - a small series adding a new infrastructure for platform flags - a series adding generic virtio support for Xen guests (frontend side) * tag 'for-linus-5.19a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: unexport __init-annotated xen_xlate_map_ballooned_pages() arm/xen: Assign xen-grant DMA ops for xen-grant DMA devices xen/grant-dma-ops: Retrieve the ID of backend's domain for DT devices xen/grant-dma-iommu: Introduce stub IOMMU driver dt-bindings: Add xen,grant-dma IOMMU description for xen-grant DMA ops xen/virtio: Enable restricted memory access using Xen grant mappings xen/grant-dma-ops: Add option to restrict memory access under Xen xen/grants: support allocating consecutive grants arm/xen: Introduce xen_setup_dma_ops() virtio: replace arch_has_restricted_virtio_memory_access() kernel: add platform_has() infrastructure
2022-06-09gcc-12: disable '-Warray-bounds' universally for nowLinus Torvalds
In commit 8b202ee21839 ("s390: disable -Warray-bounds") the s390 people disabled the '-Warray-bounds' warning for gcc-12, because the new logic in gcc would cause warnings for their use of the S390_lowcore macro, which accesses absolute pointers. It turns out gcc-12 has many other issues in this area, so this takes that s390 warning disable logic, and turns it into a kernel build config entry instead. Part of the intent is that we can make this all much more targeted, and use this conflig flag to disable it in only particular configurations that cause problems, with the s390 case as an example: select GCC12_NO_ARRAY_BOUNDS and we could do that for other configuration cases that cause issues. Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more targeted way, and disable the warning only for particular uses: again the s390 case as an example: KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds) but this ends up just doing it globally in the top-level Makefile, since the current issues are spread fairly widely all over: KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds We'll try to limit this later, since the gcc-12 problems are rare enough that *much* of the kernel can be built with it without disabling this warning. Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-06virtio: replace arch_has_restricted_virtio_memory_access()Juergen Gross
Instead of using arch_has_restricted_virtio_memory_access() together with CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS, replace those with platform_has() and a new platform feature PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Borislav Petkov <bp@suse.de>
2022-06-03Merge tag 's390-5.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: "Just a couple of small improvements, bug fixes and cleanups: - Add Eric Farman as maintainer for s390 virtio drivers. - Improve machine check handling, and avoid incorrectly injecting a machine check into a kvm guest. - Add cond_resched() call to gmap page table walker in order to avoid possible huge latencies. Also use non-quiesing sske instruction to speed up storage key handling. - Add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP so s390 behaves similar like common code. - Get sie control block address from correct stack slot in perf event code. This fixes potential random memory accesses. - Change uaccess code so that the exception handler sets the result of get_user() and __get_kernel_nofault() to zero in case of a fault. Until now this was done via input parameters for inline assemblies. Doing it via fault handling is what most or even all other architectures are doing. - Couple of other small cleanups and fixes" * tag 's390-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/stack: add union to reflect kvm stack slot usages s390/stack: merge empty stack frame slots s390/uaccess: whitespace cleanup s390/uaccess: use __noreturn instead of __attribute__((noreturn)) s390/uaccess: use exception handler to zero result on get_user() failure s390/uaccess: use symbolic names for inline assembler operands s390/mcck: isolate SIE instruction when setting CIF_MCCK_GUEST flag s390/mm: use non-quiescing sske for KVM switch to keyed guest s390/gmap: voluntarily schedule during key setting MAINTAINERS: Update s390 virtio-ccw s390/kexec: add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP s390/Kconfig.debug: fix indentation s390/Kconfig: fix indentation s390/perf: obtain sie_block from the right address s390: generate register offsets into pt_regs automatically s390: simplify early program check handler s390/crypto: fix scatterwalk_unmap() callers in AES-GCM
2022-06-01s390/Kconfig: fix indentationJuerg Haefliger
The convention for indentation seems to be a single tab. Help text is further indented by an additional two whitespaces. Fix the lines that violate these rules. Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com> Link: https://lore.kernel.org/r/20220525120140.39534-1-juerg.haefliger@canonical.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-31Merge tag 'riscv-for-linus-5.19-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the Svpbmt extension, which allows memory attributes to be encoded in pages - Support for the Allwinner D1's implementation of page-based memory attributes - Support for running rv32 binaries on rv64 systems, via the compat subsystem - Support for kexec_file() - Support for the new generic ticket-based spinlocks, which allows us to also move to qrwlock. These should have already gone in through the asm-geneic tree as well - A handful of cleanups and fixes, include some larger ones around atomics and XIP * tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits) RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add] riscv: compat: Using seperated vdso_maps for compat_vdso_info RISC-V: Fix the XIP build RISC-V: Split out the XIP fixups into their own file RISC-V: ignore xipImage RISC-V: Avoid empty create_*_mapping definitions riscv: Don't output a bogus mmu-type on a no MMU kernel riscv: atomic: Add custom conditional atomic operation implementation riscv: atomic: Optimize dec_if_positive functions riscv: atomic: Cleanup unnecessary definition RISC-V: Load purgatory in kexec_file RISC-V: Add purgatory RISC-V: Support for kexec_file on panic RISC-V: Add kexec_file support RISC-V: use memcpy for kexec_file mode kexec_file: Fix kexec_file.c build error for riscv platform riscv: compat: Add COMPAT Kbuild skeletal support riscv: compat: ptrace: Add compat_arch_ptrace implement riscv: compat: signal: Add rt_frame implementation riscv: add memory-type errata for T-Head ...
2022-05-26Merge tag 'kbuild-v5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add HOSTPKG_CONFIG env variable to allow users to override pkg-config - Support W=e as a shorthand for KCFLAGS=-Werror - Fix CONFIG_IKHEADERS build to support toybox cpio - Add scripts/dummy-tools/pahole to ease distro packagers' life - Suppress false-positive warnings from checksyscalls.sh for W=2 build - Factor out the common code of arch/*/boot/install.sh into scripts/install.sh - Support 'kernel-install' tool in scripts/prune-kernel - Refactor module-versioning to link the symbol versions at the final link of vmlinux and modules - Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in an arch-agnostic way - Refactor modpost, Makefiles * tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits) genksyms: adjust the output format to modpost kbuild: stop merging *.symversions kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS modpost: extract symbol versions from *.cmd files modpost: add sym_find_with_module() helper modpost: change the license of EXPORT_SYMBOL to bool type modpost: remove left-over cross_compile declaration kbuild: record symbol versions in *.cmd files kbuild: generate a list of objects in vmlinux modpost: move *.mod.c generation to write_mod_c_files() modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header scripts/prune-kernel: Use kernel-install if available kbuild: factor out the common installation code into scripts/install.sh modpost: split new_symbol() to symbol allocation and hash table addition modpost: make sym_add_exported() always allocate a new symbol modpost: make multiple export error modpost: dump Module.symvers in the same order of modules.order modpost: traverse the namespace_list in order modpost: use doubly linked list for dump_lists modpost: traverse unresolved symbols in order ...
2022-05-24kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCSMasahiro Yamada
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_* as a placeholder. Genksyms writes the version CRCs into the linker script, which will be used for filling the __crc_* symbols. The linker script format depends on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset to the reference of CRC. It is time to get rid of this complexity. Now that modpost parses text files (.*.cmd) to collect all the CRCs, it can generate C code that will be linked to the vmlinux or modules. Generate a new C file, .vmlinux.export.c, which contains the CRCs of symbols exported by vmlinux. It is compiled and linked to vmlinux in scripts/link-vmlinux.sh. Put the CRCs of symbols exported by modules into the existing *.mod.c files. No additional build step is needed for modules. As before, *.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal. No linker magic is used here. The new C implementation works in the same way, whether CONFIG_RELOCATABLE is enabled or not. CONFIG_MODULE_REL_CRCS is no longer needed. Previously, Kbuild invoked additional $(LD) to update the CRCs in objects, but this step is unneeded too. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nicolas Schier <nicolas@fjasle.eu> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-04-26arch: Add SYSVIPC_COMPAT for all architecturesGuo Ren
The existing per-arch definitions are pretty much historic cruft. Move SYSVIPC_COMPAT into init/Kconfig. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20220405071314.3225832-5-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-06s390: allow to compile with z16 optimizationsHeiko Carstens
Add config and compile options which allow to compile with z16 optimizations if the compiler supports it. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-03-27s390/unwind: avoid duplicated unwinding entries for kretprobesVasily Gorbik
Currently when unwinding starts from pt_regs or encounters pt_regs along the way unwinder tries to yield 2 unwinding entries: 1. (reliable) ip1: pt_regs->psw.addr, sp1: regs->gprs[15] 2. (non-reliable) ip2: sp1->gprs[8] (r14), sp2: regs->gprs[15] In case of kretprobes those are identical and serves no other purpose than causing confusion over duplicated entries and cause kprobes tests to fail. So, skip a duplicate non-reliable entry in this case. With that kretprobes and unwinder implementation now comply with ARCH_CORRECT_STACKTRACE_ON_KRETPROBE. Reviewed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-25Merge tag 's390-5.18-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Raise minimum supported machine generation to z10, which comes with various cleanups and code simplifications (usercopy/spectre mitigation/etc). - Rework extables and get rid of anonymous out-of-line fixups. - Page table helpers cleanup. Add set_pXd()/set_pte() helper functions. Covert pte_val()/pXd_val() macros to functions. - Optimize kretprobe handling by avoiding extra kprobe on __kretprobe_trampoline. - Add support for CEX8 crypto cards. - Allow to trigger AP bus rescan via writing to /sys/bus/ap/scans. - Add CONFIG_EXPOLINE_EXTERN option to build the kernel without COMDAT group sections which simplifies kpatch support. - Always use the packed stack layout and extend kernel unwinder tests. - Add sanity checks for ftrace code patching. - Add s390dbf debug log for the vfio_ap device driver. - Various virtual vs physical address confusion fixes. - Various small fixes and improvements all over the code. * tag 's390-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (69 commits) s390/test_unwind: add kretprobe tests s390/kprobes: Avoid additional kprobe in kretprobe handling s390: convert ".insn" encoding to instruction names s390: assume stckf is always present s390/nospec: move to single register thunks s390: raise minimum supported machine generation to z10 s390/uaccess: Add copy_from/to_user_key functions s390/nospec: align and size extern thunks s390/nospec: add an option to use thunk-extern s390/nospec: generate single register thunks if possible s390/pci: make zpci_set_irq()/zpci_clear_irq() static s390: remove unused expoline to BC instructions s390/irq: use assignment instead of cast s390/traps: get rid of magic cast for per code s390/traps: get rid of magic cast for program interruption code s390/signal: fix typo in comments s390/asm-offsets: remove unused defines s390/test_unwind: avoid build warning with W=1 s390: remove .fixup section s390/bpf: encode register within extable entry ...
2022-03-23Merge tag 'asm-generic-5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-10s390: raise minimum supported machine generation to z10Vasily Gorbik
Machine generations up to z9 (released in May 2006) have been officially out of service for several years now (z9 end of service - January 31, 2019). No distributions build kernels supporting those old machine generations anymore, except Debian, which seems to pick the oldest supported generation. The team supporting Debian on s390 has been notified about the change. Raising minimum supported machine generation to z10 helps to reduce maintenance cost and effectively remove code, which is not getting enough testing coverage due to lack of older hardware and distributions support. Besides that this unblocks some optimization opportunities and allows to use wider instruction set in asm files for future features implementation. Due to this change spectre mitigation and usercopy implementations could be drastically simplified and many newer instructions could be converted from ".insn" encoding to instruction names. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-10s390/nospec: add an option to use thunk-externVasily Gorbik
Currently with -mindirect-branch=thunk and -mfunction-return=thunk compiler options expoline thunks are put into individual COMDAT group sections. s390 is the only architecture which has group sections and it has implications for kpatch and objtool tools support. Using -mindirect-branch=thunk-extern and -mfunction-return=thunk-extern is an alternative, which comes with a need to generate all required expoline thunks manually. Unfortunately modules area is too far away from the kernel image, and expolines from the kernel image cannon be used. But since all new distributions (except Debian) build kernels for machine generations newer than z10, where "exrl" instruction is available, that leaves only 16 expolines thunks possible. Provide an option to build the kernel with -mindirect-branch=thunk-extern and -mfunction-return=thunk-extern for z10 or newer. This also requires to postlink expoline thunks into all modules explicitly. Currently modules already contain most expolines anyhow. Unfortunately -mindirect-branch=thunk-extern and -mfunction-return=thunk-extern options support is broken in gcc <= 11.2. Additional compile test is required to verify proper gcc support. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Co-developed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01s390: always use the packed stack layoutVasily Gorbik
-mpacked-stack option has been supported by both minimum gcc and clang versions for a while. With commit e2bc3e91d91e ("scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390") minimum clang version now also supports a combination of flags -mpacked-stack -mbackchain -pg -mfentry and fulfills all requirements to always enable the packed stack layout. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-02-25usercopy: Check valid lifetime via stack depthKees Cook
One of the things that CONFIG_HARDENED_USERCOPY sanity-checks is whether an object that is about to be copied to/from userspace is overlapping the stack at all. If it is, it performs a number of inexpensive bounds checks. One of the finer-grained checks is whether an object crosses stack frames within the stack region. Doing this on x86 with CONFIG_FRAME_POINTER was cheap/easy. Doing it with ORC was deemed too heavy, and was left out (a while ago), leaving the courser whole-stack check. The LKDTM tests USERCOPY_STACK_FRAME_TO and USERCOPY_STACK_FRAME_FROM try to exercise these cross-frame cases to validate the defense is working. They have been failing ever since ORC was added (which was expected). While Muhammad was investigating various LKDTM failures[1], he asked me for additional details on them, and I realized that when exact stack frame boundary checking is not available (i.e. everything except x86 with FRAME_POINTER), it could check if a stack object is at least "current depth valid", in the sense that any object within the stack region but not between start-of-stack and current_stack_pointer should be considered unavailable (i.e. its lifetime is from a call no longer present on the stack). Introduce ARCH_HAS_CURRENT_STACK_POINTER to track which architectures have actually implemented the common global register alias. Additionally report usercopy bounds checking failures with an offset from current_stack_pointer, which may assist with diagnosing failures. The LKDTM USERCOPY_STACK_FRAME_TO and USERCOPY_STACK_FRAME_FROM tests (once slightly adjusted in a separate patch) pass again with this fixed. [1] https://github.com/kernelci/kernelci-project/issues/84 Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Kees Cook <keescook@chromium.org> --- v1: https://lore.kernel.org/lkml/20220216201449.2087956-1-keescook@chromium.org v2: https://lore.kernel.org/lkml/20220224060342.1855457-1-keescook@chromium.org v3: https://lore.kernel.org/lkml/20220225173345.3358109-1-keescook@chromium.org v4: - improve commit log (akpm)
2022-02-25uaccess: generalize access_ok()Arnd Bergmann
There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had trick to use 33-bit or 65-bit arithmetic on the addresses to calculate the overflow, however this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-01-24s390/module: test loading modules with a lot of relocationsIlya Leoshkevich
Add a test in order to prevent regressions. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-23Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first_*_bit() instead of find_next_*_bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each_*_bit_from() with for_each_*_bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly
2022-01-15arch: remove GENERIC_FIND_FIRST_BIT entirelyYury Norov
In 5.12 cycle we enabled GENERIC_FIND_FIRST_BIT config option for ARM64 and MIPS. It increased performance and shrunk .text size; and so far I didn't receive any negative feedback on the change. https://lore.kernel.org/linux-arch/20210225135700.1381396-1-yury.norov@gmail.com/ Now I think it's a good time to switch all architectures to use find_{first,last}_bit() unconditionally, and so remove corresponding config option. The patch does't introduce functioal changes for arc, arm, arm64, mips, m68k, s390 and x86, for other architectures I expect improvement both in performance and .text size. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Alexander Lobakin <alobakin@pm.me> (mips) Reviewed-by: Alexander Lobakin <alobakin@pm.me> (mips) Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Will Deacon <will@kernel.org> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2021-12-13Merge tag 'v5.16-rc5' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-11-25futex: Remove futex_cmpxchg detectionArnd Bergmann
Now that all architectures have a working futex implementation in any configuration, remove the runtime detection code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Vineet Gupta <vgupta@kernel.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20211026100432.1730393-2-arnd@kernel.org
2021-11-18ftrace/samples: add s390 support for ftrace direct multi sampleHeiko Carstens
Add s390 architecture support for the ftrace direct multi sample. See commit 5fae941b9a6f ("ftrace/samples: Add multi direct interface test module") for further details. Link: https://lore.kernel.org/r/20211115195614.3173346-3-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/boot: simplify and fix kernel memory layout setupVasily Gorbik
Initial KASAN shadow memory range was picked to preserve original kernel modules area position. With protected execution support, which might impose addressing limitation on vmalloc area and hence affect modules area position, current fixed KASAN shadow memory range is only making kernel memory layout setup more complex. So move it to the very end of available virtual space and simplify calculations. At the same time return to previous kernel address space split. In particular commit 0c4f2623b957 ("s390: setup kernel memory layout early") introduced precise identity map size calculation and keeping vmemmap left most starting from a fresh region table entry. This didn't take into account additional mapping region requirement for potential DCSS mapping above available physical memory. So go back to virtual space split between 1:1 mapping & vmemmap array once vmalloc area size is subtracted. Cc: stable@vger.kernel.org Fixes: 0c4f2623b957 ("s390: setup kernel memory layout early") Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-10-26s390: make command line configurableSven Schnelle
Allow to configure the command line to an arbitrary length, with a default of 4096 bytes. Also remove COMMAND_LINE_SIZE from include/uapi/asm/setup.h as this is dynamic now and doesn't tell anything about the command line size limitations of a new kernel that might be loaded. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-19samples: add s390 support for ftrace direct call samplesHeiko Carstens
Add s390 support for ftrace direct call samples, which also enables ftrace direct call selftests within ftrace selftests. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20211012133802.2460757-5-hca@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>