summaryrefslogtreecommitdiff
path: root/arch/s390/include
AgeCommit message (Collapse)Author
2024-11-23Merge tag 'mm-stable-2024-11-18-19-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - The series "zram: optimal post-processing target selection" from Sergey Senozhatsky improves zram's post-processing selection algorithm. This leads to improved memory savings. - Wei Yang has gone to town on the mapletree code, contributing several series which clean up the implementation: - "refine mas_mab_cp()" - "Reduce the space to be cleared for maple_big_node" - "maple_tree: simplify mas_push_node()" - "Following cleanup after introduce mas_wr_store_type()" - "refine storing null" - The series "selftests/mm: hugetlb_fault_after_madv improvements" from David Hildenbrand fixes this selftest for s390. - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng implements some rationaizations and cleanups in the page mapping code. - The series "mm: optimize shadow entries removal" from Shakeel Butt optimizes the file truncation code by speeding up the handling of shadow entries. - The series "Remove PageKsm()" from Matthew Wilcox completes the migration of this flag over to being a folio-based flag. - The series "Unify hugetlb into arch_get_unmapped_area functions" from Oscar Salvador implements a bunch of consolidations and cleanups in the hugetlb code. - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain takes away the wp-fault time practice of turning a huge zero page into small pages. Instead we replace the whole thing with a THP. More consistent cleaner and potentiall saves a large number of pagefaults. - The series "percpu: Add a test case and fix for clang" from Andy Shevchenko enhances and fixes the kernel's built in percpu test code. - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett optimizes mremap() by avoiding doing things which we didn't need to do. - The series "Improve the tmpfs large folio read performance" from Baolin Wang teaches tmpfs to copy data into userspace at the folio size rather than as individual pages. A 20% speedup was observed. - The series "mm/damon/vaddr: Fix issue in damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON splitting. - The series "memcg-v1: fully deprecate charge moving" from Shakeel Butt removes the long-deprecated memcgv2 charge moving feature. - The series "fix error handling in mmap_region() and refactor" from Lorenzo Stoakes cleanup up some of the mmap() error handling and addresses some potential performance issues. - The series "x86/module: use large ROX pages for text allocations" from Mike Rapoport teaches x86 to use large pages for read-only-execute module text. - The series "page allocation tag compression" from Suren Baghdasaryan is followon maintenance work for the new page allocation profiling feature. - The series "page->index removals in mm" from Matthew Wilcox remove most references to page->index in mm/. A slow march towards shrinking struct page. - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs interface tests" from Andrew Paniakin performs maintenance work for DAMON's self testing code. - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar improves zswap's batching of compression and decompression. It is a step along the way towards using Intel IAA hardware acceleration for this zswap operation. - The series "kasan: migrate the last module test to kunit" from Sabyrzhan Tasbolatov completes the migration of the KASAN built-in tests over to the KUnit framework. - The series "implement lightweight guard pages" from Lorenzo Stoakes permits userapace to place fault-generating guard pages within a single VMA, rather than requiring that multiple VMAs be created for this. Improved efficiencies for userspace memory allocators are expected. - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses tracepoints to provide increased visibility into memcg stats flushing activity. - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky fixes a zram buglet which potentially affected performance. - The series "mm: add more kernel parameters to control mTHP" from Maíra Canal enhances our ability to control/configuremultisize THP from the kernel boot command line. - The series "kasan: few improvements on kunit tests" from Sabyrzhan Tasbolatov has a couple of fixups for the KASAN KUnit tests. - The series "mm/list_lru: Split list_lru lock into per-cgroup scope" from Kairui Song optimizes list_lru memory utilization when lockdep is enabled. * tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits) cma: enforce non-zero pageblock_order during cma_init_reserved_mem() mm/kfence: add a new kunit test test_use_after_free_read_nofault() zram: fix NULL pointer in comp_algorithm_show() memcg/hugetlb: add hugeTLB counters to memcg vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount zram: ZRAM_DEF_COMP should depend on ZRAM MAINTAINERS/MEMORY MANAGEMENT: add document files for mm Docs/mm/damon: recommend academic papers to read and/or cite mm: define general function pXd_init() kmemleak: iommu/iova: fix transient kmemleak false positive mm/list_lru: simplify the list_lru walk callback function mm/list_lru: split the lock to per-cgroup scope mm/list_lru: simplify reparenting and initial allocation mm/list_lru: code clean up for reparenting mm/list_lru: don't export list_lru_add mm/list_lru: don't pass unnecessary key parameters kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols ...
2024-11-22Merge tag 'iommu-updates-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core Updates: - Convert call-sites using iommu_domain_alloc() to more specific versions and remove function - Introduce iommu_paging_domain_alloc_flags() - Extend support for allocating PASID-capable domains to more drivers - Remove iommu_present() - Some smaller improvements New IOMMU driver for RISC-V Intel VT-d Updates: - Add domain_alloc_paging support - Enable user space IOPFs in non-PASID and non-svm cases - Small code refactoring and cleanups - Add domain replacement support for pasid AMD-Vi Updates: - Adapt to iommu_paging_domain_alloc_flags() interface and alloc V2 page-tables by default - Replace custom domain ID allocator with IDA allocator - Add ops->release_domain() support - Other improvements to device attach and domain allocation code paths ARM-SMMU Updates: - SMMUv2: - Return -EPROBE_DEFER for client devices probing before their SMMU - Devicetree binding updates for Qualcomm MMU-500 implementations - SMMUv3: - Minor fixes and cleanup for NVIDIA's virtual command queue driver - IO-PGTable: - Fix indexing of concatenated PGDs and extend selftest coverage - Remove unused block-splitting support S390 IOMMU: - Implement support for blocking domain Mediatek IOMMU: - Enable 35-bit physical address support for mt8186 OMAP IOMMU driver: - Adapt to recent IOMMU core changes and unbreak driver" * tag 'iommu-updates-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (92 commits) iommu/tegra241-cmdqv: Fix alignment failure at max_n_shift iommu: Make set_dev_pasid op support domain replacement iommu/arm-smmu-v3: Make set_dev_pasid() op support replace iommu/vt-d: Add set_dev_pasid callback for nested domain iommu/vt-d: Make identity_domain_set_dev_pasid() to handle domain replacement iommu/vt-d: Make intel_svm_set_dev_pasid() support domain replacement iommu/vt-d: Limit intel_iommu_set_dev_pasid() for paging domain iommu/vt-d: Make intel_iommu_set_dev_pasid() to handle domain replacement iommu/vt-d: Add iommu_domain_did() to get did iommu/vt-d: Consolidate the struct dev_pasid_info add/remove iommu/vt-d: Add pasid replace helpers iommu/vt-d: Refactor the pasid setup helpers iommu/vt-d: Add a helper to flush cache for updating present pasid entry iommu: Pass old domain to set_dev_pasid op iommu/iova: Fix typo 'adderss' iommu: Add a kdoc to iommu_unmap() iommu/io-pgtable-arm-v7s: Remove split on unmap behavior iommu/io-pgtable-arm: Remove split on unmap behavior iommu/vt-d: Drain PRQs when domain removed from RID iommu/vt-d: Drop pasid requirement for prq initialization ...
2024-11-21Merge tag 'net-next-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "The most significant set of changes is the per netns RTNL. The new behavior is disabled by default, regression risk should be contained. Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its default value from PTP_1588_CLOCK_KVM, as the first is intended to be a more reliable replacement for the latter. Core: - Started a very large, in-progress, effort to make the RTNL lock scope per network-namespace, thus reducing the lock contention significantly in the containerized use-case, comprising: - RCU-ified some relevant slices of the FIB control path - introduce basic per netns locking helpers - namespacified the IPv4 address hash table - remove rtnl_register{,_module}() in favour of rtnl_register_many() - refactor rtnl_{new,del,set}link() moving as much validation as possible out of RTNL lock - convert all phonet doit() and dumpit() handlers to RCU - convert IPv4 addresses manipulation to per-netns RTNL - convert virtual interface creation to per-netns RTNL the per-netns lock infrastructure is guarded by the CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim. - Introduce NAPI suspension, to efficiently switching between busy polling (NAPI processing suspended) and normal processing. - Migrate the IPv4 routing input, output and control path from direct ToS usage to DSCP macros. This is a work in progress to make ECN handling consistent and reliable. - Add drop reasons support to the IPv4 rotue input path, allowing better introspection in case of packets drop. - Make FIB seqnum lockless, dropping RTNL protection for read access. - Make inet{,v6} addresses hashing less predicable. - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets and timestamps Things we sprinkled into general kernel code: - Add small file operations for debugfs, to reduce the struct ops size. - Refactoring and optimization for the implementation of page_frag API, This is a preparatory work to consolidate the page_frag implementation. Netfilter: - Optimize set element transactions to reduce memory consumption - Extended netlink error reporting for attribute parser failure. - Make legacy xtables configs user selectable, giving users the option to configure iptables without enabling any other config. - Address a lot of false-positive RCU issues, pointed by recent CI improvements. BPF: - Put xsk sockets on a struct diet and add various cleanups. Overall, this helps to bump performance by 12% for some workloads. - Extend BPF selftests to increase coverage of XDP features in combination with BPF cpumap. - Optimize and homogenize bpf_csum_diff helper for all archs and also add a batch of new BPF selftests for it. - Extend netkit with an option to delegate skb->{mark,priority} scrubbing to its BPF program. - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF programs. Protocols: - Introduces 4-tuple hash for connected udp sockets, speeding-up significantly connected sockets lookup. - Add a fastpath for some TCP timers that usually expires after close, the socket lock contention. - Add inbound and outbound xfrm state caches to speed up state lookups. - Avoid sending MPTCP advertisements on stale subflows, reducing risks on loosing them. - Make neighbours table flushing more scalable, maintaining per device neigh lists. Driver API: - Introduce a unified interface to configure transmission H/W shaping, and expose it to user-space via generic-netlink. - Add support for per-NAPI config via netlink. This makes napi configuration persistent across queues removal and re-creation. Requires driver updates, currently supported drivers are: nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice. - Add ethtool support for writing SFP / PHY firmware blocks. - Track RSS context allocation from ethtool core. - Implement support for mirroring to DSA CPU port, via TC mirror offload. - Consolidate FDB updates notification, to avoid duplicates on device-specific entries. - Expose DPLL clock quality level to the user-space. - Support master-slave PHY config via device tree. Tests and tooling: - forwarding: introduce deferred commands, to simplify the cleanup phase Drivers: - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic, Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the IRQs and queues to NAPI IDs, allowing busy polling and better introspection. - Ethernet high-speed NICs: - nVidia/Mellanox: - mlx5: - a large refactor to implement support for cross E-Switch scheduling - refactor H/W conter management to let it scale better - H/W GRO cleanups - Intel (100G, ice):: - add support for ethtool reset - implement support for per TX queue H/W shaping - AMD/Solarflare: - implement per device queue stats support - Broadcom (bnxt): - improve wildcard l4proto on IPv4/IPv6 ntuple rules - Marvell Octeon: - Add representor support for each Resource Virtualization Unit (RVU) device. - Hisilicon: - add support for the BMC Gigabit Ethernet - IBM (EMAC): - driver cleanup and modernization - Cisco (VIC): - raise the queues number limit to 256 - Ethernet virtual: - Google vNIC: - implement page pool support - macsec: - inherit lower device's features and TSO limits when offloading - virtio_net: - enable premapped mode by default - support for XDP socket(AF_XDP) zerocopy TX - wireguard: - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger packets. - Ethernet NICs embedded and virtual: - Broadcom ASP: - enable software timestamping - Freescale: - add enetc4 PF driver - MediaTek: Airoha SoC: - implement BQL support - RealTek r8169: - enable TSO by default on r8168/r8125 - implement extended ethtool stats - Renesas AVB: - enable TX checksum offload - Synopsys (stmmac): - support header splitting for vlan tagged packets - move common code for DWMAC4 and DWXGMAC into a separate FPE module. - add dwmac driver support for T-HEAD TH1520 SoC - Synopsys (xpcs): - driver refactor and cleanup - TI: - icssg_prueth: add VLAN offload support - Xilinx emaclite: - add clock support - Ethernet switches: - Microchip: - implement support for the lan969x Ethernet switch family - add LAN9646 switch support to KSZ DSA driver - Ethernet PHYs: - Marvel: 88q2x: enable auto negotiation - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2 - PTP: - Add support for the Amazon virtual clock device - Add PtP driver for s390 clocks - WiFi: - mac80211 - EHT 1024 aggregation size for transmissions - new operation to indicate that a new interface is to be added - support radio separation of multi-band devices - move wireless extension spy implementation to libiw - Broadcom: - brcmfmac: optional LPO clock support - Microchip: - add support for Atmel WILC3000 - Qualcomm (ath12k): - firmware coredump collection support - add debugfs support for a multitude of statistics - Qualcomm (ath5k): - Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support - Realtek: - rtw88: 8821au and 8812au USB adapters support - rtw89: add thermal protection - rtw89: fine tune BT-coexsitence to improve user experience - rtw89: firmware secure boot for WiFi 6 chip - Bluetooth - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and 0x13d3:0x3623 - add Realtek RTL8852BE support for id Foxconn 0xe123 - add MediaTek MT7920 support for wireless module ids - btintel_pcie: add handshake between driver and firmware - btintel_pcie: add recovery mechanism - btnxpuart: add GPIO support to power save feature" * tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits) mm: page_frag: fix a compile error when kernel is not compiled Documentation: tipc: fix formatting issue in tipc.rst selftests: nic_performance: Add selftest for performance of NIC driver selftests: nic_link_layer: Add selftest case for speed and duplex states selftests: nic_link_layer: Add link layer selftest for NIC driver bnxt_en: Add FW trace coredump segments to the coredump bnxt_en: Add a new ethtool -W dump flag bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr() bnxt_en: Add functions to copy host context memory bnxt_en: Do not free FW log context memory bnxt_en: Manage the FW trace context memory bnxt_en: Allocate backing store memory for FW trace logs bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem() bnxt_en: Refactor bnxt_free_ctx_mem() bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type bnxt_en: Update firmware interface spec to 1.10.3.85 selftests/bpf: Add some tests with sockmap SK_PASS bpf: fix recursive lock when verdict program return SK_PASS wireguard: device: support big tcp GSO wireguard: selftests: load nf_conntrack if not present ...
2024-11-20Merge tag 'asm-generic-3.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are a number of unrelated cleanups, generally simplifying the architecture specific header files: - A series from Al Viro simplifies asm/vga.h, after it turns out that most of it can be generalized. - A series from Julian Vetter adds a common version of memcpy_{to,from}io() and memset_io() and changes most architectures to use that instead of their own implementation - A series from Niklas Schnelle concludes his work to make PC style inb()/outb() optional - Nicolas Pitre contributes improvements for the generic do_div() helper - Christoph Hellwig adds a generic version of page_to_phys() and phys_to_page(), replacing the slightly different architecture specific definitions. - Uwe Kleine-Koenig has a minor cleanup for ioctl definitions" * tag 'asm-generic-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (24 commits) empty include/asm-generic/vga.h sparc: get rid of asm/vga.h asm/vga.h: don't bother with scr_mem{cpy,move}v() unless we need to vt_buffer.h: get rid of dead code in default scr_...() instances tty: serial: export serial_8250_warn_need_ioport lib/iomem_copy: fix kerneldoc format style hexagon: simplify asm/io.h for !HAS_IOPORT loongarch: Use new fallback IO memcpy/memset csky: Use new fallback IO memcpy/memset arm64: Use new fallback IO memcpy/memset New implementation for IO memcpy and IO memset watchdog: Add HAS_IOPORT dependency for SBC8360 and SBC7240 __arch_xprod64(): make __always_inline when optimizing for performance ARM: div64: improve __arch_xprod_64() asm-generic/div64: optimize/simplify __div64_const32() lib/math/test_div64: add some edge cases relevant to __div64_const32() asm-generic: add an optional pfn_valid check to page_to_phys asm-generic: provide generic page_to_phys and phys_to_page implementations asm-generic/io.h: Remove I/O port accessors for HAS_IOPORT=n tty: serial: handle HAS_IOPORT dependencies ...
2024-11-20Merge tag 'ftrace-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Restructure the function graph shadow stack to prepare it for use with kretprobes With the goal of merging the shadow stack logic of function graph and kretprobes, some more restructuring of the function shadow stack is required. Move out function graph specific fields from the fgraph infrastructure and store it on the new stack variables that can pass data from the entry callback to the exit callback. Hopefully, with this change, the merge of kretprobes to use fgraph shadow stacks will be ready by the next merge window. - Make shadow stack 4k instead of using PAGE_SIZE. Some architectures have very large PAGE_SIZE values which make its use for shadow stacks waste a lot of memory. - Give shadow stacks its own kmem cache. When function graph is started, every task on the system gets a shadow stack. In the future, shadow stacks may not be 4K in size. Have it have its own kmem cache so that whatever size it becomes will still be efficient in allocations. - Initialize profiler graph ops as it will be needed for new updates to fgraph - Convert to use guard(mutex) for several ftrace and fgraph functions - Add more comments and documentation - Show function return address in function graph tracer Add an option to show the caller of a function at each entry of the function graph tracer, similar to what the function tracer does. - Abstract out ftrace_regs from being used directly like pt_regs ftrace_regs was created to store a partial pt_regs. It holds only the registers and stack information to get to the function arguments and return values. On several archs, it is simply a wrapper around pt_regs. But some users would access ftrace_regs directly to get the pt_regs which will not work on all archs. Make ftrace_regs an abstract structure that requires all access to its fields be through accessor functions. - Show how long it takes to do function code modifications When code modification for function hooks happen, it always had the time recorded in how long it took to do the conversion. But this value was never exported. Recently the code was touched due to new ROX modification handling that caused a large slow down in doing the modifications and had a significant impact on boot times. Expose the timings in the dyn_ftrace_total_info file. This file was created a while ago to show information about memory usage and such to implement dynamic function tracing. It's also an appropriate file to store the timings of this modification as well. This will make it easier to see the impact of changes to code modification on boot up timings. - Other clean ups and small fixes * tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits) ftrace: Show timings of how long nop patching took ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash() ftrace: Use guard to take the ftrace_lock in release_probe() ftrace: Use guard to lock ftrace_lock in cache_mod() ftrace: Use guard for match_records() fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph() fgraph: Give ret_stack its own kmem cache fgraph: Separate size of ret_stack from PAGE_SIZE ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value selftests/ftrace: Fix check of return value in fgraph-retval.tc test ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs ftrace: Make ftrace_regs abstract from direct use fgragh: No need to invoke the function call_filter_check_discard() fgraph: Simplify return address printing in function graph tracer function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr() function_graph: Support recording and printing the function return address ftrace: Have calltime be saved in the fgraph storage ftrace: Use a running sleeptime instead of saving on shadow stack fgraph: Use fgraph data to store subtime for profiler ...
2024-11-19Merge tag 'timers-vdso-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vdso data page handling updates from Thomas Gleixner: "First steps of consolidating the VDSO data page handling. The VDSO data page handling is architecture specific for historical reasons, but there is no real technical reason to do so. Aside of that VDSO data has become a dump ground for various mechanisms and fail to provide a clear separation of the functionalities. Clean this up by: - consolidating the VDSO page data by getting rid of architecture specific warts especially in x86 and PowerPC. - removing the last includes of header files which are pulling in other headers outside of the VDSO namespace. - seperating timekeeping and other VDSO data accordingly. Further consolidation of the VDSO page handling is done in subsequent changes scheduled for the next merge window. This also lays the ground for expanding the VDSO time getters for independent PTP clocks in a generic way without making every architecture add support seperately" * tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/vdso: Add missing brackets in switch case vdso: Rename struct arch_vdso_data to arch_vdso_time_data powerpc: Split systemcfg struct definitions out from vdso powerpc: Split systemcfg data out of vdso data page powerpc: Add kconfig option for the systemcfg page powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors powerpc/pseries/lparcfg: Fix printing of system_active_processors powerpc/procfs: Propagate error of remap_pfn_range() powerpc/vdso: Remove offset comment from 32bit vdso_arch_data x86/vdso: Split virtual clock pages into dedicated mapping x86/vdso: Delete vvar.h x86/vdso: Access vdso data without vvar.h x86/vdso: Move the rng offset to vsyscall.h x86/vdso: Access rng vdso data without vvar.h x86/vdso: Access timens vdso data without vvar.h x86/vdso: Allocate vvar page from C code x86/vdso: Access rng data from kernel without vvar x86/vdso: Place vdso_data at beginning of vvar page x86/vdso: Use __arch_get_vdso_data() to access vdso data x86/mm/mmap: Remove arch_vma_name() ...
2024-11-19Merge tag 'perf-core-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Uprobes: - Add BPF session support (Jiri Olsa) - Switch to RCU Tasks Trace flavor for better performance (Andrii Nakryiko) - Massively increase uretprobe SMP scalability by SRCU-protecting the uretprobe lifetime (Andrii Nakryiko) - Kill xol_area->slot_count (Oleg Nesterov) Core facilities: - Implement targeted high-frequency profiling by adding the ability for an event to "pause" or "resume" AUX area tracing (Adrian Hunter) VM profiling/sampling: - Correct perf sampling with guest VMs (Colton Lewis) New hardware support: - x86/intel: Add PMU support for Intel ArrowLake-H CPUs (Dapeng Mi) Misc fixes and enhancements: - x86/intel/pt: Fix buffer full but size is 0 case (Adrian Hunter) - x86/amd: Warn only on new bits set (Breno Leitao) - x86/amd/uncore: Avoid a false positive warning about snprintf truncation in amd_uncore_umc_ctx_init (Jean Delvare) - uprobes: Re-order struct uprobe_task to save some space (Christophe JAILLET) - x86/rapl: Move the pmu allocation out of CPU hotplug (Kan Liang) - x86/rapl: Clean up cpumask and hotplug (Kan Liang) - uprobes: Deuglify xol_get_insn_slot/xol_free_insn_slot paths (Oleg Nesterov)" * tag 'perf-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) perf/core: Correct perf sampling with guest VMs perf/x86: Refactor misc flag assignments perf/powerpc: Use perf_arch_instruction_pointer() perf/core: Hoist perf_instruction_pointer() and perf_misc_flags() perf/arm: Drop unused functions uprobes: Re-order struct uprobe_task to save some space perf/x86/amd/uncore: Avoid a false positive warning about snprintf truncation in amd_uncore_umc_ctx_init perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling perf/x86/intel/pt: Add support for pause / resume perf/core: Add aux_pause, aux_resume, aux_start_paused perf/x86/intel/pt: Fix buffer full but size is 0 case uprobes: SRCU-protect uretprobe lifetime (with timeout) uprobes: allow put_uprobe() from non-sleepable softirq context perf/x86/rapl: Clean up cpumask and hotplug perf/x86/rapl: Move the pmu allocation out of CPU hotplug uprobe: Add support for session consumer uprobe: Add data pointer to consumer handlers perf/x86/amd: Warn only on new bits set uprobes: fold xol_take_insn_slot() into xol_get_insn_slot() uprobes: kill xol_area->slot_count ...
2024-11-18Merge tag 's390-6.13-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add firmware sysfs interface which allows user space to retrieve the dump area size of the machine - Add 'measurement_chars_full' CHPID sysfs attribute to make the complete associated Channel-Measurements Characteristics Block available - Add virtio-mem support - Move gmap aka KVM page fault handling from the main fault handler to KVM code. This is the first step to make s390 KVM page fault handling similar to other architectures. With this first step the main fault handler does not have any special handling anymore, and therefore convert it to support LOCK_MM_AND_FIND_VMA - With gcc 14 s390 support for flag output operand support for inline assemblies was added. This allows for several optimizations: - Provide a cmpxchg inline assembly which makes use of this, and provide all variants of arch_try_cmpxchg() so that the compiler can generate slightly better code - Convert a few cmpxchg() loops to try_cmpxchg() loops - Similar to x86 add a CC_OUT() helper macro (and other macros), and convert all inline assemblies to make use of them, so that depending on compiler version better code can be generated - List installed host-key hashes in sysfs if the machine supports the Query Ultravisor Keys UVC - Add 'Retrieve Secret' ioctl which allows user space in protected execution guests to retrieve previously stored secrets from the Ultravisor - Add pkey-uv module which supports the conversion of Ultravisor retrievable secrets to protected keys - Extend the existing paes cipher to exploit the full AES-XTS hardware acceleration introduced with message-security assist extension 10 - Convert hopefully all sysfs show functions to use sysfs_emit() so that the constant flow of such patches stop - For PCI devices make use of the newly added Topology ID attribute to enable whole card multi-function support despite the change to PCHID per port. Additionally improve the overall robustness and usability of the multifunction support - Various other small improvements, fixes, and cleanups * tag 's390-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (133 commits) s390/cio/ioasm: Convert to use flag output macros s390/cio/qdio: Convert to use flag output macros s390/sclp: Convert to use flag output macros s390/dasd: Convert to use flag output macros s390/boot/physmem: Convert to use flag output macros s390/pci: Convert to use flag output macros s390/kvm: Convert to use flag output macros s390/extmem: Convert to use flag output macros s390/string: Convert to use flag output macros s390/diag: Convert to use flag output macros s390/irq: Convert to use flag output macros s390/smp: Convert to use flag output macros s390/uv: Convert to use flag output macros s390/pai: Convert to use flag output macros s390/mm: Convert to use flag output macros s390/cpu_mf: Convert to use flag output macros s390/cpcmd: Convert to use flag output macros s390/topology: Convert to use flag output macros s390/time: Convert to use flag output macros s390/pageattr: Convert to use flag output macros ...
2024-11-15Merge branches 'arm/smmu', 'mediatek', 's390', 'ti/omap', 'riscv' and 'core' ↵Joerg Roedel
into next
2024-11-14perf/core: Hoist perf_instruction_pointer() and perf_misc_flags()Colton Lewis
For clarity, rename the arch-specific definitions of these functions to perf_arch_* to denote they are arch-specifc. Define the generic-named functions in one place where they can call the arch-specific ones as needed. Signed-off-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20241113190156.2145593-3-coltonlewis@google.com
2024-11-13s390/smp: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/uv: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/pai: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/mm: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/cpu_mf: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/time: Convert to use flag output macrosHeiko Carstens
Use flag output macros in inline asm to allow for better code generation if the compiler has support for the flag output constraint. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13s390/asm: Helper macros for flag output operand handlingHeiko Carstens
Since gcc supports flag out operands for inline assemblies there is always the question when this feature should be used and if it is worth all the ifdefs that come with that. In order to avoid that provide similar macros like x86 which can be used for all inline assemblies which extract the condition code. Depending on compiler features the generated code will either always contain an ipm+srl instruction pair, which extracts the condition code, or alternatively let the compiler handle this completely. Suggested-by: Juergen Christ <jchrist@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/atomic: Remove __atomic_cmpxchg() variantsHeiko Carstens
With users converted to the standard arch_cmpxchg() variants, remove the now unused __atomic_cmpxchg() and __atomic_cmpxchg_bool() variants. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/locking: Use arch_try_cmpxchg() instead of __atomic_cmpxchg_bool()Heiko Carstens
Use arch_try_cmpxchg() instead of __atomic_cmpxchg_bool() everywhere. This generates the same code like before, but uses the standard cmpxchg() implementation instead of a custom __atomic_cmpxchg_bool(). Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/preempt: Use arch_try_cmpxchg() instead of __atomic_cmpxchg()Heiko Carstens
Use arch_try_cmpxchg() instead of __atomic_cmpxchg() in preempt_count_set() to generate similar or better code, depending in compiler features. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/atomic: Provide arch_atomic_try_cmpxchg()Heiko Carstens
Since gcc 14 flag output operands are supported also for s390. Provide an arch_atomic try_cmpxchg() implementation so that all existing atomic_try_cmpxchg() usages generate slightly better code, if compiled with gcc 14 or newer. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Use arch_cmpxchg() instead of __atomic_cmpxchg()Heiko Carstens
Use arch_cmpxchg() instead of __atomic_cmpxchg() for the arch_atomic_cmpxchg() implementations. arch_cmpxchg() generates the same code and doesn't need a cast like it is required for arch_atomic64_cmpxchg(). Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/atomic: Convert arch_atomic_xchg() to C functionHeiko Carstens
Convert the arch_atomic_xchg define to a C function so that proper type checking is provided. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Provide arch_try_cmpxchg128()Heiko Carstens
Since gcc 14 flag output operands are supported also for s390. Provide an arch_try_cmpxchg128() implementation so that all existing try_cmpxchg128() variants provide slightly better code, if compiled with gcc 14 or newer. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Provide arch_cmpxchg128_local()Heiko Carstens
Just like x86 and arm64 provide a trivial arch_cmpxchg128_local() implementation by mapping it to arch_cmpxchg128(). Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Implement arch_xchg() with arch_try_cmpxchg()Heiko Carstens
Get rid of the arch_xchg() inline assemblies by converting the inline assemblies to C functions which make use of arch_try_cmpxchg(). With flag output operand support the generated code is at least as good as the previous version. Without it is slightly worse, however getting rid of all the inline assembly code is worth it. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Provide arch_try_cmpxchg()Heiko Carstens
Since gcc 14 flag output operands are supported also for s390. Provide an arch_try_cmpxchg() implementation so that all existing try_cmpxchg() variants provide slightly better code, if compiled with gcc 14 or newer. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/cmpxchg: Convert one and two byte case inline assemblies to CHeiko Carstens
Rewrite __cmpxchg() in order to get rid of the large inline assemblies. Convert the one and two byte inline assemblies to C functions. The generated code of the new implementation is nearly as good or bad as the old variant, but easier to read. Note that the new variants are quite close to the generic cmpxchg_emu_u8() implementation, however a conversion to the generic variant will not follow since with mm/vmstat.c there is heavy user of one byte cmpxchg(). A not inlined variant would have a negative performance impact. Also note that the calls within __arch_cmpxchg() come with rather pointless "& 0xff..." operations. They exist only to avoid false positive sparse warnings like "warning: cast truncates bits from constant value ...". Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07arch: introduce set_direct_map_valid_noflush()Mike Rapoport (Microsoft)
Add an API that will allow updates of the direct/linear map for a set of physically contiguous pages. It will be used in the following patches. Link: https://lkml.kernel.org/r/20241023162711.2579610-6-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: kdevops <kdevops@lists.linux.dev> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07s390/sparsemem: Provide phys_to_target_node() with CONFIG_NUMAHeiko Carstens
Add a trival phys_to_target_node() implementation which always returns 0 if CONFIG_NUMA is enabled, since the s390 NUMA implementation only supports node 0. This is similar to memory_add_physaddr_to_nid() in order to avoid runtime warnings. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/sparsemem: Provide memory_add_physaddr_to_nid() with CONFIG_NUMADavid Hildenbrand
virtio-mem uses memory_add_physaddr_to_nid() to determine the NID to use for memory it adds. We currently fallback to the dummy implementation in mm/numa.c with CONFIG_NUMA, which will end up triggering an undesired pr_info_once(): Unknown online node for memory at 0x100000000, assuming node 0 On s390, we map all cpus and memory to node 0, so let's add a simple memory_add_physaddr_to_nid() implementation that does exactly that, but without complaining. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-8-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/sparsemem: Reduce section size to 128 MiBDavid Hildenbrand
Ever since commit 421c175c4d609 ("[S390] Add support for memory hot-add.") we've been using a section size of 256 MiB on s390 and 32 MiB on s390. Before that, we were using a section size of 32 MiB on both architectures. Now that we have a new mechanism to expose additional memory to a VM -- virtio-mem -- reduce the section size to 128 MiB to allow for more flexibility and reduce the metadata overhead when dealing with hot(un)plug granularity smaller than 256 MiB. 128 MiB has been used by x86-64 since the very beginning. arm64 with 4k base pages switched to 128 MiB as well: it's just big enough on these architectures to allows for using a huge page (2 MiB) in the vmemmap in sane setups with sizeof(struct page) == 64 bytes and a huge page mapping in the direct mapping, while still allowing for small hot(un)plug granularity. For s390, we could even switch to a 64 MiB section size, as our huge page size is 1 MiB: but the smaller the section size, the more sections we'll have to manage especially on bigger machines. Making it consistent with x86-64 and arm64 feels like the right thing for now. Note that the smallest memory hot(un)plug granularity is also limited by the memory block size, determined by extracting the memory increment size from SCLP. Under QEMU/KVM, implementing virtio-mem, we expose 0; therefore, we'll end up with a memory block size of 128 MiB with a 128 MiB section size. Tested-by: Mario Casquero <mcasquer@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-7-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/physmem_info: Query diag500(STORAGE LIMIT) to support QEMU/KVM memory ↵David Hildenbrand
devices To support memory devices under QEMU/KVM, such as virtio-mem, we have to prepare our kernel virtual address space accordingly and have to know the highest possible physical memory address we might see later: the storage limit. The good old SCLP interface is not suitable for this use case. In particular, memory owned by memory devices has no relationship to storage increments, it is always detected using the device driver, and unaware OSes (no driver) must never try making use of that memory. Consequently this memory is located outside of the "maximum storage increment"-indicated memory range. Let's use our new diag500 STORAGE_LIMIT subcode to query this storage limit that can exceed the "maximum storage increment", and use the existing interfaces (i.e., SCLP) to obtain information about the initial memory that is not owned+managed by memory devices. If a hypervisor does not support such memory devices, the address exposed through diag500 STORAGE_LIMIT will correspond to the maximum storage increment exposed through SCLP. To teach kdump on s390 to include memory owned by memory devices, there will be ways to query the relevant memory ranges from the device via a driver running in special kdump mode (like virtio-mem already implements to filter /proc/vmcore access so we don't end up reading from unplugged device blocks). Update setup_ident_map_size(), to clarify that there can be more than just online and standby memory. Tested-by: Mario Casquero <mcasquer@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-4-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-06mm: remove unused hugepage for vma_alloc_folio()Kefeng Wang
The hugepage parameter was deprecated since commit ddc1a5cbc05d ("mempolicy: alloc_pages_mpol() for NUMA policy without vma"), for PMD-sized THP, it still tries only preferred node if possible in vma_alloc_folio() by checking the order of the folio allocation. Link: https://lkml.kernel.org/r/20241010061556.1846751-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Barry Song <baohua@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06mm: consolidate common checks in hugetlb_get_unmapped_areaOscar Salvador
prepare_hugepage_range() performs almost the same checks for all architectures that define it, with the exception of mips and loongarch that also check for overflows. The rest checks for the addr and len to be properly aligned, so we can move that to hugetlb_get_unmapped_area() and get rid of a fair amount of duplicated code. [akpm@linux-foundation.org: remove now-unused local] Link: https://lore.kernel.org/oe-kbuild-all/202410081210.uNLbf3Jk-lkp@intel.com/ Link: https://lkml.kernel.org/r/20241007075037.267650-10-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-06arch/s390: clean up hugetlb definitionsOscar Salvador
s390 redefines functions that are already defined (and the same) in include/asm-generic/hugetlb.h. Do as the other architectures: 1) include include/asm-generic/hugetlb.h 2) drop the already defined functions in the generic hugetlb.h and 3) use the __HAVE_ARCH_HUGE_* macros to define our own. This gets rid of quite some code. Link: https://lkml.kernel.org/r/20241007075037.267650-9-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-02vdso: Rename struct arch_vdso_data to arch_vdso_time_dataNam Cao
The struct arch_vdso_data is only about vdso time data. So rename it to arch_vdso_time_data to make it obvious. Non time-related data will be migrated out of these structs soon. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-28-b64f0842d512@linutronix.de
2024-11-02s390/vdso: Drop LBASE_VDSOThomas Weißschuh
This constant is always "0", providing no value and making the logic harder to understand. Also prepare for a consolidation of the vdso linkerscript logic by aligning it with other architectures. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-3-b64f0842d512@linutronix.de
2024-10-30s390/time: Add PtP driverSven Schnelle
Add a small PtP driver which allows user space to get the values of the physical and tod clock. This allows programs like chrony to use STP as clock source and steer the kernel clock. The physical clock can be used as a debugging aid to get the clock without any additional offsets like STP steering or LPAR offset. Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Link: https://patch.msgid.link/20241023065601.449586-3-svens@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29s390: Remove gmap pointer from lowcoreClaudio Imbrenda
Remove the gmap pointer from lowcore, since it is not used anymore. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-9-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/mm/gmap: Remove gmap_{en,dis}able()Claudio Imbrenda
Remove gmap_enable(), gmap_disable(), and gmap_get_enabled() since they do not have any users anymore. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-8-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/kvm: Stop using gmap_{en,dis}able()Claudio Imbrenda
Stop using gmap_enable(), gmap_disable(), gmap_get_enabled(). The correct guest ASCE is passed as a parameter of sie64a(), there is no need to save the current gmap in lowcore. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-7-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/mm/fault: Handle guest-related program interrupts in KVMClaudio Imbrenda
Any program interrupt that happens in the host during the execution of a KVM guest will now short circuit the fault handler and return to KVM immediately. Guest fault handling (including pfault) will happen entirely inside KVM. When sie64a() returns zero, current->thread.gmap_int_code will contain the program interrupt number that caused the exit, or zero if the exit was not caused by a host program interrupt. KVM will now take care of handling all guest faults in vcpu_post_run(). Since gmap faults will not be visible by the rest of the kernel, remove GMAP_FAULT, the linux fault handlers for secure execution faults, the exception table entries for the sie instruction, the nop padding after the sie instruction, and all other references to guest faults from the s390 code. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Co-developed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-6-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/entry: Remove __GMAP_ASCE and use _PIF_GUEST_FAULT againClaudio Imbrenda
Now that the guest ASCE is passed as a parameter to __sie64a(), _PIF_GUEST_FAULT can be used again to determine whether the fault was a guest or host fault. Since the guest ASCE will not be taken from the gmap pointer in lowcore anymore, __GMAP_ASCE can be removed. For the same reason the guest ASCE needs now to be saved into the cr1 save area unconditionally. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-2-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/crypto: Add hardware acceleration for full AES-XTS modeHolger Dengler
Extend the existing paes cipher to exploit the full AES-XTS hardware acceleration introduced with message-security assist extension 10. The full AES-XTS mode requires a protected key of type PKEY_KEYTYPE_AES_XTS_128 or PKEY_KEYTYPE_AES_XTS_256. Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/pkey: Add new pkey handler module pkey-uvHarald Freudenberger
This new pkey handler module supports the conversion of Ultravisor retrievable secrets to protected keys. The new module pkey-uv.ko is able to retrieve and verify protected keys backed up by the Ultravisor layer which is only available within protected execution environment. The module is only automatically loaded if there is the UV CPU feature flagged as available. Additionally on module init there is a check for protected execution environment and for UV supporting retrievable secrets. Also if the kernel is not running as a protected execution guest, the module unloads itself with errno ENODEV. The pkey UV module currently supports these Ultravisor secrets and is able to retrieve a protected key for these UV secret types: - UV_SECRET_AES_128 - UV_SECRET_AES_192 - UV_SECRET_AES_256 - UV_SECRET_AES_XTS_128 - UV_SECRET_AES_XTS_256 - UV_SECRET_HMAC_SHA_256 - UV_SECRET_HMAC_SHA_512 - UV_SECRET_ECDSA_P256 - UV_SECRET_ECDSA_P384 - UV_SECRET_ECDSA_P521 - UV_SECRET_ECDSA_ED25519 - UV_SECRET_ECDSA_ED448 Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/pkey: Fix checkpatch findings in pkey header fileHarald Freudenberger
Fix all the complains from checkpatch for the pkey header file: CHECK: No space is necessary after a cast + PKEY_TYPE_CCA_DATA = (__u32) 1, CHECK: Please use a blank line after function/struct/union/enum declarations +}; +#define PKEY_GENSECK _IOWR(PKEY_IOCTL_MAGIC, 0x01, struct pkey_genseck) Suggested-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/uv: Retrieve UV secrets sysfs supportSteffen Eiden
Reflect the updated content in the query information UVC to the sysfs at /sys/firmware/query * new UV-query sysfs entry for the maximum number of retrievable secrets the UV can store for one secure guest. * new UV-query sysfs entry for the maximum number of association secrets the UV can store for one secure guest. * max_secrets contains the sum of max association and max retrievable secrets. Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20241024062638.1465970-7-seiden@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/uvdevice: Increase indent in IOCTL definitionsSteffen Eiden
Increase the indentations in the IOCTL defines so that we will not have problems with upcoming, longer constant names. While at it, fix a minor typo. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20241024062638.1465970-5-seiden@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/uvdevice: Add Retrieve Secret IOCTLSteffen Eiden
Add a new IOCL number to support the new Retrieve Secret UVC for user-space. User-space provides the index of the secret (u16) to retrieve. The uvdevice calls the Retrieve Secret UVC and copies the secret into the provided buffer if it fits. To get the secret type, index, and size user-space needs to call the List UVC first. Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20241024062638.1465970-4-seiden@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>