summaryrefslogtreecommitdiff
path: root/arch/arm64/include/uapi
AgeCommit message (Collapse)Author
2019-09-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "s390: - ioctl hardening - selftests ARM: - ITS translation cache - support for 512 vCPUs - various cleanups and bugfixes PPC: - various minor fixes and preparation x86: - bugfixes all over the place (posted interrupts, SVM, emulation corner cases, blocked INIT) - some IPI optimizations" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits) KVM: X86: Use IPI shorthands in kvm guest when support KVM: x86: Fix INIT signal handling in various CPU states KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode KVM: VMX: Stop the preemption timer during vCPU reset KVM: LAPIC: Micro optimize IPI latency kvm: Nested KVM MMUs need PAE root too KVM: x86: set ctxt->have_exception in x86_decode_insn() KVM: x86: always stop emulation on page fault KVM: nVMX: trace nested VM-Enter failures detected by H/W KVM: nVMX: add tracepoint for failed nested VM-Enter x86: KVM: svm: Fix a check in nested_svm_vmrun() KVM: x86: Return to userspace with internal error on unexpected exit reason KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers doc: kvm: Fix return description of KVM_SET_MSRS KVM: X86: Tune PLE Window tracepoint KVM: VMX: Change ple_window type to unsigned int KVM: X86: Remove tailing newline for tracepoints KVM: X86: Trace vcpu_id for vmexit KVM: x86: Manually calculate reserved bits when loading PDPTRS ...
2019-09-09KVM: arm/arm64: vgic: Allow more than 256 vcpus for KVM_IRQ_LINEMarc Zyngier
While parts of the VGIC support a large number of vcpus (we bravely allow up to 512), other parts are more limited. One of these limits is visible in the KVM_IRQ_LINE ioctl, which only allows 256 vcpus to be signalled when using the CPU or PPI types. Unfortunately, we've cornered ourselves badly by allocating all the bits in the irq field. Since the irq_type subfield (8 bit wide) is currently only taking the values 0, 1 and 2 (and we have been careful not to allow anything else), let's reduce this field to only 4 bits, and allocate the remaining 4 bits to a vcpu2_index, which acts as a multiplier: vcpu_id = 256 * vcpu2_index + vcpu_index With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2) allowing this to be discovered, it becomes possible to inject PPIs to up to 4096 vcpus. But please just don't. Whilst we're there, add a clarification about the use of KVM_IRQ_LINE on arm, which is not completely conditionned by KVM_CAP_IRQCHIP. Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-08-05arm64: remove unneeded uapi/asm/stat.hMasahiro Yamada
stat.h is listed in include/uapi/asm-generic/Kbuild, so Kbuild will automatically generate it. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-25treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headersMasahiro Yamada
UAPI headers licensed under GPL are supposed to have exception "WITH Linux-syscall-note" so that they can be included into non-GPL user space application code. The exception note is missing in some UAPI headers. Some of them slipped in by the treewide conversion commit b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license"). Just run: $ git show --oneline b24413180f56 -- arch/x86/include/uapi/asm/ I believe they are not intentional, and should be fixed too. This patch was generated by the following script: git grep -l --not -e Linux-syscall-note --and -e SPDX-License-Identifier \ -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild | while read file do sed -i -e '/[[:space:]]OR[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \ -e '/[[:space:]]or[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \ -e '/[[:space:]]OR[[:space:]]/!{/[[:space:]]or[[:space:]]/!s/\(GPL-[^[:space:]]*\)/\1 WITH Linux-syscall-note/g}' $file done After this patch is applied, there are 5 UAPI headers that do not contain "WITH Linux-syscall-note". They are kept untouched since this exception applies only to GPL variants. $ git grep --not -e Linux-syscall-note --and -e SPDX-License-Identifier \ -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild include/uapi/drm/panfrost_drm.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/batman_adv.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/qemu_fw_cfg.h:/* SPDX-License-Identifier: BSD-3-Clause */ include/uapi/linux/vbox_err.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/virtio_iommu.h:/* SPDX-License-Identifier: BSD-3-Clause */ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - support for chained PMU counters in guests - improved SError handling - handle Neoverse N1 erratum #1349291 - allow side-channel mitigation status to be migrated - standardise most AArch64 system register accesses to msr_s/mrs_s - fix host MPIDR corruption on 32bit - selftests ckleanups x86: - PMU event {white,black}listing - ability for the guest to disable host-side interrupt polling - fixes for enlightened VMCS (Hyper-V pv nested virtualization), - new hypercall to yield to IPI target - support for passing cstate MSRs through to the guest - lots of cleanups and optimizations Generic: - Some txt->rST conversions for the documentation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits) Documentation: virtual: Add toctree hooks Documentation: kvm: Convert cpuid.txt to .rst Documentation: virtual: Convert paravirt_ops.txt to .rst KVM: x86: Unconditionally enable irqs in guest context KVM: x86: PMU Event Filter kvm: x86: Fix -Wmissing-prototypes warnings KVM: Properly check if "page" is valid in kvm_vcpu_unmap KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane kvm: LAPIC: write down valid APIC registers KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register KVM: arm/arm64: Add save/restore support for firmware workaround state arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests KVM: arm/arm64: Support chained PMU counters KVM: arm/arm64: Remove pmc->bitmask KVM: arm/arm64: Re-create event when setting counter value KVM: arm/arm64: Extract duplicated code to own function KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions KVM: LAPIC: ARBPRI is a reserved register for x2APIC ...
2019-07-09Merge tag 'docs-5.3' of git://git.lwn.net/linuxLinus Torvalds
Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...
2019-07-08Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) perf: arm_spe: Enable ACPI/Platform automatic module loading arm_pmu: acpi: spe: Add initial MADT/SPE probing ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens ACPI/PPTT: Modify node flag detection to find last IDENTICAL x86/entry: Simplify _TIF_SYSCALL_EMU handling arm64: rename dump_instr as dump_kernel_instr arm64/mm: Drop [PTE|PMD]_TYPE_FAULT arm64: Implement panic_smp_self_stop() arm64: Improve parking of stopped CPUs arm64: Expose FRINT capabilities to userspace arm64: Expose ARMv8.5 CondM capability to userspace arm64: defconfig: enable CONFIG_RANDOMIZE_BASE arm64: ARM64_MODULES_PLTS must depend on MODULES arm64: bpf: do not allocate executable memory arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP arm64: module: create module allocations without exec permissions arm64: Allow user selection of ARM64_MODULE_PLTS acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 arm64: Allow selecting Pseudo-NMI again ...
2019-07-05KVM: arm/arm64: Add save/restore support for firmware workaround stateAndre Przywara
KVM implements the firmware interface for mitigating cache speculation vulnerabilities. Guests may use this interface to ensure mitigation is active. If we want to migrate such a guest to a host with a different support level for those workarounds, migration might need to fail, to ensure that critical guests don't loose their protection. Introduce a way for userland to save and restore the workarounds state. On restoring we do checks that make sure we don't downgrade our mitigation level. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-25arm64: Expose FRINT capabilities to userspaceMark Brown
ARMv8.5 introduces the FRINT series of instructions for rounding floating point numbers to integers. Provide a capability to userspace in order to allow applications to determine if the system supports these instructions. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-25arm64: Expose ARMv8.5 CondM capability to userspaceMark Brown
ARMv8.5 adds new instructions XAFLAG and AXFLAG to translate the representation of the results of floating point comparisons between the native ARM format and an alternative format used by some software. Add a hwcap allowing userspace to determine if they are present, since we referred to earlier CondM extensions as FLAGM call these extensions FLAGM2. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-18arm64/sve: <uapi/asm/ptrace.h> should not depend on <uapi/linux/prctl.h>Anisse Astier
Pulling linux/prctl.h into asm/ptrace.h in the arm64 UAPI headers causes userspace build issues for any program (e.g. strace and qemu) that includes both <sys/prctl.h> and <linux/ptrace.h> when using musl libc: | error: redefinition of 'struct prctl_mm_map' | struct prctl_mm_map { See https://github.com/foundriesio/meta-lmp/commit/6d4a106e191b5d79c41b9ac78fd321316d3013c0 for a public example of people working around this issue. Although it's a bit grotty, fix this breakage by duplicating the prctl constant definitions. Since these are part of the kernel ABI, they cannot be changed in future and so it's not the end of the world to have them open-coded. Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Cc: stable@vger.kernel.org Acked-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Anisse Astier <aastier@freebox.fr> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-14docs: arm64: convert docs to ReST and rename to .rstMauro Carvalho Chehab
The documentation is in a format that is very close to ReST format. The conversion is actually: - add blank lines in order to identify paragraphs; - fixing tables markups; - adding some lists markups; - marking literal blocks; - adjust some title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-06-13arm64/sve: Fix missing SVE/FPSIMD endianness conversionsDave Martin
The in-memory representation of SVE and FPSIMD registers is different: the FPSIMD V-registers are stored as single 128-bit host-endian values, whereas SVE registers are stored in an endianness-invariant byte order. This means that the two representations differ when running on a big-endian host. But we blindly copy data from one representation to another when converting between the two, resulting in the register contents being unintentionally byteswapped in certain situations. Currently this can be triggered by the first SVE instruction after a syscall, for example (though the potential trigger points may vary in future). So, fix the conversion functions fpsimd_to_sve(), sve_to_fpsimd() and sve_sync_from_fpsimd_zeropad() to swab where appropriate. There is no common swahl128() or swab128() that we could use here. Maybe it would be worth making this generic, but for now add a simple local hack. Since the byte order differences are exposed in ABI, also clarify the documentation. Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Alan Hayward <alan.hayward@arm.com> Cc: Julien Grall <julien.grall@arm.com> Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support") Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Signed-off-by: Dave Martin <Dave.Martin@arm.com> [will: Fix typos in comments and docs spotted by Julien] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-05arm64: add PTRACE_SYSEMU{,SINGLESTEP} definations to uapi headersSudeep Holla
x86 and um use 31 and 32 for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP while powerpc uses different value maybe for legacy reasons. Though handling of PTRACE_SYSEMU can be made architecture independent, it's hard to make these definations generic. To add to this existing mess few architectures like arm, c6x and sh use 31 for PTRACE_GETFDPIC (get the ELF fdpic loadmap address). It's not possible to move the definations to generic headers. So we unfortunately have to duplicate the same defination to ARM64 if we need to support PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP. Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-05-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - support for SVE and Pointer Authentication in guests - PMU improvements POWER: - support for direct access to the POWER9 XIVE interrupt controller - memory and performance optimizations x86: - support for accessing memory not backed by struct page - fixes and refactoring Generic: - dirty page tracking improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits) kvm: fix compilation on aarch64 Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU" kvm: x86: Fix L1TF mitigation for shadow MMU KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing" KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete tests: kvm: Add tests for KVM_SET_NESTED_STATE KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID tests: kvm: Add tests to .gitignore KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one KVM: Fix the bitmap range to copy during clear dirty KVM: arm64: Fix ptrauth ID register masking logic KVM: x86: use direct accessors for RIP and RSP KVM: VMX: Use accessors for GPRs outside of dedicated caching logic KVM: x86: Omit caching logic for always-available GPRs kvm, x86: Properly check whether a pfn is an MMIO or not ...
2019-04-24KVM: arm64: Add userspace flag to enable pointer authenticationAmit Daniel Kachhap
Now that the building blocks of pointer authentication are present, lets add userspace flags KVM_ARM_VCPU_PTRAUTH_ADDRESS and KVM_ARM_VCPU_PTRAUTH_GENERIC. These flags will enable pointer authentication for the KVM guest on a per-vcpu basis through the ioctl KVM_ARM_VCPU_INIT. This features will allow the KVM guest to allow the handling of pointer authentication instructions or to treat them as undefined if not set. Necessary documentations are added to reflect the changes done. Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-23arm64: Expose SVE2 features for userspaceDave Martin
This patch provides support for reporting the presence of SVE2 and its optional features to userspace. This will also enable visibility of SVE2 for guests, when KVM support for SVE-enabled guests is available. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-18KVM: arm64/sve: Simplify KVM_REG_ARM64_SVE_VLS array sizingDave Martin
A complicated DIV_ROUND_UP() expression is currently written out explicitly in multiple places in order to specify the size of the bitmap exchanged with userspace to represent the value of the KVM_REG_ARM64_SVE_VLS pseudo-register. Userspace currently has no direct way to work this out either: for documentation purposes, the size is just quoted as 8 u64s. To make this more intuitive, this patch replaces these with a single define, which is also exported to userspace as KVM_ARM64_SVE_VLS_WORDS. Since the number of words in a bitmap is just the index of the last word used + 1, this patch expresses the bound that way instead. This should make it clearer what is being expressed. For userspace convenience, the minimum and maximum possible vector lengths relevant to the KVM ABI are exposed to UAPI as KVM_ARM64_SVE_VQ_MIN, KVM_ARM64_SVE_VQ_MAX. Since the only direct use for these at present is manipulation of KVM_REG_ARM64_SVE_VLS, no corresponding _VL_ macros are defined. They could be added later if a need arises. Since use of DIV_ROUND_UP() was the only reason for including <linux/kernel.h> in guest.c, this patch also removes that #include. Suggested-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-18KVM: arm64/sve: Clean up UAPI register ID definitionsDave Martin
Currently, the SVE register ID macros are not all defined in the same way, and advertise the fact that FFR maps onto the nonexistent predicate register P16. This is really just for kernel convenience, and may lead userspace into bad habits. Instead, this patch masks the ID macro arguments so that architecturally invalid register numbers will not be passed through any more, and uses a literal KVM_REG_ARM64_SVE_FFR_BASE macro to define KVM_REG_ARM64_SVE_FFR(), similarly to the way the _ZREG() and _PREG() macros are defined. Rather than plugging in magic numbers for the number of Z- and P- registers and the maximum possible number of register slices, this patch provides definitions for those too. Userspace is going to need them in any case, and it makes sense for them to come from <uapi/asm/kvm.h>. sve_reg_to_region() uses convenience constants that are defined in a different way, and also makes use of the fact that the FFR IDs are really contiguous with the P15 IDs, so this patch retains the existing convenience constants in guest.c, supplemented with a couple of sanity checks to check for consistency with the UAPI header. Fixes: e1c9c98345b3 ("KVM: arm64/sve: Add SVE support to register access ioctl interface") Suggested-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-16arm64: Expose DC CVADP to userspaceAndrew Murray
ARMv8.5 builds upon the ARMv8.2 DC CVAP instruction by introducing a DC CVADP instruction which cleans the data cache to the point of deep persistence. Let's expose this support via the arm64 ELF hwcaps. Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-16arm64: HWCAP: add support for AT_HWCAP2Andrew Murray
As we will exhaust the first 32 bits of AT_HWCAP let's start exposing AT_HWCAP2 to userspace to give us up to 64 caps. Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we prefer to expand into AT_HWCAP2 in order to provide a consistent view to userspace between ILP32 and LP64. However internal to the kernel we prefer to continue to use the full space of elf_hwcap. To reduce complexity and allow for future expansion, we now represent hwcaps in the kernel as ordinals and use a KERNEL_HWCAP_ prefix. This allows us to support automatic feature based module loading for all our hwcaps. We introduce cpu_set_feature to set hwcaps which complements the existing cpu_have_feature helper. These helpers allow us to clean up existing direct uses of elf_hwcap and reduce any future effort required to move beyond 64 caps. For convenience we also introduce cpu_{have,set}_named_feature which makes use of the cpu_feature macro to allow providing a hwcap name without a {KERNEL_}HWCAP_ prefix. Signed-off-by: Andrew Murray <andrew.murray@arm.com> [will: use const_ilog2() and tweak documentation] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-03-29KVM: arm64/sve: Add pseudo-register for the guest's vector lengthsDave Martin
This patch adds a new pseudo-register KVM_REG_ARM64_SVE_VLS to allow userspace to set and query the set of vector lengths visible to the guest. In the future, multiple register slices per SVE register may be visible through the ioctl interface. Once the set of slices has been determined we would not be able to allow the vector length set to be changed any more, in order to avoid userspace seeing inconsistent sets of registers. For this reason, this patch adds support for explicit finalization of the SVE configuration via the KVM_ARM_VCPU_FINALIZE ioctl. Finalization is the proper place to allocate the SVE register state storage in vcpu->arch.sve_state, so this patch adds that as appropriate. The data is freed via kvm_arch_vcpu_uninit(), which was previously a no-op on arm64. To simplify the logic for determining what vector lengths can be supported, some code is added to KVM init to work this out, in the kvm_arm_init_arch_resources() hook. The KVM_REG_ARM64_SVE_VLS pseudo-register is not exposed yet. Subsequent patches will allow SVE to be turned on for guest vcpus, making it visible. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-29KVM: arm64/sve: Add SVE support to register access ioctl interfaceDave Martin
This patch adds the following registers for access via the KVM_{GET,SET}_ONE_REG interface: * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices) * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices) * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices) In order to adapt gracefully to future architectural extensions, the registers are logically divided up into slices as noted above: the i parameter denotes the slice index. This allows us to reserve space in the ABI for future expansion of these registers. However, as of today the architecture does not permit registers to be larger than a single slice, so no code is needed in the kernel to expose additional slices, for now. The code can be extended later as needed to expose them up to a maximum of 32 slices (as carved out in the architecture itself) if they really exist someday. The registers are only visible for vcpus that have SVE enabled. They are not enumerated by KVM_GET_REG_LIST on vcpus that do not have SVE. Accesses to the FPSIMD registers via KVM_REG_ARM_CORE is not allowed for SVE-enabled vcpus: SVE-aware userspace can use the KVM_REG_ARM64_SVE_ZREG() interface instead to access the same register state. This avoids some complex and pointless emulation in the kernel to convert between the two views of these aliased registers. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-17kbuild: force all architectures except um to include mandatory-yMasahiro Yamada
Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes the common Kbuild.asm file. Factor out the duplicated include directives to scripts/Makefile.asm-generic so that no architecture would opt out of the mandatory-y mechanism. um is not forced to include mandatory-y since it is a very exceptional case which does not support UAPI. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-10Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Pseudo NMI support for arm64 using GICv3 interrupt priorities - uaccess macros clean-up (unsafe user accessors also merged but reverted, waiting for objtool support on arm64) - ptrace regsets for Pointer Authentication (ARMv8.3) key management - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the riscv maintainers) - arm64/perf updates: PMU bindings converted to json-schema, unused variable and misleading comment removed - arm64/debug fixes to ensure checking of the triggering exception level and to avoid the propagation of the UNKNOWN FAR value into the si_code for debug signals - Workaround for Fujitsu A64FX erratum 010001 - lib/raid6 ARM NEON optimisations - NR_CPUS now defaults to 256 on arm64 - Minor clean-ups (documentation/comments, Kconfig warning, unused asm-offsets, clang warnings) - MAINTAINERS update for list information to the ARM64 ACPI entry * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) arm64: mmu: drop paging_init comments arm64: debug: Ensure debug handlers check triggering exception level arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals Revert "arm64: uaccess: Implement unsafe accessors" arm64: avoid clang warning about self-assignment arm64: Kconfig.platforms: fix warning unmet direct dependencies lib/raid6: arm: optimize away a mask operation in NEON recovery routine lib/raid6: use vdupq_n_u8 to avoid endianness warnings arm64: io: Hook up __io_par() for inX() ordering riscv: io: Update __io_[p]ar() macros to take an argument asm-generic/io: Pass result of I/O accessor to __io_[p]ar() arm64: Add workaround for Fujitsu A64FX erratum 010001 arm64: Rename get_thread_info() arm64: Remove documentation about TIF_USEDFPU arm64: irqflags: Fix clang build warnings arm64: Enable the support of pseudo-NMIs arm64: Skip irqflags tracing for NMI in IRQs disabled context arm64: Skip preemption when exiting an NMI arm64: Handle serror in NMI context irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI ...
2019-02-19asm-generic: Make time32 syscall numbers optionalArnd Bergmann
We don't want new architectures to even provide the old 32-bit time_t based system calls any more, or define the syscall number macros. Add a new __ARCH_WANT_TIME32_SYSCALLS macro that gets enabled for all existing 32-bit architectures using the generic system call table, so we don't change any current behavior. Since this symbol is evaluated in user space as well, we cannot use a Kconfig CONFIG_* macro but have to define it in uapi/asm/unistd.h. On 64-bit architectures, the same system call numbers mostly refer to the system calls we want to keep, as they already pass 64-bit time_t. As new architectures no longer provide these, we need new exceptions in checksyscalls.sh. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-19asm-generic: Drop getrlimit and setrlimit syscalls from default listYury Norov
The newer prlimit64 syscall provides all the functionality of getrlimit and setrlimit syscalls and adds the pid of target process, so future architectures won't need to include getrlimit and setrlimit. Therefore drop getrlimit and setrlimit syscalls from the generic syscall list unless __ARCH_WANT_SET_GET_RLIMIT is defined by the architecture's unistd.h prior to including asm-generic/unistd.h, and adjust all architectures using the generic syscall list to define it so that no in-tree architectures are affected. Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-hexagon@vger.kernel.org Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: James Hogan <james.hogan@imgtec.com> [metag] Acked-by: Ley Foon Tan <lftan@altera.com> [nios2] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Acked-by: Vineet Gupta <vgupta@synopsys.com> #arch/arc bits Signed-off-by: Yury Norov <ynorov@marvell.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-01arm64: add ptrace regsets for ptrauth key managementKristina Martsenko
Add two new ptrace regsets, which can be used to request and change the pointer authentication keys of a thread. NT_ARM_PACA_KEYS gives access to the instruction/data address keys, and NT_ARM_PACG_KEYS to the generic authentication key. The keys are also part of the core dump file of the process. The regsets are only exposed if the kernel is compiled with CONFIG_CHECKPOINT_RESTORE=y, as the only intended use case is checkpointing and restoring processes that are using pointer authentication. (This can be changed later if there are other use cases.) Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-01-06arch: remove redundant UAPI generic-y definesMasahiro Yamada
Now that Kbuild automatically creates asm-generic wrappers for missing mandatory headers, it is redundant to list the same headers in generic-y and mandatory-y. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
2019-01-06arch: remove stale comments "UAPI Header export list"Masahiro Yamada
These comments are leftovers of commit fcc8487d477a ("uapi: export all headers under uapi directories"). Prior to that commit, exported headers must be explicitly added to header-y. Now, all headers under the uapi/ directories are exported. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-04arm64/sve: Disentangle <uapi/asm/ptrace.h> from <uapi/asm/sigcontext.h>Dave Martin
Currently, <uapi/asm/sigcontext.h> provides common definitions for describing SVE context structures that are also used by the ptrace definitions in <uapi/asm/ptrace.h>. For this reason, a #include of <asm/sigcontext.h> was added in ptrace.h, but it this turns out that this can interact badly with userspace code that tries to include ptrace.h on top of the libc headers (which may provide their own shadow definitions for sigcontext.h). To make the headers easier for userspace to consume, this patch bounces the common definitions into an __SVE_* namespace and moves them to a backend header <uapi/asm/sve_context.h> that can be included by the other headers as appropriate. This should allow ptrace.h to be used alongside libc's sigcontext.h (if any) without ill effects. This should make the situation unambiguous: <asm/sigcontext.h> is the header to include for the sigframe-specific definitions, while <asm/ptrace.h> is the header to include for ptrace-specific definitions. To avoid conflicting with existing usage, <asm/sigcontext.h> remains the canonical way to get the common definitions for SVE_VQ_MIN, sve_vq_from_vl() etc., both in userspace and in the kernel: relying on these being defined as a side effect of including just <asm/ptrace.h> was never intended to be safe. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-04arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definitionDave Martin
SVE_PT_REGS_OFFSET is supposed to indicate the offset for skipping over the ptrace NT_ARM_SVE header (struct user_sve_header) to the start of the SVE register data proper. However, currently SVE_PT_REGS_OFFSET is defined in terms of struct sve_context, which is wrong: that structure describes the SVE header in the signal frame, not in the ptrace regset. This patch fixes the definition to use the ptrace header structure struct user_sve_header instead. By good fortune, the two structures are the same size anyway, so there is no functional or ABI change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13arm64: expose user PAC bit positions via ptraceMark Rutland
When pointer authentication is in use, data/instruction pointers have a number of PAC bits inserted into them. The number and position of these bits depends on the configured TCR_ELx.TxSZ and whether tagging is enabled. ARMv8.3 allows tagging to differ for instruction and data pointers. For userspace debuggers to unwind the stack and/or to follow pointer chains, they need to be able to remove the PAC bits before attempting to use a pointer. This patch adds a new structure with masks describing the location of the PAC bits in userspace instruction and data pointers (i.e. those addressable via TTBR0), which userspace can query via PTRACE_GETREGSET. By clearing these bits from pointers (and replacing them with the value of bit 55), userspace can acquire the PAC-less versions. This new regset is exposed when the kernel is built with (user) pointer authentication support, and the address authentication feature is enabled. Otherwise, the regset is hidden. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Cc: Will Deacon <will.deacon@arm.com> [will: Fix to use vabits_user instead of VA_BITS and rename macro] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13arm64: add basic pointer authentication supportMark Rutland
This patch adds basic support for pointer authentication, allowing userspace to make use of APIAKey, APIBKey, APDAKey, APDBKey, and APGAKey. The kernel maintains key values for each process (shared by all threads within), which are initialised to random values at exec() time. The ID_AA64ISAR1_EL1.{APA,API,GPA,GPI} fields are exposed to userspace, to describe that pointer authentication instructions are available and that the kernel is managing the keys. Two new hwcaps are added for the same reason: PACA (for address authentication) and PACG (for generic authentication). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Tested-by: Adam Wallis <awallis@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> [will: Fix sizeof() usage and unroll address key initialisation] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: Add support for SB barrier and patch in over DSB; ISB sequencesWill Deacon
We currently use a DSB; ISB sequence to inhibit speculation in set_fs(). Whilst this works for current CPUs, future CPUs may implement a new SB barrier instruction which acts as an architected speculation barrier. On CPUs that support it, patch in an SB; NOP sequence over the DSB; ISB sequence and advertise the presence of the new instruction to userspace. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-25Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timekeeping updates from Thomas Gleixner: "The timers and timekeeping departement provides: - Another large y2038 update with further preparations for providing the y2038 safe timespecs closer to the syscalls. - An overhaul of the SHCMT clocksource driver - SPDX license identifier updates - Small cleanups and fixes all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) tick/sched : Remove redundant cpu_online() check clocksource/drivers/dw_apb: Add reset control clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE clocksource/drivers: Unify the names to timer-* format clocksource/drivers/sh_cmt: Add R-Car gen3 support dt-bindings: timer: renesas: cmt: document R-Car gen3 support clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines clocksource/drivers/sh_cmt: Fixup for 64-bit machines clocksource/drivers/sh_tmu: Convert to SPDX identifiers clocksource/drivers/sh_mtu2: Convert to SPDX identifiers clocksource/drivers/sh_cmt: Convert to SPDX identifiers clocksource/drivers/renesas-ostm: Convert to SPDX identifiers clocksource: Convert to using %pOFn instead of device_node.name tick/broadcast: Remove redundant check RISC-V: Request newstat syscalls y2038: signal: Change rt_sigtimedwait to use __kernel_timespec y2038: socket: Change recvmmsg to use __kernel_timespec y2038: sched: Change sched_rr_get_interval to use __kernel_timespec y2038: utimes: Rework #ifdef guards for compat syscalls ...
2018-10-24Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo updates from Eric Biederman: "I have been slowly sorting out siginfo and this is the culmination of that work. The primary result is in several ways the signal infrastructure has been made less error prone. The code has been updated so that manually specifying SEND_SIG_FORCED is never necessary. The conversion to the new siginfo sending functions is now complete, which makes it difficult to send a signal without filling in the proper siginfo fields. At the tail end of the patchset comes the optimization of decreasing the size of struct siginfo in the kernel from 128 bytes to about 48 bytes on 64bit. The fundamental observation that enables this is by definition none of the known ways to use struct siginfo uses the extra bytes. This comes at the cost of a small user space observable difference. For the rare case of siginfo being injected into the kernel only what can be copied into kernel_siginfo is delivered to the destination, the rest of the bytes are set to 0. For cases where the signal and the si_code are known this is safe, because we know those bytes are not used. For cases where the signal and si_code combination is unknown the bits that won't fit into struct kernel_siginfo are tested to verify they are zero, and the send fails if they are not. I made an extensive search through userspace code and I could not find anything that would break because of the above change. If it turns out I did break something it will take just the revert of a single change to restore kernel_siginfo to the same size as userspace siginfo. Testing did reveal dependencies on preferring the signo passed to sigqueueinfo over si->signo, so bit the bullet and added the complexity necessary to handle that case. Testing also revealed bad things can happen if a negative signal number is passed into the system calls. Something no sane application will do but something a malicious program or a fuzzer might do. So I have fixed the code that performs the bounds checks to ensure negative signal numbers are handled" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits) signal: Guard against negative signal numbers in copy_siginfo_from_user32 signal: Guard against negative signal numbers in copy_siginfo_from_user signal: In sigqueueinfo prefer sig not si_signo signal: Use a smaller struct siginfo in the kernel signal: Distinguish between kernel_siginfo and siginfo signal: Introduce copy_siginfo_from_user and use it's return value signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE signal: Fail sigqueueinfo if si_signo != sig signal/sparc: Move EMT_TAGOVF into the generic siginfo.h signal/unicore32: Use force_sig_fault where appropriate signal/unicore32: Generate siginfo in ucs32_notify_die signal/unicore32: Use send_sig_fault where appropriate signal/arc: Use force_sig_fault where appropriate signal/arc: Push siginfo generation into unhandled_exception signal/ia64: Use force_sig_fault where appropriate signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn signal/ia64: Use the generic force_sigsegv in setup_frame signal/arm/kvm: Use send_sig_mceerr signal/arm: Use send_sig_fault where appropriate signal/arm: Use force_sig_fault where appropriate ...
2018-10-03signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZEEric W. Biederman
Rework the defintion of struct siginfo so that the array padding struct siginfo to SI_MAX_SIZE can be placed in a union along side of the rest of the struct siginfo members. The result is that we no longer need the __ARCH_SI_PREAMBLE_SIZE or SI_PAD_SIZE definitions. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-14arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3Will Deacon
On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD state without needing to call into firmware. This patch hooks into the existing SSBD infrastructure so that SSBS is used on CPUs that support it, but it's all made horribly complicated by the very real possibility of big/little systems that don't uniformly provide the new capability. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-14arm64: cpufeature: Detect SSBS and advertise to userspaceWill Deacon
Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass Safe (SSBS) which can be used as a mitigation against Spectre variant 4. Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS directly, so that userspace can toggle the SSBS control without trapping to the kernel. This patch probes for the existence of SSBS and advertise the new instructions to userspace if they exist. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-08-29y2038: Remove newstat family from default syscall setArnd Bergmann
We have four generations of stat() syscalls: - the oldstat syscalls that are only used on the older architectures - the newstat family that is used on all 64-bit architectures but lacked support for large files on 32-bit architectures. - the stat64 family that is used mostly on 32-bit architectures to replace newstat - statx() to replace all of the above, adding 64-bit timestamps among other things. We already compile stat64 only on those architectures that need it, but newstat is always built, including on those that don't reference it. This adds a new __ARCH_WANT_NEW_STAT symbol along the lines of __ARCH_WANT_OLD_STAT and __ARCH_WANT_STAT64 to control compilation of newstat. All architectures that need it use an explict define, the others now get a little bit smaller, and future architecture (including 64-bit targets) won't ever see it. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-07-21arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTSDongjiu Geng
For the migrating VMs, user space may need to know the exception state. For example, in the machine A, KVM make an SError pending, when migrate to B, KVM also needs to pend an SError. This new IOCTL exports user-invisible states related to SError. Together with appropriate user space changes, user space can get/set the SError exception state to do migrate/snapshot/suspend. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> [expanded documentation wording] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Small update for KVM: ARM: - lazy context-switching of FPSIMD registers on arm64 - "split" regions for vGIC redistributor s390: - cleanups for nested - clock handling - crypto - storage keys - control register bits x86: - many bugfixes - implement more Hyper-V super powers - implement lapic_timer_advance_ns even when the LAPIC timer is emulated using the processor's VMX preemption timer. - two security-related bugfixes at the top of the branch" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (79 commits) kvm: fix typo in flag name kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system KVM: x86: introduce linear_{read,write}_system kvm: nVMX: Enforce cpl=0 for VMX instructions kvm: nVMX: Add support for "VMWRITE to any supported field" kvm: nVMX: Restrict VMX capability MSR changes KVM: VMX: Optimize tscdeadline timer latency KVM: docs: nVMX: Remove known limitations as they do not exist now KVM: docs: mmu: KVM support exposing SLAT to guests kvm: no need to check return value of debugfs_create functions kvm: Make VM ioctl do valloc for some archs kvm: Change return type to vm_fault_t KVM: docs: mmu: Fix link to NPT presentation from KVM Forum 2008 kvm: x86: Amend the KVM_GET_SUPPORTED_CPUID API documentation KVM: x86: hyperv: declare KVM_CAP_HYPERV_TLBFLUSH capability KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}_EX implementation KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} implementation KVM: introduce kvm_make_vcpus_request_mask() API KVM: x86: hyperv: do rep check for each hypercall separately ...
2018-06-01arm64: signal: Report signal frame size to userspace via auxvDave Martin
Stateful CPU architecture extensions may require the signal frame to grow to a size that exceeds the arch's MINSIGSTKSZ #define. However, changing this #define is an ABI break. To allow userspace the option of determining the signal frame size in a more forwards-compatible way, this patch adds a new auxv entry tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame size that the process can observe during its lifetime. If AT_MINSIGSTKSZ is absent from the aux vector, the caller can assume that the MINSIGSTKSZ #define is sufficient. This allows for a consistent interface with older kernels that do not provide AT_MINSIGSTKSZ. The idea is that libc could expose this via sysconf() or some similar mechanism. There is deliberately no AT_SIGSTKSZ. The kernel knows nothing about userspace's own stack overheads and should not pretend to know. For arm64: The primary motivation for this interface is the Scalable Vector Extension, which can require at least 4KB or so of extra space in the signal frame for the largest hardware implementations. To determine the correct value, a "Christmas tree" mode (via the add_all argument) is added to setup_sigframe_layout(), to simulate addition of all possible records to the signal frame at maximum possible size. If this procedure goes wrong somehow, resulting in a stupidly large frame layout and hence failure of sigframe_alloc() to allocate a record to the frame, then this is indicative of a kernel bug. In this case, we WARN() and no attempt is made to populate AT_MINSIGSTKSZ for userspace. For arm64 SVE: The SVE context block in the signal frame needs to be considered too when computing the maximum possible signal frame size. Because the size of this block depends on the vector length, this patch computes the size based not on the thread's current vector length but instead on the maximum possible vector length: this determines the maximum size of SVE context block that can be observed in any signal frame for the lifetime of the process. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-25KVM: arm/arm64: Add KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONEric Auger
This new attribute allows the userspace to set the base address of a reditributor region, relaxing the constraint of having all consecutive redistibutor frames contiguous. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-04-20arm/arm64: KVM: Add PSCI version selection APIMarc Zyngier
Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1 or 1.0 to a guest, defaulting to the latest version of the PSCI implementation that is compatible with the requested version. This is no different from doing a firmware upgrade on KVM. But in order to give a chance to hypothetical badly implemented guests that would have a fit by discovering something other than PSCI 0.2, let's provide a new API that allows userspace to pick one particular version of the API. This is implemented as a new class of "firmware" registers, where we expose the PSCI version. This allows the PSCI version to be save/restored as part of a guest migration, and also set to any supported version if the guest requires it. Cc: stable@vger.kernel.org #4.16 Reviewed-by: Christoffer Dall <cdall@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-20arm64: fpsimd: Fix bad si_code for undiagnosed SIGFPEDave Martin
Currently a SIGFPE delivered in response to a floating-point exception trap may have si_code set to 0 on arm64. As reported by Eric, this is a bad idea since this is the value of SI_USER -- yet this signal is definitely not the result of kill(2), tgkill(2) etc. and si_uid and si_pid make limited sense whereas we do want to yield a value for si_addr (which doesn't exist for SI_USER). It's not entirely clear whether the architecure permits a "spurious" fp exception trap where none of the exception flag bits in ESR_ELx is set. (IMHO the architectural intent is to forbid this.) However, it does permit those bits to contain garbage if the TFV bit in ESR_ELx is 0. That case isn't currently handled at all and may result in si_code == 0 or si_code containing a FPE_FLT* constant corresponding to an exception that did not in fact happen. There is nothing sensible we can return for si_code in such cases, but SI_USER is certainly not appropriate and will lead to violation of legitimate userspace assumptions. This patch allocates a new si_code value FPE_UNKNOWN that at least does not conflict with any existing SI_* or FPE_* code, and yields this in si_code for undiagnosable cases. This is probably the best simplicity/incorrectness tradeoff achieveable without relying on implementation-dependent features or adding a lot of code. In any case, there appears to be no perfect solution possible that would justify a lot of effort here. Yielding FPE_UNKNOWN when some well-defined fp exception caused the trap is a violation of POSIX, but this is forced by the architecture. We have no realistic prospect of yielding the correct code in such cases. At present I am not aware of any ARMv8 implementation that supports trapped floating-point exceptions in any case. The new code may be applicable to other architectures for similar reasons. No attempt is made to provide ESR_ELx to userspace in the signal frame, since architectural limitations mean that it is unlikely to provide much diagnostic value, doesn't benefit existing software and would create ABI with no proven purpose. The existing mechanism for passing it also has problems of its own which may result in the wrong value being passed to userspace due to interaction with mm faults. The implied rework does not appear justified. Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-19arm64: Expose Arm v8.4 featuresSuzuki K Poulose
Expose the new features introduced by Arm v8.4 extensions to Arm v8-A profile. These include : 1) Data indpendent timing of instructions. (DIT, exposed as HWCAP_DIT) 2) Unaligned atomic instructions and Single-copy atomicity of loads and stores. (AT, expose as HWCAP_USCAT) 3) LDAPR and STLR instructions with immediate offsets (extension to LRCPC, exposed as HWCAP_ILRCPC) 4) Flag manipulation instructions (TS, exposed as HWCAP_FLAGM). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09arm64: signal: Ensure si_code is valid for all fault signalsDave Martin
Currently, as reported by Eric, an invalid si_code value 0 is passed in many signals delivered to userspace in response to faults and other kernel errors. Typically 0 is passed when the fault is insufficiently diagnosable or when there does not appear to be any sensible alternative value to choose. This appears to violate POSIX, and is intuitively wrong for at least two reasons arising from the fact that 0 == SI_USER: 1) si_code is a union selector, and SI_USER (and si_code <= 0 in general) implies the existence of a different set of fields (siginfo._kill) from that which exists for a fault signal (siginfo._sigfault). However, the code raising the signal typically writes only the _sigfault fields, and the _kill fields make no sense in this case. Thus when userspace sees si_code == 0 (SI_USER) it may legitimately inspect fields in the inactive union member _kill and obtain garbage as a result. There appears to be software in the wild relying on this, albeit generally only for printing diagnostic messages. 2) Software that wants to be robust against spurious signals may discard signals where si_code == SI_USER (or <= 0), or may filter such signals based on the si_uid and si_pid fields of siginfo._sigkill. In the case of fault signals, this means that important (and usually fatal) error conditions may be silently ignored. In practice, many of the faults for which arm64 passes si_code == 0 are undiagnosable conditions such as exceptions with syndrome values in ESR_ELx to which the architecture does not yet assign any meaning, or conditions indicative of a bug or error in the kernel or system and thus that are unrecoverable and should never occur in normal operation. The approach taken in this patch is to translate all such undiagnosable or "impossible" synchronous fault conditions to SIGKILL, since these are at least probably localisable to a single process. Some of these conditions should really result in a kernel panic, but due to the lack of diagnostic information it is difficult to be certain: this patch does not add any calls to panic(), but this could change later if justified. Although si_code will not reach userspace in the case of SIGKILL, it is still desirable to pass a nonzero value so that the common siginfo handling code can detect incorrect use of si_code == 0 without false positives. In this case the si_code dependent siginfo fields will not be correctly initialised, but since they are not passed to userspace I deem this not to matter. A few faults can reasonably occur in realistic userspace scenarios, and _should_ raise a regular, handleable (but perhaps not ignorable/blockable) signal: for these, this patch attempts to choose a suitable standard si_code value for the raised signal in each case instead of 0. arm64 was the only arch to define a BUS_FIXME code, so after this patch nobody defines it. This patch therefore also removes the relevant code from siginfo_layout(). Cc: James Morse <james.morse@arm.com> Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-30Merge branch 'siginfo-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull siginfo cleanups from Eric Biederman: "Long ago when 2.4 was just a testing release copy_siginfo_to_user was made to copy individual fields to userspace, possibly for efficiency and to ensure initialized values were not copied to userspace. Unfortunately the design was complex, it's assumptions unstated, and humans are fallible and so while it worked much of the time that design failed to ensure unitialized memory is not copied to userspace. This set of changes is part of a new design to clean up siginfo and simplify things, and hopefully make the siginfo handling robust enough that a simple inspection of the code can be made to ensure we don't copy any unitializied fields to userspace. The design is to unify struct siginfo and struct compat_siginfo into a single definition that is shared between all architectures so that anyone adding to the set of information shared with struct siginfo can see the whole picture. Hopefully ensuring all future si_code assignments are arch independent. The design is to unify copy_siginfo_to_user32 and copy_siginfo_from_user32 so that those function are complete and cope with all of the different cases documented in signinfo_layout. I don't think there was a single implementation of either of those functions that was complete and correct before my changes unified them. The design is to introduce a series of helpers including force_siginfo_fault that take the values that are needed in struct siginfo and build the siginfo structure for their callers. Ensuring struct siginfo is built correctly. The remaining work for 4.17 (unless someone thinks it is post -rc1 material) is to push usage of those helpers down into the architectures so that architecture specific code will not need to deal with the fiddly work of intializing struct siginfo, and then when struct siginfo is guaranteed to be fully initialized change copy siginfo_to_user into a simple wrapper around copy_to_user. Further there is work in progress on the issues that have been documented requires arch specific knowledge to sort out. The changes below fix or at least document all of the issues that have been found with siginfo generation. Then proceed to unify struct siginfo the 32 bit helpers that copy siginfo to and from userspace, and generally clean up anything that is not arch specific with regards to siginfo generation. It is a lot but with the unification you can of siginfo you can already see the code reduction in the kernel" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (45 commits) signal/memory-failure: Use force_sig_mceerr and send_sig_mceerr mm/memory_failure: Remove unused trapno from memory_failure signal/ptrace: Add force_sig_ptrace_errno_trap and use it where needed signal/powerpc: Remove unnecessary signal_code parameter of do_send_trap signal: Helpers for faults with specialized siginfo layouts signal: Add send_sig_fault and force_sig_fault signal: Replace memset(info,...) with clear_siginfo for clarity signal: Don't use structure initializers for struct siginfo signal/arm64: Better isolate the COMPAT_TASK portion of ptrace_hbptriggered ptrace: Use copy_siginfo in setsiginfo and getsiginfo signal: Unify and correct copy_siginfo_to_user32 signal: Remove the code to clear siginfo before calling copy_siginfo_from_user32 signal: Unify and correct copy_siginfo_from_user32 signal/blackfin: Remove pointless UID16_SIGINFO_COMPAT_NEEDED signal/blackfin: Move the blackfin specific si_codes to asm-generic/siginfo.h signal/tile: Move the tile specific si_codes to asm-generic/siginfo.h signal/frv: Move the frv specific si_codes to asm-generic/siginfo.h signal/ia64: Move the ia64 specific si_codes to asm-generic/siginfo.h signal/powerpc: Remove redefinition of NSIGTRAP on powerpc signal: Move addr_lsb into the _sigfault union for clarity ...