Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic version
of the MMIO accessors, which is intended to help understand
problems with device drivers and has been part of Qualcomm's vendor
kernels for many years
- A patch from Sebastian Siewior to rework the handling of IRQ stacks
in softirqs across architectures, which is needed for enabling
PREEMPT_RT
- The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
the code behind that, after the last users of this old interface
made it in through the netdev, scsi, media and staging trees"
* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
uapi: asm-generic: fcntl: Fix typo 'the the' in comment
arch/*/: remove CONFIG_VIRT_TO_BUS
soc: qcom: geni: Disable MMIO tracing for GENI SE
serial: qcom_geni_serial: Disable MMIO tracing for geni serial
asm-generic/io: Add logging support for MMIO accessors
KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
lib: Add register read/write tracing support
drm/meson: Fix overflow implicit truncation warnings
irqchip/tegra: Fix overflow implicit truncation warnings
coresight: etm4x: Use asm-generic IO memory barriers
arm64: io: Use asm-generic high level MMIO accessors
arch/*: Disable softirq stacks on PREEMPT_RT.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Consolidate duplicated 'next function' scanning and extend to allow
'isolated functions' on s390, similar to existing hypervisors
(Niklas Schnelle)
Resource management:
- Implement pci_iobar_pfn() for sparc, which allows us to remove the
sparc-specific pci_mmap_page_range() and pci_mmap_resource_range().
This removes the ability to map the entire PCI I/O space using
/proc/bus/pci, but we believe that's already been broken since
v2.6.28 (Arnd Bergmann)
- Move common PCI definitions to asm-generic/pci.h and rework others
to be be more specific and more encapsulated in arches that need
them (Stafford Horne)
Power management:
- Convert drivers to new *_PM_OPS macros to avoid need for '#ifdef
CONFIG_PM_SLEEP' or '__maybe_unused' (Bjorn Helgaas)
Virtualization:
- Add ACS quirk for Broadcom BCM5750x multifunction NICs that isolate
the functions but don't advertise an ACS capability (Pavan Chebbi)
Error handling:
- Clear PCI Status register during enumeration in case firmware left
errors logged (Kai-Heng Feng)
- When we have native control of AER, enable error reporting for all
devices that support AER. Previously only a few drivers enabled
this (Stefan Roese)
- Keep AER error reporting enabled for switches. Previously we
enabled this during enumeration but immediately disabled it (Stefan
Roese)
- Iterate over error counters instead of error strings to avoid
printing junk in AER sysfs counters (Mohamed Khalfella)
ASPM:
- Remove pcie_aspm_pm_state_change() so ASPM config changes, e.g.,
via sysfs, are not lost across power state changes (Kai-Heng Feng)
Endpoint framework:
- Don't stop an EPC when unbinding an EPF from it (Shunsuke Mie)
Endpoint embedded DMA controller driver:
- Simplify and clean up support for the DesignWare embedded DMA
(eDMA) controller (Frank Li, Serge Semin)
Broadcom STB PCIe controller driver:
- Avoid config space accesses when link is down because we can't
recover from the CPU aborts these cause (Jim Quinlan)
- Look for power regulators described under Root Ports in DT and
enable them before scanning the secondary bus (Jim Quinlan)
- Disable/enable regulators in suspend/resume (Jim Quinlan)
Freescale i.MX6 PCIe controller driver:
- Simplify and clean up clock and PHY management (Richard Zhu)
- Disable/enable regulators in suspend/resume (Richard Zhu)
- Set PCIE_DBI_RO_WR_EN before writing DBI registers (Richard Zhu)
- Allow speeds faster than Gen2 (Richard Zhu)
- Make link being down a non-fatal error so controller probe doesn't
fail if there are no Endpoints connected (Richard Zhu)
Loongson PCIe controller driver:
- Add ACPI and MCFG support for Loongson LS7A (Huacai Chen)
- Avoid config reads to non-existent LS2K/LS7A devices because a
hardware defect causes machine hangs (Huacai Chen)
- Work around LS7A integrated devices that report incorrect Interrupt
Pin values (Jianmin Lv)
Marvell Aardvark PCIe controller driver:
- Add support for AER and Slot capability on emulated bridge (Pali
Rohár)
MediaTek PCIe controller driver:
- Add Airoha EN7532 to DT binding (John Crispin)
- Allow building of driver for ARCH_AIROHA (Felix Fietkau)
MediaTek PCIe Gen3 controller driver:
- Print decoded LTSSM state when the link doesn't come up (Jianjun
Wang)
NVIDIA Tegra194 PCIe controller driver:
- Convert DT binding to json-schema (Vidya Sagar)
- Add DT bindings and driver support for Tegra234 Root Port and
Endpoint mode (Vidya Sagar)
- Fix some Root Port interrupt handling issues (Vidya Sagar)
- Set default Max Payload Size to 256 bytes (Vidya Sagar)
- Fix Data Link Feature capability programming (Vidya Sagar)
- Extend Endpoint mode support to devices beyond Controller-5 (Vidya
Sagar)
Qualcomm PCIe controller driver:
- Rework clock, reset, PHY power-on ordering to avoid hangs and
improve consistency (Robert Marko, Christian Marangi)
- Move pipe_clk handling to PHY drivers (Dmitry Baryshkov)
- Add IPQ60xx support (Selvam Sathappan Periakaruppan)
- Allow ASPM L1 and substates for 2.7.0 (Krishna chaitanya chundru)
- Add support for more than 32 MSI interrupts (Dmitry Baryshkov)
Renesas R-Car PCIe controller driver:
- Convert DT binding to json-schema (Herve Codina)
- Add Renesas RZ/N1D (R9A06G032) to rcar-gen2 DT binding and driver
(Herve Codina)
Samsung Exynos PCIe controller driver:
- Fix phy-exynos-pcie driver so it follows the 'phy_init() before
phy_power_on()' PHY programming model (Marek Szyprowski)
Synopsys DesignWare PCIe controller driver:
- Simplify and clean up the DWC core extensively (Serge Semin)
- Fix an issue with programming the ATU for regions that cross a 4GB
boundary (Serge Semin)
- Enable the CDM check if 'snps,enable-cdm-check' exists; previously
we skipped it if 'num-lanes' was absent (Serge Semin)
- Allocate a 32-bit DMA-able page to be MSI target instead of using a
driver data structure that may not be addressable with 32-bit
address (Will McVicker)
- Add DWC core support for more than 32 MSI interrupts (Dmitry
Baryshkov)
Xilinx Versal CPM PCIe controller driver:
- Add DT binding and driver support for Versal CPM5 Gen5 Root Port
(Bharat Kumar Gogada)"
* tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (150 commits)
PCI: imx6: Support more than Gen2 speed link mode
PCI: imx6: Set PCIE_DBI_RO_WR_EN before writing DBI registers
PCI: imx6: Reformat suspend callback to keep symmetric with resume
PCI: imx6: Move the imx6_pcie_ltssm_disable() earlier
PCI: imx6: Disable clocks in reverse order of enable
PCI: imx6: Do not hide PHY driver callbacks and refine the error handling
PCI: imx6: Reduce resume time by only starting link if it was up before suspend
PCI: imx6: Mark the link down as non-fatal error
PCI: imx6: Move regulator enable out of imx6_pcie_deassert_core_reset()
PCI: imx6: Turn off regulator when system is in suspend mode
PCI: imx6: Call host init function directly in resume
PCI: imx6: Disable i.MX6QDL clock when disabling ref clocks
PCI: imx6: Propagate .host_init() errors to caller
PCI: imx6: Collect clock enables in imx6_pcie_clk_enable()
PCI: imx6: Factor out ref clock disable to match enable
PCI: imx6: Move imx6_pcie_clk_disable() earlier
PCI: imx6: Move imx6_pcie_enable_ref_clk() earlier
PCI: imx6: Move PHY management functions together
PCI: imx6: Move imx6_pcie_grp_offset(), imx6_pcie_configure_type() earlier
PCI: imx6: Convert to NOIRQ_SYSTEM_SLEEP_PM_OPS()
...
|
|
Pull kvm updates from Paolo Bonzini:
"Quite a large pull request due to a selftest API overhaul and some
patches that had come in too late for 5.19.
ARM:
- Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
- Rework of the sysreg access from userspace, with a complete rewrite
of the vgic-v3 view to allign with the rest of the infrastructure
- Disagregation of the vcpu flags in separate sets to better track
their use model.
- A fix for the GICv2-on-v3 selftest
- A small set of cosmetic fixes
RISC-V:
- Track ISA extensions used by Guest using bitmap
- Added system instruction emulation framework
- Added CSR emulation framework
- Added gfp_custom flag in struct kvm_mmu_memory_cache
- Added G-stage ioremap() and iounmap() functions
- Added support for Svpbmt inside Guest
s390:
- add an interface to provide a hypervisor dump for secure guests
- improve selftests to use TAP interface
- enable interpretive execution of zPCI instructions (for PCI
passthrough)
- First part of deferred teardown
- CPU Topology
- PV attestation
- Minor fixes
x86:
- Permit guests to ignore single-bit ECC errors
- Intel IPI virtualization
- Allow getting/setting pending triple fault with
KVM_GET/SET_VCPU_EVENTS
- PEBS virtualization
- Simplify PMU emulation by just using PERF_TYPE_RAW events
- More accurate event reinjection on SVM (avoid retrying
instructions)
- Allow getting/setting the state of the speaker port data bit
- Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls
are inconsistent
- "Notify" VM exit (detect microarchitectural hangs) for Intel
- Use try_cmpxchg64 instead of cmpxchg64
- Ignore benign host accesses to PMU MSRs when PMU is disabled
- Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
- Allow NX huge page mitigation to be disabled on a per-vm basis
- Port eager page splitting to shadow MMU as well
- Enable CMCI capability by default and handle injected UCNA errors
- Expose pid of vcpu threads in debugfs
- x2AVIC support for AMD
- cleanup PIO emulation
- Fixes for LLDT/LTR emulation
- Don't require refcounted "struct page" to create huge SPTEs
- Miscellaneous cleanups:
- MCE MSR emulation
- Use separate namespaces for guest PTEs and shadow PTEs bitmasks
- PIO emulation
- Reorganize rmap API, mostly around rmap destruction
- Do not workaround very old KVM bugs for L0 that runs with nesting enabled
- new selftests API for CPUID
Generic:
- Fix races in gfn->pfn cache refresh; do not pin pages tracked by
the cache
- new selftests API using struct kvm_vcpu instead of a (vm, id)
tuple"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits)
selftests: kvm: set rax before vmcall
selftests: KVM: Add exponent check for boolean stats
selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test
selftests: KVM: Check stat name before other fields
KVM: x86/mmu: remove unused variable
RISC-V: KVM: Add support for Svpbmt inside Guest/VM
RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap()
RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache
RISC-V: KVM: Add extensible CSR emulation framework
RISC-V: KVM: Add extensible system instruction emulation framework
RISC-V: KVM: Factor-out instruction emulation into separate sources
RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run
RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function
RISC-V: KVM: Fix variable spelling mistake
RISC-V: KVM: Improve ISA extension by using a bitmap
KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()
KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog
KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT
KVM: x86: Do not block APIC write for non ICR registers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Enable mirrored memory for arm64
- Fix up several abuses of the efivar API
- Refactor the efivar API in preparation for moving the 'business
logic' part of it into efivarfs
- Enable ACPI PRM on arm64
* tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
ACPI: Move PRM config option under the main ACPI config
ACPI: Enable Platform Runtime Mechanism(PRM) support on ARM64
ACPI: PRM: Change handler_addr type to void pointer
efi: Simplify arch_efi_call_virt() macro
drivers: fix typo in firmware/efi/memmap.c
efi: vars: Drop __efivar_entry_iter() helper which is no longer used
efi: vars: Use locking version to iterate over efivars linked lists
efi: pstore: Omit efivars caching EFI varstore access layer
efi: vars: Add thin wrapper around EFI get/set variable interface
efi: vars: Don't drop lock in the middle of efivar_init()
pstore: Add priv field to pstore_record for backend specific use
Input: applespi - avoid efivars API and invoke EFI services directly
selftests/kexec: remove broken EFI_VARS secure boot fallback check
brcmfmac: Switch to appropriate helper to load EFI variable contents
iwlwifi: Switch to proper EFI variable store interface
media: atomisp_gmin_platform: stop abusing efivar API
efi: efibc: avoid efivar API for setting variables
efi: avoid efivars layer when loading SSDTs from variables
efi: Correct comment on efi_memmap_alloc
memblock: Disable mirror feature if kernelcore is not specified
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"Though there's been a decent amount of RNG-related development during
this last cycle, not all of it is coming through this tree, as this
cycle saw a shift toward tackling early boot time seeding issues,
which took place in other trees as well.
Here's a summary of the various patches:
- The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot
option have been removed, as they overlapped with the more widely
supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and
"random.trust_cpu". This change allowed simplifying a bit of arch
code.
- x86's RDRAND boot time test has been made a bit more robust, with
RDRAND disabled if it's clearly producing bogus results. This would
be a tip.git commit, technically, but I took it through random.git
to avoid a large merge conflict.
- The RNG has long since mixed in a timestamp very early in boot, on
the premise that a computer that does the same things, but does so
starting at different points in wall time, could be made to still
produce a different RNG state. Unfortunately, the clock isn't set
early in boot on all systems, so now we mix in that timestamp when
the time is actually set.
- User Mode Linux now uses the host OS's getrandom() syscall to
generate a bootloader RNG seed and later on treats getrandom() as
the platform's RDRAND-like faculty.
- The arch_get_random_{seed_,}_long() family of functions is now
arch_get_random_{seed_,}_longs(), which enables certain platforms,
such as s390, to exploit considerable performance advantages from
requesting multiple CPU random numbers at once, while at the same
time compiling down to the same code as before on platforms like
x86.
- A small cleanup changing a cmpxchg() into a try_cmpxchg(), from
Uros.
- A comment spelling fix"
More info about other random number changes that come in through various
architecture trees in the full commentary in the pull request:
https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/
* tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: correct spelling of "overwrites"
random: handle archrandom with multiple longs
um: seed rng using host OS rng
random: use try_cmpxchg in _credit_init_bits
timekeeping: contribute wall clock to rng on time change
x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu"
random: remove CONFIG_ARCH_RANDOM
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Aside from the one EVM cleanup patch, all the other changes are kexec
related.
On different architectures different keyrings are used to verify the
kexec'ed kernel image signature. Here are a number of preparatory
cleanup patches and the patches themselves for making the keyrings -
builtin_trusted_keyring, .machine, .secondary_trusted_keyring, and
.platform - consistent across the different architectures"
* tag 'integrity-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
kexec, KEYS, s390: Make use of built-in and secondary keyring for signature verification
arm64: kexec_file: use more system keyrings to verify kernel image signature
kexec, KEYS: make the code in bzImage64_verify_sig generic
kexec: clean up arch_kexec_kernel_verify_sig
kexec: drop weak attribute from functions
kexec_file: drop weak attribute from functions
evm: Use IS_ENABLED to initialize .enabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
"Only updating the turbostat tool here, no kernel changes"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2022.07.28
tools/power turbostat: do not decode ACC for ICX and SPR
tools/power turbostat: fix SPR PC6 limits
tools/power turbostat: cleanup 'automatic_cstate_conversion_probe()'
tools/power turbostat: separate SPR from ICX
tools/power turbosstat: fix comment
tools/power turbostat: Support RAPTORLAKE P
tools/power turbostat: add support for ALDERLAKE_N
tools/power turbostat: dump secondary Turbo-Ratio-Limit
tools/power turbostat: simplify dump_turbo_ratio_limits()
tools/power turbostat: dump CPUID.7.EDX.Hybrid
tools/power turbostat: update turbostat.8
tools/power turbostat: Show uncore frequency
tools/power turbostat: Fix file pointer leak
tools/power turbostat: replace strncmp with single character compare
tools/power turbostat: print the kernel boot commandline
tools/power turbostat: Introduce support for RaptorLake
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
- Fix Intel Alder Lake PEBS memory access latency & data source
profiling info bugs.
- Use Intel large-PEBS hardware feature in more circumstances, to
reduce PMI overhead & reduce sampling data.
- Extend the lost-sample profiling output with the PERF_FORMAT_LOST ABI
variant, which tells tooling the exact number of samples lost.
- Add new IBS register bits definitions.
- AMD uncore events: Add PerfMonV2 DF (Data Fabric) enhancements.
* tag 'perf-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/ibs: Add new IBS register bits into header
perf/x86/intel: Fix PEBS data source encoding for ADL
perf/x86/intel: Fix PEBS memory access info encoding for ADL
perf/core: Add a new read format to get a number of lost samples
perf/x86/amd/uncore: Add PerfMonV2 RDPMC assignments
perf/x86/amd/uncore: Add PerfMonV2 DF event format
perf/x86/amd/uncore: Detect available DF counters
perf/x86/amd/uncore: Use attr_update for format attributes
perf/x86/amd/uncore: Use dynamic events array
x86/events/intel/ds: Enable large PEBS for PERF_SAMPLE_WEIGHT_TYPE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Have invalid MSR accesses warnings appear only once after a
pr_warn_once() change broke that
- Simplify {JMP,CALL}_NOSPEC and let the objtool retpoline patching
infra take care of them instead of having unreadable alternative
macros there
* tag 'x86_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/extable: Fix ex_handler_msr() print condition
x86,nospec: Simplify {JMP,CALL}_NOSPEC
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Borislav Petkov:
- Remove the vendor check when selecting MWAIT as the default idle
state
- Respect idle=nomwait when supplied on the kernel cmdline
- Two small cleanups
* tag 'x86_cpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Use MSR_IA32_MISC_ENABLE constants
x86: Fix comment for X86_FEATURE_ZEN
x86: Remove vendor checks from prefer_mwait_c1_over_halt
x86: Handle idle=nomwait cmdline properly for x86_idle
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu update from Borislav Petkov:
- Add machinery to initialize AMX register state in order for
AMX-capable CPUs to be able to enter deeper low-power state
* tag 'x86_fpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
intel_idle: Add a new flag to initialize the AMX state
x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Borislav Petkov:
- Rename a PKRU macro to make more sense when reading the code
- Update pkeys documentation
- Avoid reading contended mm's TLB generation var if not absolutely
necessary along with fixing a case where arch_tlbbatch_flush()
doesn't adhere to the generation scheme and thus violates the
conditions for the above avoidance.
* tag 'x86_mm_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/tlb: Ignore f->new_tlb_gen when zero
x86/pkeys: Clarify PKRU_AD_KEY macro
Documentation/protection-keys: Clean up documentation for User Space pkeys
x86/mm/tlb: Avoid reading mm_tlb_gen when possible
|
|
KVM/s390, KVM/x86 and common infrastructure changes for 5.20
x86:
* Permit guests to ignore single-bit ECC errors
* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
* Intel IPI virtualization
* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
* PEBS virtualization
* Simplify PMU emulation by just using PERF_TYPE_RAW events
* More accurate event reinjection on SVM (avoid retrying instructions)
* Allow getting/setting the state of the speaker port data bit
* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
* "Notify" VM exit (detect microarchitectural hangs) for Intel
* Cleanups for MCE MSR emulation
s390:
* add an interface to provide a hypervisor dump for secure guests
* improve selftests to use TAP interface
* enable interpretive execution of zPCI instructions (for PCI passthrough)
* First part of deferred teardown
* CPU Topology
* PV attestation
* Minor fixes
Generic:
* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
x86:
* Use try_cmpxchg64 instead of cmpxchg64
* Bugfixes
* Ignore benign host accesses to PMU MSRs when PMU is disabled
* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
* x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
* Port eager page splitting to shadow MMU as well
* Enable CMCI capability by default and handle injected UCNA errors
* Expose pid of vcpu threads in debugfs
* x2AVIC support for AMD
* cleanup PIO emulation
* Fixes for LLDT/LTR emulation
* Don't require refcounted "struct page" to create huge SPTEs
x86 cleanups:
* Use separate namespaces for guest PTEs and shadow PTEs bitmasks
* PIO emulation
* Reorganize rmap API, mostly around rmap destruction
* Do not workaround very old KVM bugs for L0 that runs with nesting enabled
* new selftests API for CPUID
|
|
Intel Performance Hybrid processors have a 2nd MSR
describing the turbo limits enforced on the Ecores.
Note, TRL and Secondary-TRL are usually R/O information,
but on overclock-capable parts, they can be written.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
This reverts commit 007faec014cb5d26983c1f86fd08c6539b41392e.
Now that hyperv does its own protocol negotiation:
49d6a3c062a1 ("x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM")
revert this exposure of the sev_es_ghcb_hv_call() helper.
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by:Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
|
|
IBS support has been enhanced with two new features in upcoming uarch:
1. DataSrc extension and
2. L3 miss filtering.
Additional set of bits has been introduced in IBS registers to use these
features. Define these new bits into arch/x86/ header.
[ bp: Massage commit message. ]
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220604044519.594-7-ravi.bangoria@amd.com
|
|
The archrandom interface was originally designed for x86, which supplies
RDRAND/RDSEED for receiving random words into registers, resulting in
one function to generate an int and another to generate a long. However,
other architectures don't follow this.
On arm64, the SMCCC TRNG interface can return between one and three
longs. On s390, the CPACF TRNG interface can return arbitrary amounts,
with four longs having the same cost as one. On UML, the os_getrandom()
interface can return arbitrary amounts.
So change the api signature to take a "max_longs" parameter designating
the maximum number of longs requested, and then return the number of
longs generated.
Since callers need to check this return value and loop anyway, each arch
implementation does not bother implementing its own loop to try again to
fill the maximum number of longs. Additionally, all existing callers
pass in a constant max_longs parameter. Taken together, these two things
mean that the codegen doesn't really change much for one-word-at-a-time
platforms, while performance is greatly improved on platforms such as
s390.
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"A couple more retbleed fallout fixes.
It looks like their urgency is decreasing so it seems like we've
managed to catch whatever snafus the limited -rc testing has exposed.
Maybe we're getting ready... :)
- Make retbleed mitigations 64-bit only (32-bit will need a bit more
work if even needed, at all).
- Prevent return thunks patching of the LKDTM modules as it is not
needed there
- Avoid writing the SPEC_CTRL MSR on every kernel entry on eIBRS
parts
- Enhance error output of apply_returns() when it fails to patch a
return thunk
- A sparse fix to the sev-guest module
- Protect EFI fw calls by issuing an IBPB on AMD"
* tag 'x86_urgent_for_v5.19_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Make all RETbleed mitigations 64-bit only
lkdtm: Disable return thunks in rodata.c
x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS parts
x86/alternative: Report missing return thunk details
virt: sev-guest: Pass the appropriate argument type to iounmap()
x86/amd: Use IBPB for firmware calls
|
|
The isa_dma_bridge_buggy symbol is only used for x86_32, and only x86_32
platforms or quirks ever set it.
Add a new linux/isa-dma.h header that #defines isa_dma_bridge_buggy to 0
except on x86_32, where we keep it as a variable, and remove all the arch-
specific definitions.
[bhelgaas: commit log]
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/20220722214944.831438-3-shorne@gmail.com
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
pci_get_legacy_ide_irq() is only used on platforms that support PNP, so
many architectures define it but never use it. Replace uses of it with
ATA_PRIMARY_IRQ() and ATA_SECONDARY_IRQ(), which provide the same
functionality.
Since pci_get_legacy_ide_irq() is no longer used, remove all the
architecture-specific definitions of it as well as asm-generic/pci.h, which
only provides pci_get_legacy_ide_irq()
[bhelgaas: commit log]
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220722214944.831438-2-shorne@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma().
Provide two new MMU_GATHER_knobs to enumerate them and remove the per
arch tlb_{start,end}_vma() implementations.
- MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range()
but does *NOT* want to call it for each VMA.
- MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the
invalidate across multiple VMAs if possible.
With these it is possible to capture the three forms:
1) empty stubs;
select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS
2) start: flush_cache_range(), end: empty;
select MMU_GATHER_MERGE_VMAS
3) start: flush_cache_range(), end: flush_tlb_range();
default
Obviously, if the architecture does not have flush_cache_range() then
it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Have {JMP,CALL}_NOSPEC generate the same code GCC does for indirect
calls and rely on the objtool retpoline patching infrastructure.
There's no reason these should be alternatives while the vast bulk of
compiler generated retpolines are not.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
When a CPU enters an idle state, a non-initialized AMX register state may
be the cause of preventing a deeper low-power state. Other extended
register states whether initialized or not do not impact the CPU idle
state.
The new helper can ensure the AMX state is initialized before the CPU is
idle, and it will be used by the intel idle driver.
Check the AMX_TILE feature bit before using XGETBV1 as a chain of
dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20220608164748.11864-2-chang.seok.bae@intel.com
|
|
Commit aa44284960d5 ("x86/mm/tlb: Avoid reading mm_tlb_gen when
possible") introduced an optimization to skip superfluous TLB
flushes based on the generation provided in flush_tlb_info.
However, arch_tlbbatch_flush() does not provide any generation in
flush_tlb_info and populates the flush_tlb_info generation with
0. This 0 is causes the flush_tlb_info to be interpreted as a
superfluous, old flush. As a result, try_to_unmap_one() would
not perform any TLB flushes.
Fix it by checking whether f->new_tlb_gen is nonzero. Zero value
is anyhow is an invalid generation value. To avoid future
confusion, introduce TLB_GENERATION_INVALID constant and use it
properly. Add warnings to ensure no partial flushes are done with
TLB_GENERATION_INVALID or when f->mm is NULL, since this does not
make any sense.
In addition, add the missing unlikely().
[ dhansen: change VM_BUG_ON() -> VM_WARN_ON(), clarify changelog ]
Fixes: aa44284960d5 ("x86/mm/tlb: Avoid reading mm_tlb_gen when possible")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/20220710232837.3618-1-namit@vmware.com
|
|
On AMD IBRS does not prevent Retbleed; as such use IBPB before a
firmware call to flush the branch history state.
And because in order to do an EFI call, the kernel maps a whole lot of
the kernel page table into the EFI page table, do an IBPB just in case
in order to prevent the scenario of poisoning the BTB and causing an EFI
call using the unprotected RET there.
[ bp: Massage. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220715194550.793957-1-cascardo@canonical.com
|
|
When RDRAND was introduced, there was much discussion on whether it
should be trusted and how the kernel should handle that. Initially, two
mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
"nordrand", a boot-time switch.
Later the thinking evolved. With a properly designed RNG, using RDRAND
values alone won't harm anything, even if the outputs are malicious.
Rather, the issue is whether those values are being *trusted* to be good
or not. And so a new set of options were introduced as the real
ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu".
With these options, RDRAND is used, but it's not always credited. So in
the worst case, it does nothing, and in the best case, maybe it helps.
Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the
center and became something certain platforms force-select.
The old options don't really help with much, and it's a bit odd to have
special handling for these instructions when the kernel can deal fine
with the existence or untrusted existence or broken existence or
non-existence of that CPU capability.
Simplify the situation by removing CONFIG_ARCH_RANDOM and using the
ordinary asm-generic fallback pattern instead, keeping the two options
that are actually used. For now it leaves "nordrand" for now, as the
removal of that will take a different route.
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Drop __weak attribute from functions in kexec_core.c:
- machine_kexec_post_load()
- arch_kexec_protect_crashkres()
- arch_kexec_unprotect_crashkres()
- crash_free_reserved_phys_range()
Link: https://lkml.kernel.org/r/c0f6219e03cb399d166d518ab505095218a902dd.1656659357.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
As requested
(http://lkml.kernel.org/r/87ee0q7b92.fsf@email.froward.int.ebiederm.org),
this series converts weak functions in kexec to use the #ifdef approach.
Quoting the 3e35142ef99fe ("kexec_file: drop weak attribute from
arch_kexec_apply_relocations[_add]") changelog:
: Since commit d1bcae833b32f1 ("ELF: Don't generate unused section symbols")
: [1], binutils (v2.36+) started dropping section symbols that it thought
: were unused. This isn't an issue in general, but with kexec_file.c, gcc
: is placing kexec_arch_apply_relocations[_add] into a separate
: .text.unlikely section and the section symbol ".text.unlikely" is being
: dropped. Due to this, recordmcount is unable to find a non-weak symbol in
: .text.unlikely to generate a relocation record against.
This patch (of 2);
Drop __weak attribute from functions in kexec_file.c:
- arch_kexec_kernel_image_probe()
- arch_kimage_file_post_load_cleanup()
- arch_kexec_kernel_image_load()
- arch_kexec_locate_mem_hole()
- arch_kexec_kernel_verify_sig()
arch_kexec_kernel_image_load() calls into kexec_image_load_default(), so
drop the static attribute for the latter.
arch_kexec_kernel_verify_sig() is not overridden by any architecture, so
drop the __weak attribute.
Link: https://lkml.kernel.org/r/cover.1656659357.git.naveen.n.rao@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/2cd7ca1fe4d6bb6ca38e3283c717878388ed6788.1656659357.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
Clang warns:
arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection]
DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
^
arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here
extern u64 x86_spec_ctrl_current;
^
1 error generated.
The declaration should be using DECLARE_PER_CPU instead so all
attributes stay in sync.
Cc: stable@vger.kernel.org
Fixes: fc02735b14ff ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Restrict get_mt_mask() to a u8 and reintroduce using a RET0 static_call
for the SVM implementation. EPT stores the memtype information in the
lower 8 bits (bits 6:3 to be precise), and even returns a shifted u8
without an explicit cast to a larger type; there's no need to return a
full u64.
Note, RET0 doesn't play nice with a u64 return on 32-bit kernels, see
commit bf07be36cd88 ("KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for
get_mt_mask").
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220714153707.3239119-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a "UD" clause to KVM_X86_QUIRK_MWAIT_NEVER_FAULTS to make it clear
that the quirk only controls the #UD behavior of MONITOR/MWAIT. KVM
doesn't currently enforce fault checks when MONITOR/MWAIT are supported,
but that could change in the future. SVM also has a virtualization hole
in that it checks all faults before intercepts, and so "never faults" is
already a lie when running on SVM.
Fixes: bfbcc81bb82c ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-4-seanjc@google.com
|
|
The result of gva_to_gpa() is physical address not virtual address,
it is odd that UNMAPPED_GVA macro is used as the result for physical
address. Replace UNMAPPED_GVA with INVALID_GPA and drop UNMAPPED_GVA
macro.
No functional change intended.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/6104978956449467d3c68f1ad7f2c2f6d771d0ee.1656667239.git.houwenlong.hwl@antgroup.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
"Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the
now pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide"
* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
x86/speculation: Disable RRSBA behavior
x86/kexec: Disable RET on kexec
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/retbleed: Add fine grained Kconfig knobs
x86/cpu/amd: Enumerate BTC_NO
x86/common: Stamp out the stepping madness
KVM: VMX: Prevent RSB underflow before vmenter
x86/speculation: Fill RSB on vmexit for IBRS
KVM: VMX: Fix IBRS handling after vmexit
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Convert launched argument to flags
KVM: VMX: Flatten __vmx_vcpu_run()
objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
x86/speculation: Remove x86_spec_ctrl_mask
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
...
|
|
Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.
Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.
A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).
For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Instead of clearing the bss area in assembly code, use the clear_bss()
function.
This requires to pass the start_info address as parameter to
xen_start_kernel() in order to avoid the xen_start_info being zeroed
again.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220630071441.28576-2-jgross@suse.com
|
|
Do fine-grained Kconfig for all the various retbleed parts.
NOTE: if your compiler doesn't support return thunks this will
silently 'upgrade' your mitigation to IBPB, you might not like this.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Currently, the arch_efi_call_virt() assumes all users of it will have
defined a type 'efi_##f##_t' to make use of it.
Simplify the arch_efi_call_virt() macro by eliminating the explicit
need for efi_##f##_t type for every user of this macro.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[ardb: apply Sudeep's ARM fix to i686, Loongarch and RISC-V too]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
All architecture-independent users of virt_to_bus() and bus_to_virt()
have been fixed to use the dma mapping interfaces or have been
removed now. This means the definitions on most architectures, and the
CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
The only exceptions to this are a few network and scsi drivers for m68k
Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
with the old interfaces and are probably not worth changing.
On alpha and parisc, virt_to_bus() were still used in asm/floppy.h.
alpha can use isa_virt_to_bus() like x86 does, and parisc can just
open-code the virt_to_phys() here, as this is architecture specific
code.
I tried updating the bus-virt-phys-mapping.rst documentation, which
started as an email from Linus to explain some details of the Linux-2.0
driver interfaces. The bits about virt_to_bus() were declared obsolete
backin 2000, and the rest is not all that relevant any more, so in the
end I just decided to remove the file completely.
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
BTC_NO indicates that hardware is not susceptible to Branch Type Confusion.
Zen3 CPUs don't suffer BTC.
Hypervisors are expected to synthesise BTC_NO when it is appropriate
given the migration pool, to prevent kernels using heuristics.
[ bp: Massage. ]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Prevent RSB underflow/poisoning attacks with RSB. While at it, add a
bunch of comments to attempt to document the current state of tribal
knowledge about RSB attacks and what exactly is being mitigated.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
On eIBRS systems, the returns in the vmexit return path from
__vmx_vcpu_run() to vmx_vcpu_run() are exposed to RSB poisoning attacks.
Fix that by moving the post-vmexit spec_ctrl handling to immediately
after the vmexit.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Commit
c536ed2fffd5 ("objtool: Remove SAVE/RESTORE hints")
removed the save/restore unwind hints because they were no longer
needed. Now they're going to be needed again so re-add them.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The firmware entry code may accidentally clear STIBP or SSBD. Fix that.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
If a kernel is built with CONFIG_RETPOLINE=n, but the user still wants
to mitigate Spectre v2 using IBRS or eIBRS, the RSB filling will be
silently disabled.
There's nothing retpoline-specific about RSB buffer filling. Remove the
CONFIG_RETPOLINE guards around it.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Zen2 uarchs have an undocumented, unnamed, MSR that contains a chicken
bit for some speculation behaviour. It needs setting.
Note: very belatedly AMD released naming; it's now officially called
MSR_AMD64_DE_CFG2 and MSR_AMD64_DE_CFG2_SUPPRESS_NOBR_PRED_BIT
but shall remain the SPECTRAL CHICKEN.
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Since entry asm is tricky, add a validation pass that ensures the
retbleed mitigation has been done before the first actual RET
instruction.
Entry points are those that either have UNWIND_HINT_ENTRY, which acts
as UNWIND_HINT_EMPTY but marks the instruction as an entry point, or
those that have UWIND_HINT_IRET_REGS at +0.
This is basically a variant of validate_branch() that is
intra-function and it will simply follow all branches from marked
entry points and ensures that all paths lead to ANNOTATE_UNRET_END.
If a path hits RET or an indirection the path is a fail and will be
reported.
There are 3 ANNOTATE_UNRET_END instances:
- UNTRAIN_RET itself
- exception from-kernel; this path doesn't need UNTRAIN_RET
- all early exceptions; these also don't need UNTRAIN_RET
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
jmp2ret mitigates the easy-to-attack case at relatively low overhead.
It mitigates the long speculation windows after a mispredicted RET, but
it does not mitigate the short speculation window from arbitrary
instruction boundaries.
On Zen2, there is a chicken bit which needs setting, which mitigates
"arbitrary instruction boundaries" down to just "basic block boundaries".
But there is no fix for the short speculation window on basic block
boundaries, other than to flush the entire BTB to evict all attacker
predictions.
On the spectrum of "fast & blurry" -> "safe", there is (on top of STIBP
or no-SMT):
1) Nothing System wide open
2) jmp2ret May stop a script kiddy
3) jmp2ret+chickenbit Raises the bar rather further
4) IBPB Only thing which can count as "safe".
Tentative numbers put IBPB-on-entry at a 2.5x hit on Zen2, and a 10x hit
on Zen1 according to lmbench.
[ bp: Fixup feature bit comments, document option, 32-bit build fix. ]
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Update retpoline validation with the new CONFIG_RETPOLINE requirement of
not having bare naked RET instructions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Having IBRS enabled while the SMT sibling is idle unnecessarily slows
down the running sibling. OTOH, disabling IBRS around idle takes two
MSR writes, which will increase the idle latency.
Therefore, only disable IBRS around deeper idle states. Shallow idle
states are bounded by the tick in duration, since NOHZ is not allowed
for them by virtue of their short target residency.
Only do this for mwait-driven idle, since that keeps interrupts disabled
across idle, which makes disabling IBRS vs IRQ-entry a non-issue.
Note: C6 is a random threshold, most importantly C1 probably shouldn't
disable IBRS, benchmarking needed.
Suggested-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Skylake suffers from RSB underflow speculation issues; report this
vulnerability and it's mitigation (spectre_v2=ibrs).
[jpoimboe: cleanups, eibrs]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|