path: root/arch/x86/kernel
Age  Commit message  Author
2018-08-15x86/ACPI/cstate: Make ACPI C1 FFH MWAIT C-state description vendor-neutralPrarit Bhargava
Commit 5209654a46ee (x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems) forgot to update the ACPI C1 idle state description and tools like turbostat display "ACPI FFH INTEL MWAIT 0x0" which is quite confusing on an AMD system. Drop the "INTEL" part from the ACPI C1 FFH MWAIT C-state description to avoid confusion. Fixes: 5209654a46ee (x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems) Signed-off-by: Prarit Bhargava <prarit@redhat.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
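As a rough illustration of the change (not the literal diff), the description string is built along these lines; the buffer, length macro and cx->address names are taken from the surrounding ACPI C-state code and should be treated as assumptions:

    /* Sketch: build a vendor-neutral C1 FFH MWAIT description string. */
    snprintf(cx->desc, ACPI_CX_DESC_LEN, "ACPI FFH MWAIT 0x%x", cx->address);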
2018-08-14x86/smp: fix non-SMP broken build due to redefinition of ↵Vlastimil Babka
apic_id_is_primary_thread The function has an inline "return false;" definition with CONFIG_SMP=n but the "real" definition is also visible, leading to a "redefinition of ‘apic_id_is_primary_thread’" compiler error. Guard it with #ifdef CONFIG_SMP. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 6a4d2657e048 ("x86/smp: Provide topology_is_primary_thread()") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
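A minimal sketch of the resulting guard pattern in the header (the declaration comes from the changelog, the stub body is assumed):

    #ifdef CONFIG_SMP
    bool apic_id_is_primary_thread(unsigned int id);
    #else
    static inline bool apic_id_is_primary_thread(unsigned int id) { return false; }
    #endif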
2018-08-14Merge tag 'dma-mapping-4.19' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - a series from Robin to fix bus imposed dma limits by adding a separate mask for them to struct device instead of trying to squeeze a second meaning out of the existing dma mask as we did before. This has ACKs from the various other subsystems touched - a small swiotlb cleanup from Kees (acked by Konrad) - conversion of nios2 and sh to the new generic dma-noncoherent code. Various other architecture conversions will come through the architectures maintainers trees. * tag 'dma-mapping-4.19' of git://git.infradead.org/users/hch/dma-mapping: sh: use generic dma_noncoherent_ops sh: split arch/sh/mm/consistent.c sh: use dma_direct_ops for the CONFIG_DMA_COHERENT case sh: introduce a sh_cacheop_vaddr helper sh: simplify get_arch_dma_ops OF: Don't set default coherent DMA mask ACPI/IORT: Don't set default coherent DMA mask iommu/dma: Respect bus DMA limit for IOVAs of/device: Set bus DMA mask as appropriate ACPI/IORT: Set bus DMA mask as appropriate dma-mapping: Generalise dma_32bit_limit flag ACPI/IORT: Support address size limit for root complexes of/platform: Initialise default DMA masks nios2: use generic dma_noncoherent_ops swiotlb: clean up reporting dma-mapping: relax warning for per-device areas
2018-08-14Merge branch 'l1tf-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge L1 Terminal Fault fixes from Thomas Gleixner: "L1TF, aka L1 Terminal Fault, is yet another speculative hardware engineering trainwreck. It's a hardware vulnerability which allows unprivileged speculative access to data which is available in the Level 1 Data Cache when the page table entry controlling the virtual address, which is used for the access, has the Present bit cleared or other reserved bits set. If an instruction accesses a virtual address for which the relevant page table entry (PTE) has the Present bit cleared or other reserved bits set, then speculative execution ignores the invalid PTE and loads the referenced data if it is present in the Level 1 Data Cache, as if the page referenced by the address bits in the PTE was still present and accessible. While this is a purely speculative mechanism and the instruction will raise a page fault when it is retired eventually, the pure act of loading the data and making it available to other speculative instructions opens up the opportunity for side channel attacks by unprivileged malicious code, similar to the Meltdown attack. While Meltdown breaks the user space to kernel space protection, L1TF allows attacking any physical memory address in the system and the attack works across all protection domains. It allows an attack on SGX and also works from inside virtual machines because the speculation bypasses the extended page table (EPT) protection mechanism. The associated CVEs are: CVE-2018-3615, CVE-2018-3620, CVE-2018-3646 The mitigations provided by this pull request include: - Host side protection by inverting the upper address bits of a non-present page table entry so the entry points to uncacheable memory. - Hypervisor protection by flushing the L1 Data Cache on VMENTER. - SMT (HyperThreading) control knobs, which allow 'turning off' SMT by offlining the sibling CPU threads. The knobs are available on the kernel command line and at runtime via sysfs. - Control knobs for the hypervisor mitigation, related to L1D flush and SMT control. The knobs are available on the kernel command line and at runtime via sysfs. - Extensive documentation about L1TF including various degrees of mitigations. Thanks to all people who have contributed to this in various ways - patches, review, testing, backporting - and the fruitful, sometimes heated, but at the end constructive discussions.
There is work in progress to provide other forms of mitigation, which might be less horrible performance-wise for particular kinds of workloads, but these are not yet ready for consumption due to their complexity and limitations" * 'l1tf-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) x86/microcode: Allow late microcode loading with SMT disabled tools headers: Synchronise x86 cpufeatures.h for L1TF additions x86/mm/kmmio: Make the tracer robust against L1TF x86/mm/pat: Make set_memory_np() L1TF safe x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert x86/speculation/l1tf: Invert all not present mappings cpu/hotplug: Fix SMT supported evaluation KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Documentation/l1tf: Remove Yonah processors from not vulnerable list x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d x86: Don't include linux/irq.h from asm/hardirq.h x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() cpu/hotplug: detect SMT disabled by BIOS ...
2018-08-13Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "Early TSC based time stamping to allow better boot time analysis. This comes with a general cleanup of the TSC calibration code which grew warts and duct taping over the years and removes 250 lines of code. Initiated and mostly implemented by Pavel with help from various folks" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) x86/kvmclock: Mark kvm_get_preset_lpj() as __init x86/tsc: Consolidate init code sched/clock: Disable interrupts when calling generic_sched_clock_init() timekeeping: Prevent false warning when persistent clock is not available sched/clock: Close a hole in sched_clock_init() x86/tsc: Make use of tsc_calibrate_cpu_early() x86/tsc: Split native_calibrate_cpu() into early and late parts sched/clock: Use static key for sched_clock_running sched/clock: Enable sched clock early sched/clock: Move sched clock initialization and merge with generic clock x86/tsc: Use TSC as sched clock early x86/tsc: Initialize cyc2ns when tsc frequency is determined x86/tsc: Calibrate tsc only once ARM/time: Remove read_boot_clock64() s390/time: Remove read_boot_clock64() timekeeping: Default boot time offset to local_clock() timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset() s390/time: Add read_persistent_wall_and_boot_offset() x86/xen/time: Output xen sched_clock time from 0 x86/xen/time: Initialize pv xen time in init_hypervisor_platform() ...
2018-08-13Merge branch 'x86/pti' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull x86 PTI updates from Thomas Gleixner: "The Speck brigade sadly provides yet another large set of patches destroying the perfomance which we carefully built and preserved - PTI support for 32bit PAE. The missing counter part to the 64bit PTI code implemented by Joerg. - A set of fixes for the Global Bit mechanics for non PCID CPUs which were setting the Global Bit too widely and therefore possibly exposing interesting memory needlessly. - Protection against userspace-userspace SpectreRSB - Support for the upcoming Enhanced IBRS mode, which is preferred over IBRS. Unfortunately we dont know the performance impact of this, but it's expected to be less horrible than the IBRS hammering. - Cleanups and simplifications" * 'x86/pti' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/mm/pti: Move user W+X check into pti_finalize() x86/relocs: Add __end_rodata_aligned to S_REL x86/mm/pti: Clone kernel-image on PTE level for 32 bit x86/mm/pti: Don't clear permissions in pti_clone_pmd() x86/mm/pti: Fix 32 bit PCID check x86/mm/init: Remove freed kernel image areas from alias mapping x86/mm/init: Add helper for freeing kernel image pages x86/mm/init: Pass unconverted symbol addresses to free_init_pages() mm: Allow non-direct-map arguments to free_reserved_area() x86/mm/pti: Clear Global bit more aggressively x86/speculation: Support Enhanced IBRS on future CPUs x86/speculation: Protect against userspace-userspace spectreRSB x86/kexec: Allocate 8k PGDs for PTI Revert "perf/core: Make sure the ring-buffer is mapped in all page-tables" x86/mm: Remove in_nmi() warning from vmalloc_fault() x86/entry/32: Check for VM86 mode in slow-path check perf/core: Make sure the ring-buffer is mapped in all page-tables x86/pti: Check the return value of pti_user_pagetable_walk_pmd() x86/pti: Check the return value of pti_user_pagetable_walk_p4d() x86/entry/32: Add debug code to check entry/exit CR3 ...
2018-08-13Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Thomas Gleixner: "Two fixes for x86: - Provide a declaration for native_save_fl() which unbreaks the wreckage caused by making it 'extern inline'. - Fix the failing paravirt patching which is supposed to replace indirect with direct calls. The wreckage is caused by an incorrect clobber test" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Fix spectre-v2 mitigations for paravirt guests x86/irqflags: Provide a declaration for native_save_fl
2018-08-13Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Thomas Gleixner: "Trivial cleanups and improvements" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Remove redundant check of p == q x86/platform/olpc: Use PTR_ERR_OR_ZERO() x86/platform/UV: Mark memblock related init code and data correctly
2018-08-13Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache QoS (RDT/CAT) updates from Thomas Gleixner: "Add support for pseudo-locked cache regions. Cache Allocation Technology (CAT) makes it possible, on certain CPUs, to isolate a region of cache and 'lock' it. Cache pseudo-locking builds on the fact that a CPU can still read and write data pre-allocated outside its current allocated area on a cache hit. With cache pseudo-locking, data can be preloaded into a reserved portion of cache that no application can fill, and from that point on will only serve cache hits. The cache pseudo-locked memory is made accessible to user space where an application can map it into its virtual address space and thus have a region of memory with reduced average read latency. The locking is not perfect and gets totally screwed by WBINVD and similar mechanisms, but it provides a reasonable enhancement for certain types of latency-sensitive applications. The implementation extends the current CAT mechanism and provides a generally useful exclusive CAT mode on which it builds the extra pseudo-locked regions" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) x86/intel_rdt: Disable PMU access x86/intel_rdt: Fix possible circular lock dependency x86/intel_rdt: Make CPU information accessible for pseudo-locked regions x86/intel_rdt: Support restoration of subset of permissions x86/intel_rdt: Fix cleanup of plr structure on error x86/intel_rdt: Move pseudo_lock_region_clear() x86/intel_rdt: Limit C-states dynamically when pseudo-locking active x86/intel_rdt: Support L3 cache performance event of Broadwell x86/intel_rdt: More precise L2 hit/miss measurements x86/intel_rdt: Create character device exposing pseudo-locked region x86/intel_rdt: Create debugfs files for pseudo-locking testing x86/intel_rdt: Create resctrl debug area x86/intel_rdt: Ensure RDT cleanup on exit x86/intel_rdt: Resctrl files reflect pseudo-locked information x86/intel_rdt: Support creation/removal of pseudo-locked region x86/intel_rdt: Pseudo-lock region creation/removal core x86/intel_rdt: Discover supported platforms via prefetch disable bits x86/intel_rdt: Add utilities to test pseudo-locked region possibility x86/intel_rdt: Split resource group removal in two x86/intel_rdt: Enable entering of pseudo-locksetup mode ...
2018-08-13Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 dump printing cleanup from Thomas Gleixner: "Clean up the show_opcodes() printout so nested dumps can be properly differentiated" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Avoid pr_cont() in show_opcodes()
2018-08-13Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Thomas Gleixner: "Two small updates for the CPU code: - Improve NUMA emulation - Add the EPT_AD CPU feature bit" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpufeatures: Add EPT_AD feature bit x86/numa_emulation: Introduce uniform split capability x86/numa_emulation: Fix emulated-to-physical node mapping
2018-08-13Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Thomas Gleixner: "Trivial cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/iommu: Use NULL instead of 0 x86/platform/pcspeaker: Use PTR_ERR_OR_ZERO() to fix ptr_ret.cocci warning
2018-08-13Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Thomas Gleixner: "The lowlevel and ASM code updates for x86: - Make stack trace unwinding more reliable - ASM instruction updates for better code generation - Various cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Add two more instruction suffixes x86/asm/64: Use 32-bit XOR to zero registers x86/build/vdso: Simplify 'cmd_vdso2c' x86/build/vdso: Remove unused vdso-syms.lds x86/stacktrace: Enable HAVE_RELIABLE_STACKTRACE for the ORC unwinder x86/unwind/orc: Detect the end of the stack x86/stacktrace: Do not fail for ORC with regs on stack x86/stacktrace: Clarify the reliable success paths x86/stacktrace: Remove STACKTRACE_DUMP_ONCE x86/stacktrace: Do not unwind after user regs x86/asm: Use CC_SET/CC_OUT in percpu_cmpxchg8b_double() to micro-optimize code generation
2018-08-13Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic update from Thomas Gleixner: "Trivial cleanups of the APIC related code" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Trivial coding style fixes x86/vector: Merge allocate_vector() into assign_vector_locked()
2018-08-13Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf update from Thomas Gleixner: "The perf crowd presents: Kernel updates: - Removal of jprobes - Cleanup and consolidation of the handling of kprobes - Cleanup and consolidation of hardware breakpoints - The usual pile of fixes and updates to PMUs and event descriptors Tooling updates: - Updates and improvements all over the place. Nothing outstanding, just the (good) boring incremental grunt work" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) perf trace: Do not require --no-syscalls to suppress strace like output perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h perf tools: Allow overriding MAX_NR_CPUS at compile time perf bpf: Show better message when failing to load an object perf list: Unify metric group description format with PMU event description perf vendor events arm64: Update ThunderX2 implementation defined pmu core events perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet perf cs-etm: Fix start tracing packet handling perf build: Fix installation directory for eBPF perf c2c report: Fix crash for empty browser perf tests: Fix indexing when invoking subtests perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg perf trace beauty: Do not print NULL strarray entries perf beauty: Add a generator for IPPROTO_ socket's protocol constants tools include uapi: Grab a copy of linux/in.h perf tests: Fix complex event name parsing perf evlist: Fix error out while applying initial delay and LBR ...
2018-08-13Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - Cleanup and improvement of NUMA balancing - Refactoring and improvements to the PELT (Per Entity Load Tracking) code - Watchdog simplification and related cleanups - The usual pile of small incremental fixes and improvements * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) watchdog: Reduce message verbosity stop_machine: Reflow cpu_stop_queue_two_works() sched/numa: Move task_numa_placement() closer to numa_migrate_preferred() sched/numa: Use group_weights to identify if migration degrades locality sched/numa: Update the scan period without holding the numa_group lock sched/numa: Remove numa_has_capacity() sched/numa: Modify migrate_swap() to accept additional parameters sched/numa: Remove unused task_capacity from 'struct numa_stats' sched/numa: Skip nodes that are at 'hoplimit' sched/debug: Reverse the order of printing faults sched/numa: Use task faults only if numa_group is not yet set up sched/numa: Set preferred_node based on best_cpu sched/numa: Simplify load_too_imbalanced() sched/numa: Evaluate move once per node sched/numa: Remove redundant field sched/debug: Show the sum wait time of a task group sched/fair: Remove #ifdefs from scale_rt_capacity() sched/core: Remove get_cpu() from sched_fork() sched/cpufreq: Clarify sugov_get_util() sched/sysctl: Remove unused sched_time_avg_ms sysctl ...
2018-08-13Merge branch 'ras-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Thomas Gleixner: "A small set of changes to the RAS core: - Rework of the MCE bank scanning code - Y2038 conversion" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Cleanup __mc_scan_banks() x86/mce: Carve out bank scanning code x86/mce: Remove !banks check x86/mce: Carve out the crashing_cpu check x86/mce: Always use 64-bit timestamps
2018-08-10x86/microcode: Allow late microcode loading with SMT disabledJosh Poimboeuf
The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
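For illustration, the relaxed check amounts to something like the sketch below: rather than requiring every present CPU to be online, only the primary thread of each core has to be. Only topology_is_primary_thread() is named by the surrounding series; the helper shape and message are assumptions.

    /* Sketch: allow late microcode loading when all primary threads are online. */
    static int check_online_cpus(void)
    {
            unsigned int cpu;

            for_each_present_cpu(cpu) {
                    if (topology_is_primary_thread(cpu) && !cpu_online(cpu)) {
                            pr_err("Not all primary threads are online, aborting microcode update\n");
                            return -EINVAL;
                    }
            }
            return 0;
    }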
2018-08-08Merge branches 'arm/shmobile', 'arm/renesas', 'arm/msm', 'arm/smmu', ↵Joerg Roedel
'arm/omap', 'x86/amd', 'x86/vt-d' and 'core' into next
2018-08-07x86/paravirt: Fix spectre-v2 mitigations for paravirt guestsPeter Zijlstra
Nadav reported that on guests we're failing to rewrite the indirect calls to CALLEE_SAVE paravirt functions. In particular the pv_queued_spin_unlock() call is left unpatched and that is all over the place. This obviously wrecks Spectre-v2 mitigation (for paravirt guests) which relies on not actually having indirect calls around. The reason is an incorrect clobber test in paravirt_patch_call(); this function rewrites an indirect call with a direct call to the _SAME_ function, there is no possible way the clobbers can be different because of this. Therefore remove this clobber check. Also put WARNs on the other patch failure case (not enough room for the instruction) which I've not seen trigger in my (limited) testing. Three live kernel image disassemblies for lock_sock_nested (as a small function that illustrates the problem nicely). PRE is the current situation for guests, POST is with this patch applied and NATIVE is with or without the patch for !guests. PRE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq *0xffffffff822299e8 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. POST: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: callq 0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock> 0xffffffff817be9a5 <+53>: xchg %ax,%ax 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063aa0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. 
NATIVE: (gdb) disassemble lock_sock_nested Dump of assembler code for function lock_sock_nested: 0xffffffff817be970 <+0>: push %rbp 0xffffffff817be971 <+1>: mov %rdi,%rbp 0xffffffff817be974 <+4>: push %rbx 0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx 0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched> 0xffffffff817be981 <+17>: mov %rbx,%rdi 0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh> 0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax 0xffffffff817be98f <+31>: test %eax,%eax 0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74> 0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp) 0xffffffff817be99d <+45>: mov %rbx,%rdi 0xffffffff817be9a0 <+48>: movb $0x0,(%rdi) 0xffffffff817be9a3 <+51>: nopl 0x0(%rax) 0xffffffff817be9a7 <+55>: pop %rbx 0xffffffff817be9a8 <+56>: pop %rbp 0xffffffff817be9a9 <+57>: mov $0x200,%esi 0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi 0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip> 0xffffffff817be9ba <+74>: mov %rbp,%rdi 0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock> 0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35> End of assembler dump. Fixes: 63f70270ccd9 ("[PATCH] i386: PARAVIRT: add common patching machinery") Fixes: 3010a0663fd9 ("x86/paravirt, objtool: Annotate indirect calls") Reported-by: Nadav Amit <namit@vmware.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: stable@vger.kernel.org
2018-08-07cpu/hotplug: Fix SMT supported evaluationThomas Gleixner
Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case 'nosmt' was supplied on the kernel command line, as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wrecks the state and makes the sysfs interface unusable. Rework this so that during bringup of the non-boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitly right after the initial SMP bringup has finished. SMT evaluation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
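A hedged sketch of the reworked check: a non-primary sibling coming online proves the platform physically has SMT, which is recorded and re-evaluated once the initial SMP bringup is done. The identifiers follow the changelog, but the body is illustrative rather than the exact patch.

    static inline bool cpu_smt_allowed(unsigned int cpu)
    {
            if (cpu_smt_control == CPU_SMT_ENABLED)
                    return true;

            if (topology_is_primary_thread(cpu))
                    return true;

            /* Seeing a sibling thread proves the platform has SMT. */
            cpu_smt_available = true;
            return false;
    }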
2018-08-06xen/pv: Call get_cpu_address_sizes to set x86_virt/phys_bitsM. Vefa Bicakci
Commit d94a155c59c9 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") has moved the query and calculation of the x86_virt_bits and x86_phys_bits fields of the cpuinfo_x86 struct from the get_cpu_cap function to a new function named get_cpu_address_sizes. One of the call sites related to Xen PV VMs was unfortunately missed in the aforementioned commit. This prevents successful boot-up of kernel versions 4.17 and up in Xen PV VMs if CONFIG_DEBUG_VIRTUAL is enabled, due to the following code path: enlighten_pv.c::xen_start_kernel -> mmu_pv.c::xen_reserve_special_pages -> page.h::__pa -> physaddr.c::__phys_addr -> physaddr.h::phys_addr_valid. phys_addr_valid uses boot_cpu_data.x86_phys_bits to validate physical addresses. boot_cpu_data.x86_phys_bits is no longer populated before the call to xen_reserve_special_pages due to the aforementioned commit though, so the validation performed by phys_addr_valid fails, which causes __phys_addr to trigger a BUG, preventing boot-up. Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: xen-devel@lists.xenproject.org Cc: x86@kernel.org Cc: stable@vger.kernel.org # for v4.17 and up Fixes: d94a155c59c9 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption") Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-06Merge branch 'x86/pti-urgent' into x86/ptiThomas Gleixner
Integrate the PTI Global bit fixes which conflict with the 32bit PTI support. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-06KVM: X86: Implement PV IPIs in linux guestWanpeng Li
Implement paravirtual apic hooks to enable PV IPIs for KVM if the "send IPI" hypercall is available. The hypercall lets a guest send IPIs, with at most 128 destinations per hypercall in 64-bit mode and 64 vCPUs per hypercall in 32-bit mode. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
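A rough guest-side sketch of the mechanism on a 64-bit guest: pack the destination APIC IDs into a 128-bit bitmap and issue a single KVM_HC_SEND_IPI hypercall. The argument order, the offset handling and the treatment of APIC IDs above 127 are simplified assumptions, not the exact upstream code.

    static void kvm_send_ipi_mask_sketch(const struct cpumask *mask, int vector)
    {
            unsigned long ipi_low = 0, ipi_high = 0;
            unsigned int cpu, apic_id;

            for_each_cpu(cpu, mask) {
                    apic_id = per_cpu(x86_cpu_to_apicid, cpu);
                    if (apic_id < 64)
                            __set_bit(apic_id, &ipi_low);
                    else if (apic_id < 128)
                            __set_bit(apic_id - 64, &ipi_high);
                    /* IDs >= 128 would have to fall back to the native path. */
            }

            kvm_hypercall4(KVM_HC_SEND_IPI, ipi_low, ipi_high, 0, vector);
    }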
2018-08-06KVM: X86: Add kvm hypervisor init time platform setup callbackWanpeng Li
Add a KVM hypervisor init-time platform setup callback which will be used to replace native APIC hooks with paravirtual hooks. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-06x86/kvm: Don't use pvqspinlock code if only 1 vCPUWaiman Long
On a VM with only 1 vCPU, the locking fast path will always be successful. In this case, there is no need to use the PV qspinlock code, which has higher overhead on the unlock side than the native qspinlock code. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
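A minimal sketch of where the early bail-out lands in the PV spinlock setup; the surrounding function is abbreviated and only the single-vCPU check is the point.

    void __init kvm_spinlock_init(void)
    {
            if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
                    return;

            /* A single vCPU never contends, so keep the cheaper native locks. */
            if (num_possible_cpus() == 1)
                    return;

            /* ... the regular PV qspinlock hook setup continues here ... */
    }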
2018-08-05x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentryPaolo Bonzini
Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is not needed. Add a new value to enum vmx_l1d_flush_state, which is used either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
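A hedged fragment of the kind of check this adds before enabling the VMENTER L1D flush; the macro and enum names are of that era but should be treated as assumptions rather than the literal patch.

    if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
            u64 msr;

            rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
            if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {     /* bit 3 */
                    l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
                    return 0;
            }
    }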
2018-08-05x86/speculation: Simplify sysfs report of VMX L1TF vulnerabilityPaolo Bonzini
Three changes to the content of the sysfs file: - If EPT is disabled, L1TF cannot be exploited even across threads on the same core, and SMT is irrelevant. - If mitigation is completely disabled, and SMT is enabled, print "vulnerable" instead of "vulnerable, SMT vulnerable" - Reorder the two parts so that the main vulnerability state comes first and the detail on SMT is second. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-05Merge 4.18-rc7 into master to pick up the KVM dependencyThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-05x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1dNicolai Stange
The last missing piece to having vmx_l1d_flush() take interrupts after VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on irq entry. Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(), ipi_entering_ack_irq(), smp_reschedule_interrupt() and uv_bau_message_interrupt(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
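For illustration, the entry-side hook amounts to something like this (a sketch; the real patch touches several entry helpers, not just this one):

    static inline void entering_irq(void)
    {
            irq_enter();
            kvm_set_cpu_l1tf_flush_l1d();   /* request an L1D flush before the next VMENTER */
    }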
2018-08-05x86: Don't include linux/irq.h from asm/hardirq.hNicolai Stange
The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h -> asm/apic.h -> asm/hardirq.h -> linux/irq.h -> linux/topology.h -> linux/smp.h -> asm/smp.h or linux/gfp.h -> linux/mmzone.h -> asm/mmzone.h -> asm/mmzone_64.h -> asm/smp.h -> asm/apic.h -> asm/hardirq.h -> linux/irq.h -> linux/irqdesc.h -> linux/kobject.h -> linux/sysfs.h -> linux/kernfs.h -> linux/idr.h -> linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/hardirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86's asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to *.c files as needed. Note that some of these *.c files could be cleaned up a bit with respect to their set of #includes, but that should better be done in separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-08-03x86/intel_rdt: Disable PMU accessThomas Gleixner
Peter is objecting to the direct PMU access in RDT. Right now the PMU usage is broken anyway as it is not coordinated with perf. Until this discussion is settled, disable the PMU mechanics by simply rejecting the type '2' measurement in the resctrl file. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: vikas.shivappa@linux.intel.com CC: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: hpa@zytor.com
2018-08-03x86/speculation: Support Enhanced IBRS on future CPUsSai Praneeth
Future Intel processors will support "Enhanced IBRS" which is an "always on" mode i.e. IBRS bit in SPEC_CTRL MSR is enabled once and never disabled. From the specification [1]: "With enhanced IBRS, the predicted targets of indirect branches executed cannot be controlled by software that was executed in a less privileged predictor mode or on another logical processor. As a result, software operating on a processor with enhanced IBRS need not use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged predictor mode. Software can isolate predictor modes effectively simply by setting the bit once. Software need not disable enhanced IBRS prior to entering a sleep state such as MWAIT or HLT." If Enhanced IBRS is supported by the processor then use it as the preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's Retpoline white paper [2] states: "Retpoline is known to be an effective branch target injection (Spectre variant 2) mitigation on Intel processors belonging to family 6 (enumerated by the CPUID instruction) that do not have support for enhanced IBRS. On processors that support enhanced IBRS, it should be used for mitigation instead of retpoline." The reason why Enhanced IBRS is the recommended mitigation on processors which support it is that these processors also support CET which provides a defense against ROP attacks. Retpoline is very similar to ROP techniques and might trigger false positives in the CET defense. If Enhanced IBRS is selected as the mitigation technique for spectre v2, the IBRS bit in SPEC_CTRL MSR is set once at boot time and never cleared. Kernel also has to make sure that IBRS bit remains set after VMEXIT because the guest might have cleared the bit. This is already covered by the existing x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() speculation control functions. Enhanced IBRS still requires IBPB for full mitigation. [1] Speculative-Execution-Side-Channel-Mitigations.pdf [2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf Both documents are available at: https://bugzilla.kernel.org/show_bug.cgi?id=199511 Originally-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tim C Chen <tim.c.chen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/1533148945-24095-1-git-send-email-sai.praneeth.prakhya@intel.com
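A hedged sketch of the selection step: when the CPU advertises enhanced IBRS, pick it over retpoline and set SPEC_CTRL.IBRS once at boot. The feature and mode names mirror the changelog, but the snippet is illustrative rather than the exact hunk.

    if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
            mode = SPECTRE_V2_IBRS_ENHANCED;
            /* Set once here; the SPEC_CTRL restore on VMEXIT keeps it set. */
            x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
            wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
    }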
2018-08-03x86/cpufeatures: Add EPT_AD feature bitPeter Feiner
Some Intel processors have an EPT feature whereby the accessed & dirty bits in EPT entries can be updated by HW. MSR IA32_VMX_EPT_VPID_CAP exposes the presence of this capability. There is no point in trying to use that new feature bit in the VMX code as VMX needs to read the MSR anyway to access other bits, but having the feature bit for EPT_AD in place helps virtualization management as it exposes "ept_ad" in /proc/cpuinfo/$proc/flags if the feature is present. [ tglx: Amended changelog ] Signed-off-by: Peter Feiner <pfeiner@google.com> Signed-off-by: Peter Shier <pshier@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180801180657.138051-1-pshier@google.com
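A sketch of how such a synthetic flag can be derived during CPU setup; the MSR is named in the changelog, while the bit position (21) and the helper usage here are assumptions.

    u64 ept_vpid_cap;

    rdmsrl(MSR_IA32_VMX_EPT_VPID_CAP, ept_vpid_cap);
    if (ept_vpid_cap & BIT_ULL(21))         /* EPT A/D bits supported (assumed bit) */
            set_cpu_cap(c, X86_FEATURE_EPT_AD);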
2018-08-02Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of a statistics counter, happening alongside a move over to a bona fide statistics structure rather than counting values on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02x86/iommu: Use NULL instead of 0Zhong Jiang
Fixes the following sparse warning: arch/x86/kernel/pci-iommu_table.c:63:37: warning: Using plain integer as NULL pointer Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <hpa@zytor.com> Cc: <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/1532162004-24670-1-git-send-email-zhongjiang@huawei.com
2018-08-02Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-31x86/speculation: Protect against userspace-userspace spectreRSBJiri Kosina
The article "Spectre Returns! Speculation Attacks using the Return Stack Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks, making use solely of the RSB contents even on CPUs that don't fallback to BTB on RSB underflow (Skylake+). Mitigate userspace-userspace attacks by always unconditionally filling RSB on context switch when the generic spectrev2 mitigation has been enabled. [1] https://arxiv.org/pdf/1807.07940.pdf Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pm
2018-07-30PCI: Call dma_debug_add_bus() for pci_bus_type from PCI coreChristoph Hellwig
There is nothing arch-specific about PCI or dma-debug, so call dma_debug_add_bus() from the PCI core just after registering the bus type. Most of dma-debug is already generic; this just adds reporting of pending dma-allocations on driver unload for arches other than powerpc, sh, and x86. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
2018-07-30x86/apic: Trivial coding style fixesYi Wang
There is inconsistent indenting in calibrate_APIC_clock() and activate_managed(). Remove the surplus TAB. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: jgross@suse.com Cc: ville.syrjala@linux.intel.com Cc: len.brown@intel.com Cc: gregkh@linuxfoundation.org Cc: zhong.weidong@zte.com.cn Link: https://lkml.kernel.org/r/1532672103-32250-1-git-send-email-wang.yi59@zte.com.cn
2018-07-30x86/platform/UV: Mark memblock related init code and data correctlyDou Liyang
parse_mem_block_size() and mem_block_size are only used during init. Mark them accordingly. Fixes: d7609f4210cb ("x86/platform/UV: Add kernel parameter to set memory block size") Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: Mike Travis <mike.travis@hpe.com> Cc: Andrew Banman <andrew.banman@hpe.com> Link: https://lkml.kernel.org/r/20180730075947.23023-1-douly.fnst@cn.fujitsu.com
2018-07-30x86/kvmclock: Mark kvm_get_preset_lpj() as __initDou Liyang
kvm_get_preset_lpj() is only called from kvmclock_init(), so mark it __init as well. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář<rkrcmar@redhat.com> Cc: <kvm@vger.kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Link: https://lkml.kernel.org/r/20180730075421.22830-3-douly.fnst@cn.fujitsu.com
2018-07-30x86/tsc: Consolidate init codeDou Liyang
Split out duplicated code from tsc_early_init() and tsc_init() into a common helper and fix up some comment typos. [ tglx: Massaged changelog and renamed function ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180730075421.22830-2-douly.fnst@cn.fujitsu.com
2018-07-30x86/kexec: Allocate 8k PGDs for PTIJoerg Roedel
Fuzzing the PTI-x86-32 code with trinity showed unhandled kernel paging request oops-messages that looked a lot like silent data corruption. Lots of debugging and testing led to the kexec-32bit code, which is still allocating 4k PGDs when PTI is enabled. But since it uses native_set_pud() to build the page-table, it will inevitably call into __pti_set_user_pgtbl(), which writes beyond the allocated 4k page. Use PGD_ALLOCATION_ORDER to allocate PGDs in the kexec code to fix the issue. Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: David H. Gutteridge <dhgutteridge@sympatico.ca> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1532533683-5988-4-git-send-email-joro@8bytes.org
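The fix described amounts to sizing the kexec page-table root with PGD_ALLOCATION_ORDER (8k when PTI is on) instead of a single 4k page; a hedged sketch of the allocation site, with the destination field name assumed:

    /* Sketch: PTI needs the user-half PGD page right after the kernel one. */
    image->arch.pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
                                                PGD_ALLOCATION_ORDER);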
2018-07-30BackMerge v4.18-rc7 into drm-nextDave Airlie
rmk requested this for armada and I think we've had a few conflicts build up. Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-07-27dma-mapping: Generalise dma_32bit_limit flagRobin Murphy
Whilst the notion of an upstream DMA restriction is most commonly seen in PCI host bridges saddled with a 32-bit native interface, a more general version of the same issue can exist on complex SoCs where a bus or point-to-point interconnect link from a device's DMA master interface to another component along the path to memory (often an IOMMU) may carry fewer address bits than the interfaces at both ends nominally support. In order to properly deal with this, the first step is to expand the dma_32bit_limit flag into an arbitrary mask. To minimise the impact on existing code, we'll make sure to only consider this new mask valid if set. That makes sense anyway, since a mask of zero would represent DMA not being wired up at all, and that would be better handled by not providing valid ops in the first place. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
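Conceptually the generalisation adds a per-device bus mask that is honoured only when non-zero; a hedged sketch of how a mapping layer might combine it with the device mask (the field name follows this series, the helper itself is illustrative):

    static inline u64 dma_combined_mask_sketch(struct device *dev)
    {
            u64 mask = dev->dma_mask ? *dev->dma_mask : 0;

            /* bus_dma_mask == 0 means "no bus-imposed limit". */
            if (dev->bus_dma_mask)
                    mask &= dev->bus_dma_mask;
            return mask;
    }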
2018-07-27iommu: Add config option to set passthrough as defaultOlof Johansson
This allows the default behavior to be controlled by a kernel config option instead of changing the commandline for the kernel to include "iommu.passthrough=on" or "iommu=pt" on machines where this is desired. Likewise, for machines where this config option is enabled, it can be disabled at boot time with "iommu.passthrough=off" or "iommu=nopt". Also corrected iommu=pt documentation for IA-64, since it has no code that parses iommu= at all. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Joerg Roedel <jroedel@suse.de>
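The C side of such a config option can be as small as choosing the default domain type from the Kconfig symbol; a sketch (symbol name as described above, the assignment itself is illustrative):

    static unsigned int iommu_def_domain_type =
            IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH) ?
                    IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;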
2018-07-25Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-25Merge branch 'sched/urgent' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-24Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller