summaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)Author
2017-07-30Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A couple of fixes for performance counters and kprobes: - a series of small patches which make the uncore performance counters on Skylake server systems work correctly - add a missing instruction slot release to the failure path of kprobes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes/x86: Release insn_slot in failure path perf/x86/intel/uncore: Fix missing marker for skx_uncore_cha_extra_regs perf/x86/intel/uncore: Fix SKX CHA event extra regs perf/x86/intel/uncore: Remove invalid Skylake server CHA filter field perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umask perf/x86/intel/uncore: Fix Skylake server PCU PMU event format perf/x86/intel/uncore: Fix Skylake UPI PMU event masks
2017-07-30cpufreq: x86: Make scaling_cur_freq behave more as expectedRafael J. Wysocki
After commit f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" the scaling_cur_freq policy attribute in sysfs only behaves as expected on x86 with APERF/MPERF registers available when it is read from at least twice in a row. The value returned by the first read may not be meaningful, because the computations in there use cached values from the previous iteration of aperfmperf_snapshot_khz() which may be stale. To prevent that from happening, modify arch_freq_get_on_cpu() to call aperfmperf_snapshot_khz() twice, with a short delay between these calls, if the previous invocation of aperfmperf_snapshot_khz() was too far back in the past (specifically, more that 1s ago). Also, as pointed out by Doug Smythies, aperf_delta is limited now and the multiplication of it by cpu_khz won't overflow, so simplify the s->khz computations too. Fixes: f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-30acpi, x86/mm: Remove encryption mask from ACPI page protection typeTom Lendacky
The arch_apei_get_mem_attributes() function is used to set the page protection type for ACPI physical addresses. When SME is active, the associated protection type cannot have the encryption mask set since the ACPI tables live in un-encrypted memory - the kernel will see corrupted data. To fix this, create a new protection type, PAGE_KERNEL_NOENC, that is a 'no encryption' version of PAGE_KERNEL, and return that from arch_apei_get_mem_attributes(). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e1cb9395b2f061cd96f1e59c3cbbe5ff5d4ec26e.1501186516.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30x86/mm, kexec: Fix memory corruption with SME on successive kexecsTom Lendacky
After issuing successive kexecs it was found that the SHA hash failed verification when booting the kexec'd kernel. When SME is enabled, the change from using pages that were marked encrypted to now being marked as not encrypted (through new identify mapped page tables) results in memory corruption if there are any cache entries for the previously encrypted pages. This is because separate cache entries can exist for the same physical location but tagged both with and without the encryption bit. To prevent this, issue a wbinvd if SME is active before copying the pages from the source location to the destination location to clear any possible cache entry conflicts. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <kexec@lists.infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e7fb8610af3a93e8f8ae6f214cd9249adc0df2b4.1501186516.git.thomas.lendacky@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment readsAndy Lutomirski
Now that pt_regs properly defines segment fields as 16-bit on 32-bit CPUs, there's no need to mask off the high word. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30x86/traps: Don't clear segment high bits in early_idt_handler_common()Andy Lutomirski
Now that pt_regs defines the segment fields as 16-bit, there's no need to sanitize the values. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30x86/asm/32: Make pt_regs's segment registers be 16 bitsAndy Lutomirski
Many 32-bit x86 CPUs do 16-bit writes when storing segment registers to memory. This can cause the high word of regs->[cdefgs]s to occasionally contain garbage. Rather than making the entry code more complicated to fix up the garbage, just change pt_regs to reflect reality. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30Merge branch 'perf/urgent' into perf/core, to pick up latest fixes and ↵Ingo Molnar
refresh the tree Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-28x86/boot: Disable the address-of-packed-member compiler warningMatthias Kaehlcke
The clang warning 'address-of-packed-member' is disabled for the general kernel code, also disable it for the x86 boot code. This suppresses a bunch of warnings like this when building with clang: ./arch/x86/include/asm/processor.h:535:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value [-Waddress-of-packed-member] return this_cpu_read_stable(cpu_tss.x86_tss.sp0); ^~~~~~~~~~~~~~~~~~~ ./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro 'this_cpu_read_stable' #define this_cpu_read_stable(var) percpu_stable_op("mov", var) ^~~ ./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro 'percpu_stable_op' : "p" (&(var))); ^~~ Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Cc: Doug Anderson <dianders@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-27x86/topology: Remove the unused parent_node() macroDou Liyang
Commit: a7be6e5a7f8d ("mm: drop useless local parameters of __register_one_node()") ... removed the last user of parent_node(), so remove the macro. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1501076076-1974-11-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-27x86/ldt/64: Refresh DS and ES when modify_ldt changes an entryAndy Lutomirski
On x86_32, modify_ldt() implicitly refreshes the cached DS and ES segments because they are refreshed on return to usermode. On x86_64, they're not refreshed on return to usermode. To improve determinism and match x86_32's behavior, refresh them when we update the LDT. This avoids a situation in which the DS points to a descriptor that is changed but the old cached segment persists until the next reschedule. If this happens, then the user-visible state will change nondeterministically some time after modify_ldt() returns, which is unfortunate. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Chang Seok <chang.seok.bae@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-27Backmerge tag 'v4.13-rc2' into drm-nextDave Airlie
Linux 4.13-rc2 This is required for drm-misc fixing.
2017-07-26KVM: LAPIC: Fix reentrancy issues with preempt notifiersWanpeng Li
Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> [Add more WARNs. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26KVM: nVMX: Fix loss of L2's NMI blocking stateWanpeng Li
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1: Before NMI IRET test Sending NMI to self NMI isr running stack 0x461000 Sending nested NMI to self After nested NMI to self Nested NMI isr running rip=40038e After iret After NMI to self FAIL: NMI Commit 4c4a6f790ee862 (KVM: nVMX: track NMI blocking state separately for each VMCS) tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough: - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the L2 can generate #PF which can be intercepted by L0. - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 guest and injects the #PF into L1. At this point the vmcs02 has nmi_known_unmasked=true. - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field of vmcs12 (and fixes the shadow page table) before resuming L2 guest. - L1 executes VMRESUME to resume L2, causing a vmexit to L0 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the interruptibility-state field of vmcs02, but nmi_known_unmasked is still true. - L2 immediately exits to L0 with another page fault, because L0 still has not updated the NGVA->HPA page tables. However, nmi_known_unmasked is true so vmx_recover_nmi_blocking does not do anything. The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26KVM: nVMX: Fix posted intr delivery when vcpu is in guest modeWincy Van
The PI vector for L0 and L1 must be different. If dest vcpu0 is in guest mode while vcpu1 is delivering a non-nested PI to vcpu0, there wont't be any vmexit so that the non-nested interrupt will be delayed. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26x86: irq: Define a global vector for nested posted interruptsWincy Van
We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26KVM: x86: do mask out upper bits of PAE CR3Paolo Bonzini
This reverts the change of commit f85c758dbee54cc3612a6e873ef7cecdb66ebee5, as the behavior it modified was intended. The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual says: Table 4-7. Use of CR3 with PAE Paging Bit Position(s) Contents 4:0 Ignored 31:5 Physical address of the 32-Byte aligned page-directory-pointer table used for linear-address translation 63:32 Ignored (these bits exist only on processors supporting the Intel-64 architecture) To placate the static checker, write the mask explicitly as an unsigned long constant instead of using a 32-bit unsigned constant. Cc: Dan Carpenter <dan.carpenter@oracle.com> Fixes: f85c758dbee54cc3612a6e873ef7cecdb66ebee5 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26x86/kconfig: Consolidate unwinders into multiple choice selectionJosh Poimboeuf
There are three mutually exclusive unwinders. Make that more obvious by combining them into a multiple-choice selection: CONFIG_FRAME_POINTER_UNWINDER CONFIG_ORC_UNWINDER CONFIG_GUESS_UNWINDER (if CONFIG_EXPERT=y) Frame pointers are still the default (for now). The old CONFIG_FRAME_POINTER option is still used in some arch-independent places, so keep it around, but make it invisible to the user on x86 - it's now selected by CONFIG_FRAME_POINTER_UNWINDER=y. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/20170725135424.zukjmgpz3plf5pmt@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-26x86/kconfig: Make it easier to switch to the new ORC unwinderJosh Poimboeuf
A couple of Kconfig changes which make it much easier to switch to the new CONFIG_ORC_UNWINDER: 1) Remove x86 dependencies on CONFIG_FRAME_POINTER for lockdep, latencytop, and fault injection. x86 has a 'guess' unwinder which just scans the stack for kernel text addresses. It's not 100% accurate but in many cases it's good enough. This allows those users who don't want the text overhead of the frame pointer or ORC unwinders to still use these features. More importantly, this also makes it much more straightforward to disable frame pointers. 2) Make CONFIG_ORC_UNWINDER depend on !CONFIG_FRAME_POINTER. While it would be possible to have both enabled, it doesn't really make sense to do so. So enforce a sane configuration to prevent the user from making a dumb mistake. With these changes, when you disable CONFIG_FRAME_POINTER, "make oldconfig" will ask if you want to enable CONFIG_ORC_UNWINDER. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/9985fb91ce5005fe33ea5cc2a20f14bd33c61d03.1500938583.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-26x86/unwind: Add the ORC unwinderJosh Poimboeuf
Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y. It plugs into the existing x86 unwinder framework. It relies on objtool to generate the needed .orc_unwind and .orc_unwind_ip sections. For more details on why ORC is used instead of DWARF, see Documentation/x86/orc-unwinder.txt - but the short version is that it's a simplified, fundamentally more robust debugninfo data structure, which also allows up to two orders of magnitude faster lookups than the DWARF unwinder - which matters to profiling workloads like perf. Thanks to Andy Lutomirski for the performance improvement ideas: splitting the ORC unwind table into two parallel arrays and creating a fast lookup table to search a subset of the unwind table. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com [ Extended the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/mce/AMD: Allow any CPU to initialize the smca_banks arrayYazen Ghannam
Current SMCA implementations have the same banks on each CPU with the non-core banks only visible to a "master thread" on each die. Practically, this means the smca_banks array, which describes the banks, only needs to be populated once by a single master thread. CPU 0 seemed like a good candidate to do the populating. However, it's possible that CPU 0 is not enabled in which case the smca_banks array won't be populated. Rather than try to figure out another master thread to do the populating, we should just allow any CPU to populate the array. Drop the CPU 0 check and return early if the bank was already initialized. Also, drop the WARNing about an already initialized bank, since this will be a common, expected occurrence. The smca_banks array is only populated at boot time and CPUs are brought online sequentially. So there's no need for locking around the array. If the first CPU up is a master thread, then it will populate the array with all banks, core and non-core. Every CPU afterwards will return early. If the first CPU up is not a master thread, then it will populate the array with all core banks. The first CPU afterwards that is a master thread will skip populating the core banks and continue populating the non-core banks. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Jack Miller <jack@codezen.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170724101228.17326-4-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25Merge branch 'WIP.locking/atomics' into locking/coreIngo Molnar
Merge two uncontroversial cleanups from this branch while the rest is being reworked. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/efi: Fix reboot_mode when EFI runtime services are disabledStefan Assmann
When EFI runtime services are disabled, for example by the "noefi" kernel cmdline parameter, the reboot_type could still be set to BOOT_EFI causing reboot to fail. Fix this by checking if EFI runtime services are enabled. Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724122248.24006-1-sassmann@kpanic.de [ Fixed 'not disabled' double negation. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/microcode/AMD: Free unneeded patch before exit from update_cache()Shu Wang
verify_and_add_patch() allocates memory for a microcode patch and hands it down to be added to the cache of patches. However, if the cache already has the latest patch, the newly allocated one needs to be freed before returning. Do that. This issue has been found by kmemleak: unreferenced object 0xffff88010e780b40 (size 32): comm "bash", pid 860, jiffies 4294690939 (age 29.297s) backtrace: kmemleak_alloc kmem_cache_alloc_trace load_microcode_amd.isra.0 request_microcode_amd reload_store dev_attr_store sysfs_kf_write kernfs_fop_write __vfs_write vfs_write SyS_write do_syscall_64 return_from_SYSCALL_64 0xffffffffffffffff (gdb) list *0xffffffff81050d60 0xffffffff81050d60 is in load_microcode_amd (arch/x86/kernel/cpu/microcode/amd.c:616). which is this: patch = kzalloc(sizeof(*patch), GFP_KERNEL); --> if (!patch) { pr_err("Patch allocation failure.\n"); return -EINVAL; } Signed-off-by: Shu Wang <shuwang@redhat.com> [ Rewrite commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: chuhu@redhat.com Cc: liwang@redhat.com Link: http://lkml.kernel.org/r/20170724101228.17326-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=yAndrey Ryabinin
KASAN fills kernel page tables with repeated values to map several TBs of the virtual memory to the single kasan_zero_page: kasan_zero_p4d -> kasan_zero_pud -> kasan_zero_pmd-> kasan_zero_pte-> kasan_zero_page Walking the whole KASAN shadow range takes a lot of time, especially with 5-level page tables. Since we already know that all kasan page tables eventually point to the kasan_zero_page we could call note_page() right and avoid walking lower levels of the page tables. This will not affect the output of the kernel_page_tables file, but let us avoid spending time in page table walkers: Before: $ time cat /sys/kernel/debug/kernel_page_tables > /dev/null real 0m55.855s user 0m0.000s sys 0m55.840s After: $ time cat /sys/kernel/debug/kernel_page_tables > /dev/null real 0m0.054s user 0m0.000s sys 0m0.054s Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724152558.24689-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/platform/intel-mid: Make IRQ allocation a bit more flexibleAndy Shevchenko
In the future we would use dynamic allocation for IRQ which brings non-1:1 mapping for IOAPIC domain. Thus, we need to respect return value of mp_map_gsi_to_irq() and assign it back to the device structure. Besides that we need to read GSI from interrupt pin register to avoid cases when some drivers will try to initialize PCI device twice in a row which will call pcibios_enable_irq() twice as well. serial 0000:00:04.1: Mapped GSI28 to IRQ5 serial 0000:00:04.2: Mapped GSI29 to IRQ5 serial 0000:00:04.3: Mapped GSI54 to IRQ5 8250_mid 0000:00:04.1: Mapped GSI28 to IRQ5 8250_mid 0000:00:04.2: Mapped GSI29 to IRQ6 8250_mid 0000:00:04.3: Mapped GSI54 to IRQ7 Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170724173402.12939-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/platform/intel-mid: Group timers callbacks togetherAndy Shevchenko
Group timers callback initializers together in x86_intel_mid_early_setup() for easy to find and maintain. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724173309.12878-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/asm: Add suffix macro for GEN_*_RMWcc()Kees Cook
The coming x86 refcount protection needs to be able to add trailing instructions to the GEN_*_RMWcc() operations. This extracts the difference between the goto/non-goto cases so the helper macros can be defined outside the #ifdef cases. Additionally adds argument naming to the resulting asm for referencing from suffixed instructions, and adds clobbers for "cc", and "cx" to let suffixes use _ASM_CX, and retain any set flags. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: kernel-hardening@lists.openwall.com Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1500921349-10803-2-git-send-email-keescook@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/mm: Implement PCID based optimization: try to preserve old TLB entries ↵Andy Lutomirski
using PCID PCID is a "process context ID" -- it's what other architectures call an address space ID. Every non-global TLB entry is tagged with a PCID, only TLB entries that match the currently selected PCID are used, and we can switch PGDs without flushing the TLB. x86's PCID is 12 bits. This is an unorthodox approach to using PCID. x86's PCID is far too short to uniquely identify a process, and we can't even really uniquely identify a running process because there are monster systems with over 4096 CPUs. To make matters worse, past attempts to use all 12 PCID bits have resulted in slowdowns instead of speedups. This patch uses PCID differently. We use a PCID to identify a recently-used mm on a per-cpu basis. An mm has no fixed PCID binding at all; instead, we give it a fresh PCID each time it's loaded except in cases where we want to preserve the TLB, in which case we reuse a recent value. Here are some benchmark results, done on a Skylake laptop at 2.3 GHz (turbo off, intel_pstate requesting max performance) under KVM with the guest using idle=poll (to avoid artifacts when bouncing between CPUs). I haven't done any real statistics here -- I just ran them in a loop and picked the fastest results that didn't look like outliers. Unpatched means commit a4eb8b993554, so all the bookkeeping overhead is gone. ping-pong between two mms on the same CPU using eventfd: patched: 1.22µs patched, nopcid: 1.33µs unpatched: 1.34µs Same ping-pong, but now touch 512 pages (all zero-page to minimize cache misses) each iteration. dTLB misses are measured by dtlb_load_misses.miss_causes_a_walk: patched: 1.8µs 11M dTLB misses patched, nopcid: 6.2µs, 207M dTLB misses unpatched: 6.1µs, 190M dTLB misses Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/9ee75f17a81770feed616358e6860d98a2a5b1e7.1500957502.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-25x86/boot: #undef memcpy() et al in string.cMichael Davidson
undef memcpy() and friends in boot/string.c so that the functions defined here will have the correct names, otherwise we end up up trying to redefine __builtin_memcpy() etc. Surprisingly, GCC allows this (and, helpfully, discards the __builtin_ prefix from the function name when compiling it), but clang does not. Adding these #undef's appears to preserve what I assume was the original intent of the code. Signed-off-by: Michael Davidson <md@google.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724235155.79255-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24ACPI / boot: Add number of legacy IRQs to debug outputAndy Shevchenko
Sometimes it's useful to have that when mp_config_acpi_legacy_irqs() is called. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24ACPI / boot: Correct address space of __acpi_map_table()Andy Shevchenko
Sparse complains about wrong address space used in __acpi_map_table() and in __acpi_unmap_table(). arch/x86/kernel/acpi/boot.c:127:29: warning: incorrect type in return expression (different address spaces) arch/x86/kernel/acpi/boot.c:127:29: expected char * arch/x86/kernel/acpi/boot.c:127:29: got void [noderef] <asn:2>* arch/x86/kernel/acpi/boot.c:135:23: warning: incorrect type in argument 1 (different address spaces) arch/x86/kernel/acpi/boot.c:135:23: expected void [noderef] <asn:2>*addr arch/x86/kernel/acpi/boot.c:135:23: got char *map Correct address space to be in align of type of returned and passed parameter. Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24ACPI / boot: Don't define unused variablesAndy Shevchenko
Some code in acpi_parse_x2apic() conditionally compiled, though parts of it are being used in any case. This annoys gcc. arch/x86/kernel/acpi/boot.c: In function ‘acpi_parse_x2apic’: arch/x86/kernel/acpi/boot.c:203:5: warning: variable ‘enabled’ set but not used [-Wunused-but-set-variable] u8 enabled; ^~~~~~~ arch/x86/kernel/acpi/boot.c:202:6: warning: variable ‘apic_id’ set but not used [-Wunused-but-set-variable] int apic_id; ^~~~~~~ Re-arrange the code to avoid compiling unused variables. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-24signal: Remove kernel interal si_code magicEric W. Biederman
struct siginfo is a union and the kernel since 2.4 has been hiding a union tag in the high 16bits of si_code using the values: __SI_KILL __SI_TIMER __SI_POLL __SI_FAULT __SI_CHLD __SI_RT __SI_MESGQ __SI_SYS While this looks plausible on the surface, in practice this situation has not worked well. - Injected positive signals are not copied to user space properly unless they have these magic high bits set. - Injected positive signals are not reported properly by signalfd unless they have these magic high bits set. - These kernel internal values leaked to userspace via ptrace_peek_siginfo - It was possible to inject these kernel internal values and cause the the kernel to misbehave. - Kernel developers got confused and expected these kernel internal values in userspace in kernel self tests. - Kernel developers got confused and set si_code to __SI_FAULT which is SI_USER in userspace which causes userspace to think an ordinary user sent the signal and that it was not kernel generated. - The values make it impossible to reorganize the code to transform siginfo_copy_to_user into a plain copy_to_user. As si_code must be massaged before being passed to userspace. So remove these kernel internal si codes and make the kernel code simpler and more maintainable. To replace these kernel internal magic si_codes introduce the helper function siginfo_layout, that takes a signal number and an si_code and computes which union member of siginfo is being used. Have siginfo_layout return an enumeration so that gcc will have enough information to warn if a switch statement does not handle all of union members. A couple of architectures have a messed up ABI that defines signal specific duplications of SI_USER which causes more special cases in siginfo_layout than I would like. The good news is only problem architectures pay the cost. Update all of the code that used the previous magic __SI_ values to use the new SIL_ values and to call siginfo_layout to get those values. Escept where not all of the cases are handled remove the defaults in the switch statements so that if a new case is missed in the future the lack will show up at compile time. Modify the code that copies siginfo si_code to userspace to just copy the value and not cast si_code to a short first. The high bits are no longer used to hold a magic union member. Fixup the siginfo header files to stop including the __SI_ values in their constants and for the headers that were missing it to properly update the number of si_codes for each signal type. The fixes to copy_siginfo_from_user32 implementations has the interesting property that several of them perviously should never have worked as the __SI_ values they depended up where kernel internal. With that dependency gone those implementations should work much better. The idea of not passing the __SI_ values out to userspace and then not reinserting them has been tested with criu and criu worked without changes. Ref: 2.4.0-test1 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-07-24x86/io: Make readq() / writeq() API consistentAndy Shevchenko
Despite the following commit: 93093d099e5d ("x86: provide readq()/writeq() on 32-bit too, complete") which says: ...Also, map all the APIs to the strongest ordering variant. It's way too easy to mess such details up in drivers and the difference between "memory" and "" constrained asm() constructs is in the noise range. ... we have for now only one user of this API (i.e. writeq_relaxed() in drivers/hwtracing/intel_th/sth.c) on x86 and it does care about "relaxed" part of it. Moreover 32-bit support has been removed from that header, though appeared later in specific headers that emphasizes its non-atomic context. The rest should keep in mind a consistent picture of the __raw_IO() vs. IO() vs. IO_relaxed() API. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-6-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24x86/io: Remove xlate_dev_kmem_ptr() duplicationAndy Shevchenko
Generic header defines xlate_dev_kmem_ptr(). Reuse it from generic header and remove in x86 code. Move a description to the generic header as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-5-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24x86/io: Remove mem*io() duplicationsAndy Shevchenko
Generic header defines memset_io, memcpy_fromio(). and memcpy_toio(). Reuse them from generic header and remove in x86 code. Move the descriptions to the generic header as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Cc: wsa@the-dreams.de Link: http://lkml.kernel.org/r/20170630170934.83028-4-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24x86/io: Include asm-generic/io.h to architectural codeAndy Shevchenko
asm-generic/io.h defines few helpers which would be useful in the drivers, such as writesb() and readsb(). Include it to the asm/io.h in architectural folder. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Wolfram Sang <wsa@the-dreams.de> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170630170934.83028-3-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24x86/io: Define IO accessors by preprocessorAndy Shevchenko
As a preparatory to use generic IO accessor helpers we need to define architecture dependent functions via preprocessor to let world know we have them. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Wolfram Sang <wsa@the-dreams.de> Cc: Baolin Wang <baolin.wang@spreadtrum.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170630170934.83028-2-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24kprobes/x86: Release insn_slot in failure pathMasami Hiramatsu
The following commit: 003002e04ed3 ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures") returns an error if the copying of the instruction, but does not release the allocated insn_slot. Clean up correctly. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S . Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 003002e04ed3 ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures") Link: http://lkml.kernel.org/r/150064834183.6172.11694375818447664416.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Fix missing marker for skx_uncore_cha_extra_regsStephane Eranian
This skx_uncore_cha_extra_regs array was missing an end-marker. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-7-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Fix SKX CHA event extra regsStephane Eranian
This patch adds two missing event extra regs for Skylake Server CHA PMU: - TOR_INSERTS - TOR_OCCUPANCY Were missing support for all the filters, including opcode matchers. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-6-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Remove invalid Skylake server CHA filter fieldKan Liang
There is no field c6 and link for CHA BOX FILTER. Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-5-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umaskKan Liang
Correct the umask for LLC_LOOKUP.LOCAL and LLC_LOOKUP.REMOTE events Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-4-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Fix Skylake server PCU PMU event formatKan Liang
PCU event format for SKX are different from snbep. Introduce a new format group for SKX PCU. Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-3-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24perf/x86/intel/uncore: Fix Skylake UPI PMU event masksStephane Eranian
This patch fixes the event_mask and event_ext_mask for the Intel Skylake Server UPI PMU. Bit 21 is not used as a filter. The extended umask is from bit 32 to bit 55. Correct both umasks. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-2-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-24KVM: VMX: remove unused fieldPaolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-23Merge 4.13-rc2 into char-misc-nextGreg Kroah-Hartman
We want the char/misc driver fixes in here as well to handle future changes. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-23xen/x86: fix cpu hotplugJuergen Gross
Commit dc6416f1d711eb4c1726e845d653235dcaae12e1 ("xen/x86: Call cpu_startup_entry(CPUHP_AP_ONLINE_IDLE) from xen_play_dead()") introduced an error leading to a stack overflow of the idle task when a cpu was brought offline/online many times: by calling cpu_startup_entry() instead of returning at the end of xen_play_dead() do_idle() would be entered again and again. Don't use cpu_startup_entry(), but cpuhp_online_idle() instead allowing to return from xen_play_dead(). Cc: <stable@vger.kernel.org> # 4.12 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-23xen/x86: Don't BUG on CPU0 offliningVitaly Kuznetsov
CONFIG_BOOTPARAM_HOTPLUG_CPU0 allows to offline CPU0 but Xen HVM guests BUG() in xen_teardown_timer(). Remove the BUG_ON(), this is probably a leftover from ancient times when CPU0 hotplug was impossible, it works just fine for HVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>