linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message	Author
2018-03-31	perf/x86/intel: Enable C-state residency events for Cannon Lake	Harry Pan
	Cannon Lake supports C1/C3/C6/C7, PC2/PC3/PC6/PC7/PC8/PC9/PC10 state residency counters, this patch enables those counters. ( The MSR information is based on Intel Software Developers' Manual, Vol. 4, Order No. 335592. ) Tested-by: Puthikorn Voravootivat <puthik@chromium.org> Signed-off-by: Harry Pan <harry.pan@intel.com> Reviewed-by: Benson Leung <bleung@chromium.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan.liang@intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: gs0622@gmail.com Link: http://lkml.kernel.org/r/20180309121549.630-3-harry.pan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31	perf/x86/intel: Add Cannon Lake support for RAPL profiling	Harry Pan
	This patch enables RAPL counters (energy consumption counters) support for Cannon Lake processors. ( ESU and power domains refer to Intel Software Developers' Manual, Vol. 4, Order No. 335592. ) Usage example: $ perf list $ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10 Tested-by: Puthikorn Voravootivat <puthik@chromium.org> Signed-off-by: Harry Pan <harry.pan@intel.com> Reviewed-by: Benson Leung <bleung@chromium.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: colin.king@canonical.com Cc: gs0622@gmail.com Cc: kan.liang@linux.intel.com Link: http://lkml.kernel.org/r/20180309121549.630-2-harry.pan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
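The energy counters mentioned above report raw ticks that are scaled by the energy status unit (ESU). As a minimal sketch of that arithmetic, assuming one tick equals 1 / 2^ESU joules and that the counter is 32 bits wide, a conversion could look like the following. The helper name is hypothetical and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert the difference between two raw energy
 * counter readings to microjoules.
 * Assumption: one counter tick equals 1 / 2^esu joules.
 */
static uint64_t rapl_delta_to_microjoules(uint32_t before, uint32_t after,
                                          unsigned esu)
{
    assert(esu < 32);

    /* Unsigned 32-bit subtraction also covers a single counter wraparound. */
    uint32_t delta = after - before;

    /* Scale to microjoules first, then divide by 2^esu. */
    return ((uint64_t)delta * 1000000ULL) >> esu;
}
```

With an ESU of 14, for example, 16384 ticks correspond to one joule.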
2018-03-31	x86/cpu/tme: Fix spelling: "configuation" -> "configuration"	Colin Ian King
	Trivial fix to spelling mistake in the pr_err_once() error message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20180313154709.1015-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31	x86/build: Don't pass in -D__KERNEL__ multiple times	Cao jin
	Some .<target>.cmd files under arch/x86 are showing two instances of -D__KERNEL__, like arch/x86/boot/ and arch/x86/realmode/rm/. __KERNEL__ is already defined in KBUILD_CPPFLAGS in the top Makefile, so it can be dropped safely. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kbuild@vger.kernel.org Link: http://lkml.kernel.org/r/20180316084944.3997-1-caoj.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31	Merge branch 'linus' into locking/core, to pick up fixes	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-31	Merge branch 'topic/paca' into next	Michael Ellerman
	Bring in yet another series that touches KVM code, and might need to be merged into the kvm-ppc branch to resolve conflicts. This required some changes in pnv_power9_force_smt4_catch/release() due to the paca array becoming an array of pointers.
2018-03-30	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull KVM fixes from Radim Krčmář: "PPC: - Fix a bug causing occasional machine check exceptions on POWER8 hosts (introduced in 4.16-rc1) x86: - Fix a guest crashing regression with nested VMX and restricted guest (introduced in 4.16-rc1) - Fix dependency check for pv tlb flush (the wrong dependency that effectively disabled the feature was added in 4.16-rc4, the original feature in 4.16-rc1, so it got decent testing)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix pv tlb flush dependencies KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0 KVM: PPC: Book3S HV: Fix duplication of host SLB entries
2018-03-31	powerpc/mm/hash: Don't memset pgd table if not needed	Aneesh Kumar K.V
	We need to zero-out pgd table only if we share the slab cache with pud/pmd level caches. With the support of 4PB, we don't share the slab cache anymore. Instead of removing the code completely, hide it within an #ifdef. We don't need to do this with any other page table level, because they all allocate table of double the size and we take care of initializing the first half correctly during page table zap. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Consolidate multiple #if / #ifdef into one] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/mm/hash64: Increase the VA range	Aneesh Kumar K.V
	This patch increases the max virtual (effective) address value to 4PB. With 4K page size config we continue to limit ourself to 64TB. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Keep the H_PGTABLE_RANGE test, update it to work] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/mm: Add support for handling > 512TB address in SLB miss	Aneesh Kumar K.V
	For addresses above 512TB we allocate additional mmu contexts. To make it all easy, addresses above 512TB are handled with IR/DR=1 and with stack frame setup. The mmu_context_t is also updated to track the new extended_ids. To support up to 4PB we need a total of 8 contexts. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Minor formatting tweaks and comment wording, switch BUG to WARN in get_ea_context().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
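The figure of 8 contexts follows from dividing the 4PB range into 512TB pieces. A minimal sketch of that arithmetic, assuming each context covers exactly one 512TB span, is shown here. The macro and function names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define CONTEXT_SPAN_SHIFT 49   /* 512TB = 2^49 bytes */
#define MAX_EA_SHIFT       52   /* 4PB   = 2^52 bytes */

/* 2^52 / 2^49 = 8 contexts are needed to cover the whole range. */
#define NR_CONTEXTS (1u << (MAX_EA_SHIFT - CONTEXT_SPAN_SHIFT))

/* Hypothetical helper: index of the context that covers an effective address. */
static unsigned ea_to_context_index(uint64_t ea)
{
    assert(ea < (1ULL << MAX_EA_SHIFT));
    return (unsigned)(ea >> CONTEXT_SPAN_SHIFT);
}
```

Under this assumption, every address below 512TB maps to index 0, which matches the statement that only addresses above 512TB need additional contexts.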
2018-03-31	powerpc/mm/slice: Consolidate return path in slice_get_unmapped_area()	Aneesh Kumar K.V
	In a following patch, on finding a free area we will need to do allocation of extra contexts as needed. Consolidating the return path for slice_get_unmapped_area() will make that easier. Split into a separate patch to make review easy. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/mm/keys: Move pte bits to correct headers	Aneesh Kumar K.V
	Memory keys are supported only with hash translation mode. Instead of using #ifdef in generic code move the key related pte bits to respective headers Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/xive: Fix wrong xmon output caused by typo	Frederic Barrat
	Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/64: Fix smp_wmb barrier definition use use lwsync consistently	Nicholas Piggin
	asm/barrier.h is not always included after asm/synch.h, which meant it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would be eieio when it should be lwsync. kernel/time/hrtimer.c is one case. __SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in to where it's used. Previously with my small simulator config, 377 instances of eieio in the tree. After this patch there are 55. Fixes: 46d075be585e ("powerpc: Optimise smp_wmb") Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/4xx: Fix error return code in ppc4xx_msi_probe()	Wei Yongjun
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> [mpe: Add missing ';' to make it compile] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
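The class of bug fixed here is an error path that jumps to cleanup without first setting the return value. A minimal sketch of the corrected pattern follows. The function, its parameter, and the choice of -ENODEV are illustrative assumptions, not the driver's actual code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical probe: 'res' stands in for a resource lookup that may fail. */
static int example_probe(const void *res)
{
    int err = 0;

    if (res == NULL) {
        /* Without this assignment the function would return 0 on failure. */
        err = -ENODEV;
        goto error_out;
    }

    /* ... normal setup would go here ... */
    return 0;

error_out:
    /* ... cleanup would go here ... */
    return err;
}
```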
2018-03-31	powerpc/mm: Fix thread_pkey_regs_init()	Ram Pai
	thread_pkey_regs_init() initializes the pkey related registers instead of initializing the fields in the task structures. Fortunately those key related registers are re-set to zero when the task gets scheduled on the cpu. However, it's good to fix this glaringly visible error. Fixes: 06bb53b33804 ("powerpc: store and restore the pkey state across context switches") Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/kprobes: Fix call trace due to incorrect preempt count	Naveen N. Rao
	Michael Ellerman reported the following call trace when running ftracetest: BUG: using __this_cpu_write() in preemptible [00000000] code: ftracetest/6178 caller is opt_pre_handler+0xc4/0x110 CPU: 1 PID: 6178 Comm: ftracetest Not tainted 4.15.0-rc7-gcc6x-gb2cd1df #1 Call Trace: [c0000000f9ec39c0] [c000000000ac4304] dump_stack+0xb4/0x100 (unreliable) [c0000000f9ec3a00] [c00000000061159c] check_preemption_disabled+0x15c/0x170 [c0000000f9ec3a90] [c000000000217e84] opt_pre_handler+0xc4/0x110 [c0000000f9ec3af0] [c00000000004cf68] optimized_callback+0x148/0x170 [c0000000f9ec3b40] [c00000000004d954] optinsn_slot+0xec/0x10000 [c0000000f9ec3e30] [c00000000004bae0] kretprobe_trampoline+0x0/0x10 This is showing up since OPTPROBES is now enabled with CONFIG_PREEMPT. trampoline_probe_handler() considers itself to be a special kprobe handler for kretprobes. In doing so, it expects to be called from kprobe_handler() on a trap, and re-enables preemption before returning a non-zero return value so as to suppress any subsequent processing of the trap by the kprobe_handler(). However, with optprobes, we don't deal with special handlers (we ignore the return code) and just try to re-enable preemption causing the above trace. To address this, modify trampoline_probe_handler() to not be special. The only additional processing done in kprobe_handler() is to emulate the instruction (in this case, a 'nop'). We adjust the value of regs->nip for the purpose and delegate the job of re-enabling preemption and resetting current kprobe to the probe handlers (kprobe_handler() or optimized_callback()). Fixes: 8a2d71a3f273 ("powerpc/kprobes: Disable preemption before invoking probe handler for optprobes") Cc: stable@vger.kernel.org # v4.15+ Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()	Nicholas Piggin
	opal_nvram_write currently just assumes success if it encounters an error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO on other errors instead. Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
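The behaviour described above can be sketched as a retry loop that keeps going only while firmware reports busy and treats every other non-success code as an I/O error. This is a user-space model, not the kernel function: the return-code values, the callback type, and the two stubs are assumptions made for illustration.

```c
#include <errno.h>

/* Illustrative stand-ins for firmware return codes (assumed values). */
enum { FW_SUCCESS = 0, FW_BUSY = -2, FW_HARDWARE = -6, FW_BUSY_EVENT = -12 };

/* Hypothetical firmware write call. */
typedef int (*fw_write_fn)(void *ctx);

static long nvram_write_retry(fw_write_fn fw_write, void *ctx, long count)
{
    int rc = FW_BUSY;

    /* Retry only while firmware says it is busy. */
    while (rc == FW_BUSY || rc == FW_BUSY_EVENT)
        rc = fw_write(ctx);

    /* Any other error is reported instead of being treated as success. */
    if (rc != FW_SUCCESS)
        return -EIO;

    return count;
}

/* Example stub: busy on the first two calls, then succeeds. */
static int fw_busy_twice_then_ok(void *ctx)
{
    int *calls = ctx;
    return (*calls)++ < 2 ? FW_BUSY : FW_SUCCESS;
}

/* Example stub: always fails with a non-busy error. */
static int fw_always_fails(void *ctx)
{
    (void)ctx;
    return FW_HARDWARE;
}
```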
2018-03-31	powerpc/pseries: Fix clearing of security feature flags	Mauricio Faria de Oliveira
	The H_CPU_BEHAV_* flags should be checked for in the 'behaviour' field of 'struct h_cpu_char_result' -- 'character' is for H_CPU_CHAR_* flags. Found by playing around with QEMU's implementation of the hypercall: H_CPU_CHAR=0xf000000000000000 H_CPU_BEHAV=0x0000000000000000 This clears H_CPU_BEHAV_FAVOUR_SECURITY and H_CPU_BEHAV_L1D_FLUSH_PR so pseries_setup_rfi_flush() disables 'rfi_flush'; and it also clears H_CPU_CHAR_L1D_THREAD_PRIV flag. So there is no RFI flush mitigation at all for cpu_show_meltdown() to report; but currently it does: Original kernel: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Mitigation: RFI Flush Patched kernel: # cat /sys/devices/system/cpu/vulnerabilities/meltdown Not affected H_CPU_CHAR=0x0000000000000000 H_CPU_BEHAV=0xf000000000000000 This sets H_CPU_BEHAV_BNDS_CHK_SPEC_BAR so cpu_show_spectre_v1() should report vulnerable; but currently it doesn't: Original kernel: # cat /sys/devices/system/cpu/vulnerabilities/spectre_v1 Not affected Patched kernel: # cat /sys/devices/system/cpu/vulnerabilities/spectre_v1 Vulnerable Brown-paper-bag-by: Michael Ellerman <mpe@ellerman.id.au> Fixes: f636c14790ea ("powerpc/pseries: Set or clear security feature flags") Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
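The mix-up described above comes down to testing a behaviour flag against the wrong field of the result structure. A minimal sketch of the buggy and corrected checks is given here. The structure layout, function names, and bit positions are assumptions for illustration; they are chosen so that the two QEMU test values quoted in the message behave as described.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define BEHAV_FAVOUR_SECURITY   (1ULL << 63)
#define BEHAV_L1D_FLUSH_PR      (1ULL << 62)
#define BEHAV_BNDS_CHK_SPEC_BAR (1ULL << 61)

struct cpu_char_result {
    uint64_t character;   /* holds H_CPU_CHAR_* style flags */
    uint64_t behaviour;   /* holds H_CPU_BEHAV_* style flags */
};

/* Buggy form: a behaviour flag is tested against the 'character' field. */
static bool needs_bounds_check_barrier_buggy(const struct cpu_char_result *r)
{
    return (r->character & BEHAV_BNDS_CHK_SPEC_BAR) != 0;
}

/* Corrected form: the behaviour flag is tested against 'behaviour'. */
static bool needs_bounds_check_barrier(const struct cpu_char_result *r)
{
    return (r->behaviour & BEHAV_BNDS_CHK_SPEC_BAR) != 0;
}
```

With character set to 0 and behaviour set to 0xf000000000000000, only the corrected check reports that the barrier is needed, which corresponds to the second test case in the message.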
2018-03-31	powerpc/mm: Pass node id into create_section_mapping	Nicholas Piggin
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Move __map_kernel_page_nid() inside #ifdef SPARSEMEM_VMEMMAP] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/64s/radix: Allocate kernel page tables node-local if possible	Nicholas Piggin
	Try to allocate kernel page tables for direct mapping and vmemmap according to the node of the memory they will map. The node is not available for the linear map in early boot, so use range allocation to allocate the page tables from the region they map, which is effectively node-local. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix build error in radix__create_section_mapping()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/64s/radix: Split early page table mapping to its own function	Nicholas Piggin
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/64: Allocate per-cpu stacks node-local if possible	Nicholas Piggin
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-31	powerpc/64: Allocate pacas per node	Nicholas Piggin
	Per-node allocations are possible on 64s with radix that does not have the bolted SLB limitation. Hash would be able to do the same if all CPUs had the bottom of their node-local memory bolted as well. This is left as an exercise for the reader. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add dummy definition of boot_cpuid for !SMP] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64: Defer paca allocation until memory topology is discovered	Nicholas Piggin
	Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Rename the dummy allocate_pacas() to fix 32-bit build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/setup: Add cpu_to_phys_id array	Nicholas Piggin
	Build an array that finds hardware CPU number from logical CPU number in firmware CPU discovery. Use that rather than setting paca of other CPUs directly, to begin with. Subsequent patch will not have pacas allocated at this point. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix SMP=n build by adding #ifdef in arch_match_cpu_phys_id()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64: move default SPR recording	Nicholas Piggin
	Move this into the early setup code, and don't iterate over CPU masks. We don't want to call into sysfs so early from setup, and a future patch won't initialize CPU masks by the time this is called. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fold in incremental fix from Nick for DSCR handling] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/mm/numa: move numa topology discovery earlier	Nicholas Piggin
	Split sparsemem initialisation from basic numa topology discovery. Move the parsing earlier in boot, before pacas are allocated. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64s: Allocate slb_shadow structures individually	Nicholas Piggin
	slb_shadow structures are avoided for radix environment. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64s: Allocate LPPACAs individually	Nicholas Piggin
	We no longer allocate lppacas in an array, so this patch removes the 1kB static alignment for the structure, and enforces the PAPR alignment requirements at allocation time. We can not reduce the 1kB allocation size however, due to existing KVM hypervisors. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64: Use array of paca pointers and allocate pacas individually	Nicholas Piggin
	Change the paca array into an array of pointers to pacas. Allocate pacas individually. This allows flexibility in where the PACAs are allocated. Future work will allocate them node-local. Platforms that don't have address limits on PACAs would be able to defer PACA allocations until later in boot rather than allocate all possible ones up-front then freeing unused. This is slightly more overhead (one additional indirection) for cross CPU paca references, but those aren't too common. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/64s: Do not allocate lppaca if we are not virtualized	Nicholas Piggin
	The "lppaca" is a structure registered with the hypervisor. This is unnecessary when running on non-virtualised platforms. One field from the lppaca (pmcregs_in_use) is also used by the host, so move the host part out into the paca (lppaca field is still updated in guest mode). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix non-pseries build with some #ifdefs] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-30	powerpc/mpic: Check if cpu_possible() in mpic_physmask()	Michael Ellerman
	In mpic_physmask() we loop over all CPUs up to 32, then get the hard SMP processor id of that CPU. Currently that's possibly walking off the end of the paca array, but in a future patch we will change the paca array to be an array of pointers, and in that case we will get a NULL for missing CPUs and oops. eg: Unable to handle kernel paging request for data at address 0x88888888888888b8 Faulting instruction address: 0xc00000000004e380 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP .mpic_set_affinity+0x60/0x1a0 LR .irq_do_set_affinity+0x48/0x100 Fix it by checking the CPU is possible, this also fixes the code if there are gaps in the CPU numbering which probably never happens on mpic systems but who knows. Debugged-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
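Once the paca array holds pointers, a missing CPU shows up as a NULL entry, so the loop has to test each CPU before dereferencing. The following is a simplified user-space model of that idea, not the kernel function: the structure, the NULL test standing in for cpu_possible(), and the choice to skip rather than stop at a missing CPU are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MASK_CPUS 32

/* Hypothetical one-field stand-in for a per-CPU paca. */
struct paca_model {
    unsigned hw_cpu_id;
};

/*
 * Translate a mask of logical CPUs into a mask of hardware CPU ids.
 * ptrs[i] is NULL when logical CPU i does not exist.
 */
static uint32_t physmask(uint32_t cpumask, struct paca_model *const ptrs[],
                         unsigned nr)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < MAX_MASK_CPUS && i < nr; i++, cpumask >>= 1) {
        if (ptrs[i] == NULL)      /* stands in for a !cpu_possible(i) check */
            continue;
        if (cpumask & 1) {
            assert(ptrs[i]->hw_cpu_id < 32);
            mask |= 1u << ptrs[i]->hw_cpu_id;
        }
    }
    return mask;
}
```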
2018-03-29	Fix vector raw inintialization logic	Anton Ivanov
	Vector RAW in UML needs to BPF filter its own MAC only if QDISC_BYPASS has failed. If QDISC_BYPASS is successful, the frames originated locally are not visible to readers on the raw socket. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-03-29	Migrate vector timers to new timer API	Anton Ivanov
	The patches for the UML vector drivers were in-flight when the timer changes happened and were not covered by them. This change migrates vector_kern.c to use the new timer API. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-03-29	um: Compile with modern headers	Jason A. Donenfeld
	Recent libcs have gotten a bit more strict, so we actually need to include the right headers and use the right types. This enables UML to compile again. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2018-03-29	Merge tag 'kvm-ppc-next-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc	Radim Krčmář
	KVM PPC update for 4.17 - Improvements for the radix page fault handler for HV KVM on POWER9.
2018-03-29	perf/x86/pt, coresight: Clean up address filter structure	Alexander Shishkin
	This is a cosmetic patch that deals with the address filter structure's ambiguous fields 'filter' and 'range'. The former stands to mean that the filter's action should be to filter the traces to its address range if it's set or stop tracing if it's unset. This is confusing and hard on the eyes, so this patch replaces it with 'action' enum. The 'range' field is completely redundant (meaning that the filter is an address range as opposed to a single address trigger), as we can use zero size to mean the same thing. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180329120648.11902-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
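The cleanup described above replaces two booleans with one enum plus a size convention. A minimal sketch of such a structure is shown here, using hypothetical type and field names rather than the kernel's own definitions, and only the two actions the message mentions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical action enum: what the filter does when it matches. */
enum filter_action {
    FILTER_ACTION_STOP,     /* stop tracing */
    FILTER_ACTION_FILTER    /* restrict tracing to the address range */
};

struct addr_filter {
    uint64_t offset;
    uint64_t size;                /* 0 means a single-address trigger */
    enum filter_action action;
};

/* The former 'range' flag can be derived from the size. */
static bool filter_is_range(const struct addr_filter *f)
{
    return f->size != 0;
}
```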
2018-03-29	Merge branch 'perf/urgent' into perf/core	Ingo Molnar
	Conflicts: kernel/events/hw_breakpoint.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-29	Merge tag 'irqchip-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core	Thomas Gleixner
	Pull irqchip updates for 4.17 from Marc Zyngier: - New Qualcomm PDC irqchip - New Microsemi Ocelot irqchip - Suspend/resume support for some oddball GICv3 irqchip - Better GIC/GICv3 support for kexec - Various cleanups and fixes
2018-03-29	ARM: davinci: da8xx: simplify CFGCHIP regmap_config	David Lechner
	Since commit 8253bb3f8255 ("regmap: potentially duplicate the name string stored in regmap"), the name field of struct regmap_config is copied in __regmap_init(), so we no longer need to worry about keeping our own copy of the name. Just use a string literal in the initialization of da8xx_cfgchip_config instead of creating a separate variable for the name. Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2018-03-29	ARM: davinci: da8xx: fix oops in USB PHY driver due to stack allocated platform_data	David Lechner
	This fixes a possible kernel oops due to using stack allocated platform data for the USB PHY driver on DA8XX devices. If the platform device probe is deferred, then we get a corrupt pointer for the platform data. We now use a global static struct for the platform data so that the platform data pointer does not get written over. Fixes: bdec5a6b5789 ("ARM: da8xx: use platform data for CFGCHIP syscon regmap") Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2018-03-28	Merge tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux	Linus Torvalds
	Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.16. Apologies if this is a bit big at rc7, but they're all reasonably important fixes. None are actually for new code, so they aren't indicative of 4.16 being in bad shape from our point of view. - Fix missing AT_BASE_PLATFORM (in auxv) when we're using a new firmware interface for describing CPU features. - Fix lost pending interrupts due to a race in our interrupt soft-masking code. - A workaround for a nest MMU bug with TLB invalidations on Power9. - A workaround for broadcast TLB invalidations on Power9. - Fix a bug in our instruction SLB miss handler, when handling bad addresses (eg. >= TASK_SIZE), which could corrupt non-volatile user GPRs. Thanks to: Aneesh Kumar K.V, Balbir Singh, Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs powerpc/mm: Fixup tlbie vs store ordering issue on POWER9 powerpc/mm/radix: Move the functions that does the actual tlbie closer powerpc/mm/radix: Remove unused code powerpc/mm: Workaround Nest MMU bug with TLB invalidations powerpc/mm: Add tracking of the number of coprocessors using a context powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
2018-03-28	Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc	Linus Torvalds
	Pull ARM SoC fixes from Arnd Bergmann: "Here are a couple of last-minute fixes for 4.16, mostly for regressions. As usual, the majority are device tree changes: - USB 3 support on rk3399 didn't work and is being reverted for now - One fix for an old suspend/resume bug on rk3399 - A few regulator related fixes on Banana Pi M2, and on imx7d-sdb - A boot regression fix for all Aspeed SoCs failing to find their memory - One more dtc warning fix The other changes are: - A few updates to the MAINTAINERS file - A revert for an incorrect orion5x cleanup - Two power management fixes for OMAP" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: OMAP: Fix SRAM W+X mapping ARM: dts: aspeed: Add default memory node mailmap: Update email address for Gregory CLEMENT ARM: davinci: fix the GPIO lookup for omapl138-hawk MAINTAINERS: Update Tegra IOMMU maintainer ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name ARM: ux500: Fix PMU IRQ regression ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288 Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399" arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset) ARM: OMAP: Fix dmtimer init for omap1 MAINTAINERS: update email address for Maxime Ripard ARM: dts: sun6i: a31s: bpi-m2: add missing regulators ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
2018-03-28	KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending	Liran Alon
	When vCPU runs L2 and there is a pending event that requires an exit from L2 to L1 and nested_run_pending=1, vcpu_enter_guest() will request an immediate-exit from L2 (See req_immediate_exit). Since now handling of req_immediate_exit also makes sure to set KVM_REQ_EVENT, there is no need to also set it on vmx_vcpu_run() when nested_run_pending=1. This optimizes cases where VMRESUME was executed by L1 to enter L2 and there are no pending events that require exit from L2 to L1. Previously, this would have set KVM_REQ_EVENT unnecessarily. Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28	KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending	Liran Alon
	In case L2 VMExit to L0 during event-delivery, VMCS02 is filled with IDT-vectoring-info which vmx_complete_interrupts() makes sure to reinject before next resume of L2. While handling the VMExit in L0, an IPI could be sent by another L1 vCPU to the L1 vCPU which currently runs L2 and exited to L0. When L0 reaches vcpu_enter_guest() and calls inject_pending_event(), it will note that a previous event was re-injected to L2 (by IDT-vectoring-info) and therefore won't check if there are pending L1 events which require exit from L2 to L1. Thus, L0 enters L2 without immediate VMExit even though there are pending L1 events! This commit fixes the issue by making sure to check for L1 pending events even if a previous event was reinjected to L2 and bailing out from inject_pending_event() before evaluating a new pending event in case an event was already reinjected. The bug was observed by the following setup: * L0 is a 64CPU machine which runs KVM. * L1 is a 16CPU machine which runs KVM. * L0 & L1 run with APICv disabled. (Also reproduced with APICv enabled but easier to analyze below info with APICv disabled) * L1 runs a 16CPU L2 Windows Server 2012 R2 guest. During L2 boot, L1 hangs completely and analyzing the hang reveals that one L1 vCPU is holding KVM's mmu_lock and is waiting forever on an IPI that it has sent to another L1 vCPU. And all other L1 vCPUs are currently attempting to grab mmu_lock. Therefore, all L1 vCPUs are stuck forever (as L1 runs with kernel-preemption disabled). 
Observing /sys/kernel/debug/tracing/trace_pipe reveals the following series of events: (1) qemu-system-x86-19066 [030] kvm_nested_vmexit: rip: 0xfffff802c5dca82f reason: EPT_VIOLATION ext_inf1: 0x0000000000000182 ext_inf2: 0x00000000800000d2 ext_int: 0x00000000 ext_int_err: 0x00000000 (2) qemu-system-x86-19054 [028] kvm_apic_accept_irq: apicid f vec 252 (Fixed|edge) (3) qemu-system-x86-19066 [030] kvm_inj_virq: irq 210 (4) qemu-system-x86-19066 [030] kvm_entry: vcpu 15 (5) qemu-system-x86-19066 [030] kvm_exit: reason EPT_VIOLATION rip 0xffffe00069202690 info 83 0 (6) qemu-system-x86-19066 [030] kvm_nested_vmexit: rip: 0xffffe00069202690 reason: EPT_VIOLATION ext_inf1: 0x0000000000000083 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000 (7) qemu-system-x86-19066 [030] kvm_nested_vmexit_inject: reason: EPT_VIOLATION ext_inf1: 0x0000000000000083 ext_inf2: 0x0000000000000000 ext_int: 0x00000000 ext_int_err: 0x00000000 (8) qemu-system-x86-19066 [030] kvm_entry: vcpu 15 Which can be analyzed as follows: (1) L2 VMExit to L0 on EPT_VIOLATION during delivery of vector 0xd2. Therefore, vmx_complete_interrupts() will set KVM_REQ_EVENT and reinject a pending-interrupt of 0xd2. (2) L1 sends an IPI of vector 0xfc (CALL_FUNCTION_VECTOR) to destination vCPU 15. This will set relevant bit in LAPIC's IRR and set KVM_REQ_EVENT. (3) L0 reaches vcpu_enter_guest() which calls inject_pending_event() which notes that interrupt 0xd2 was reinjected and therefore calls vmx_inject_irq() and returns. Without checking for pending L1 events! Note that at this point, KVM_REQ_EVENT was cleared by vcpu_enter_guest() before calling inject_pending_event(). (4) L0 resumes L2 without immediate-exit even though there is a pending L1 event (The IPI pending in LAPIC's IRR). We have already reached the buggy scenario but events could be further analyzed: (5+6) L2 VMExit to L0 on EPT_VIOLATION. This time not during event-delivery. 
(7) L0 decides to forward the VMExit to L1 for further handling. (8) L0 resumes into L1. Note that because KVM_REQ_EVENT is cleared, the LAPIC's IRR is not examined and therefore the IPI is still not delivered into L1! Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
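The ordering problem analysed above can be reduced to a toy decision function. The sketch below is a deliberately simplified model with hypothetical names. It captures only the three facts the message reasons about and does not reproduce inject_pending_event() itself.

```c
#include <stdbool.h>

/* Toy model of the state considered before entering the guest. */
struct entry_state {
    bool running_l2;        /* the vCPU is about to resume the nested guest */
    bool event_reinjected;  /* an interrupted delivery is re-injected into L2 */
    bool l1_event_pending;  /* e.g. an IPI waiting in L1's virtual LAPIC */
};

/* Behaviour described as buggy: re-injection returns before L1 events are checked. */
static bool immediate_exit_before_fix(const struct entry_state *s)
{
    if (s->event_reinjected)
        return false;
    return s->running_l2 && s->l1_event_pending;
}

/* Behaviour described as fixed: pending L1 events are checked even after a re-injection. */
static bool immediate_exit_after_fix(const struct entry_state *s)
{
    return s->running_l2 && s->l1_event_pending;
}
```

In the traced scenario all three flags are true, so the first function lets L2 run on while the second requests an immediate exit to L1.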
2018-03-28	KVM: x86: Fix misleading comments on handling pending exceptions	Liran Alon
	The reason that exception.pending should block re-injection of NMI/interrupt is not described correctly in comment in code. Instead, it describes why a pending exception should be injected before a pending NMI/interrupt. Therefore, move currently present comment to code-block evaluating a new pending event which explains why exception.pending is evaluated first. In addition, create a new comment describing that exception.pending blocks re-injection of NMI/interrupt because the exception was queued by handling vmexit which was due to NMI/interrupt delivery. Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@orcle.com> [Used a comment from Sean J <sean.j.christopherson@intel.com>. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28	KVM: x86: Rename interrupt.pending to interrupt.injected	Liran Alon
	For exception & NMI events, KVM code uses the following coding convention: *) "pending" represents an event that should be injected to guest at some point but its side-effects have not yet occurred. *) "injected" represents an event whose side-effects have already occurred. However, interrupts don't conform to this coding convention. All current code flows mark interrupt.pending when its side-effects have already taken place (For example, bit moved from LAPIC IRR to ISR). Therefore, it makes sense to just rename interrupt.pending to interrupt.injected. This change follows logic of previous commit 664f8e26b00c ("KVM: X86: Fix loss of exception which has not yet been injected") which changed exception to follow this coding convention as well. It is important to note that in case !lapic_in_kernel(vcpu), interrupt.pending usage was and still is incorrect. In this case, interrupt.pending can only be set using one of the following ioctls: KVM_INTERRUPT, KVM_SET_VCPU_EVENTS and KVM_SET_SREGS. Looking at how QEMU uses these ioctls, one can see that QEMU uses them either to re-set an "interrupt.pending" state it has received from KVM (via KVM_GET_VCPU_EVENTS interrupt.pending or via KVM_GET_SREGS interrupt_bitmap) or by dispatching a new interrupt from QEMU's emulated LAPIC which reset bit in IRR and set bit in ISR before sending ioctl to KVM. So it seems that indeed "interrupt.pending" in this case is also supposed to represent "interrupt.injected". However, kvm_cpu_has_interrupt() & kvm_cpu_has_injectable_intr() are misusing (now named) interrupt.injected in order to return if there is a pending interrupt. This leads to nVMX/nSVM not being able to distinguish if it should exit from L2 to L1 on EXTERNAL_INTERRUPT on pending interrupt or should re-inject an injected interrupt. Therefore, add a FIXME at these functions for handling this issue. This patch introduces no semantic change. 
Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28	KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt	Liran Alon
	kvm_inject_realmode_interrupt() is called from one of the injection functions which writes event-injection to VMCS: vmx_queue_exception(), vmx_inject_irq() and vmx_inject_nmi(). All these functions are called just to cause an event-injection to guest. They are not responsible for manipulating the event-pending flag. The only purpose of kvm_inject_realmode_interrupt() should be to emulate real-mode interrupt-injection. This was also incorrect when called from vmx_queue_exception(). Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-28	x86/kvm: use Enlightened VMCS when running on Hyper-V	Vitaly Kuznetsov
	Enlightened VMCS is just a structure in memory, the main benefit besides avoiding somewhat slower VMREAD/VMWRITE is using clean field mask: we tell the underlying hypervisor which fields were modified since VMEXIT so there's no need to inspect them all. Tight CPUID loop test shows significant speedup: Before: 18890 cycles After: 8304 cycles Static key is being used to avoid performance penalty for non-Hyper-V deployments. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
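The clean field mask mentioned above can be pictured as one bit per group of fields: a write clears the bit for its group, and the hypervisor only re-reads groups whose bit is clear. The sketch below is a small user-space model of that idea. The group names, structure layout, and functions are hypothetical and do not reflect the real Enlightened VMCS definition.

```c
#include <stdint.h>

/* Hypothetical field groups, one bit each. */
enum {
    GROUP_GUEST_BASIC = 1u << 0,
    GROUP_CONTROL     = 1u << 1,
    GROUP_HOST        = 1u << 2,
    GROUP_ALL         = 0x7
};

struct evmcs_model {
    uint32_t clean_fields;  /* set bit = group unchanged since the last entry */
    uint64_t guest_rip;
    uint64_t host_rsp;
};

/* Guest-side write: store the value and mark its group as modified. */
static void model_write_guest_rip(struct evmcs_model *v, uint64_t rip)
{
    v->guest_rip = rip;
    v->clean_fields &= ~(uint32_t)GROUP_GUEST_BASIC;
}

/* Hypervisor side: count the groups that must be re-read, then mark all clean. */
static unsigned model_vmentry(struct evmcs_model *v)
{
    unsigned dirty = 0;

    for (uint32_t g = 1; g <= GROUP_HOST; g <<= 1)
        if (!(v->clean_fields & g))
            dirty++;

    v->clean_fields = GROUP_ALL;
    return dirty;
}
```

In this model a single write between two entries causes one group to be inspected rather than all three, which is the kind of saving the message attributes to the clean field mask.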