summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2017-08-28powerpc/configs: Update for CONFIG_DEBUG_FS being selected via CONFIG_RCU_TRACEMichael Ellerman
In commit 961518259b3b ("rcu: Enable RCU tracepoints by default to aid in debugging"), CONFIG_RCU_TRACE was made default y (if CONFIG_TREE_RCU=y, which it is for some of our configs). That in turn causes CONFIG_TREE_RCU_TRACE to be enabled, which selects CONFIG_DEBUG_FS. The end result is that CONFIG_DEBUG_FS is forced on, meaning we don't have to enable it in some of our configs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/configs: Drop no longer needed CONFIG_DEVKMEMMichael Ellerman
Since commit e334cd69fa65 ("Move CONFIG_DEVKMEM default to n") we no longer need to set CONFIG_DEVKMEM in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/configs: Drop no longer needed CONFIG_FHANDLEMichael Ellerman
Since commit f76be61755c5 ("Make CONFIG_FHANDLE default y") we no longer need to set CONFIG_FHANDLE in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEVMichael Ellerman
In commit 73d8ef76006b ("Input: mousedev - stop offering PS/2 to userspace by default") (Jan 2017), CONFIG_INPUT_MOUSEDEV was switched from default y to default n, with the explanation: Evdev interface has been available for many years and by now everyone is switched to using it, so let's stop offering /dev/input/mouseN and /dev/psaux by default. We had a number of configs which had it enabled, but going by the above explanation probably don't need it enabled anymore. So drop the last remnants of it from our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/configs: Drop unneeded CONFIG_CRYPTO_ANSI_CPRNGMichael Ellerman
Since commit 401e4238f35c ("crypto: rng - Make DRBG the default RNG") we no longer need to set CONFIG_CRYPTO_ANSI_CPRNG in our defconfigs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/configs: Update for symbol movement onlyMichael Ellerman
Update defconfigs for symbols that have moved around, without their value changing. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/oops: Line up NIP & MSR with other rowsMichael Ellerman
This is purely cosmetic, but does look nicer IMHO: Before: task: c000000001453400 task.stack: c000000001c6c000 NIP: c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220 REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000 CFAR: c0000000000b3488 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0 After: task: c000000001453400 task.stack: c000000001c6c000 NIP: c000000000a0fbfc LR: c000000000a0fbf4 CTR: c000000000ba6220 REGS: c0000001fffef820 TRAP: 0300 Not tainted (4.13.0-rc6-gcc-6.3.1-00234-g423af27f7d81-dirty) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 88088242 XER: 00000000 CFAR: c0000000000b34a4 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/oops: Print CR/XER on same line as MSRMichael Ellerman
Somehow we missed this when the pr_cont() changes went in. Fix CR/XER to go on the same line as MSR, as they have historically, eg: MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 4804408a XER: 20000000 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/oops: Use IS_ENABLED() for oops markersMichael Ellerman
Just because it looks less gross. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/oops: Print the kernel's endian in the oopsMichael Ellerman
Although the MSR tells you what endian you're in it's possible that isn't the same endian the kernel was built for, and if that happens you're usually having a very bad day. So print a marker to make it 100% clear which endian the kernel was built for. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/oops: Fix the oops markers to use pr_cont()Michael Ellerman
When we oops we print a few markers for significant config options such as PREEMPT, SMP etc. Currently these appear on separate lines because we're not using pr_cont() properly. Fix it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28powerpc/powernv: Fix build error in opal-imc.c when NUMA=nLABBE Corentin
When building a random powerpc kernel I hit this build error: arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment discards « const » qualifier from pointer target type [-Werror=discarded-qualifiers] l_cpumask = cpumask_of_node(nid); ^ This happens because when CONFIG_NUMA=n cpumask_of_node() returns a const pointer. This patch simply adds const to l_cpumask to fix this issue. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Flesh out change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-28Merge tag 'kvm-s390-master-4.13-2' into kvms390/nextChristian Borntraeger
Additional fixes on top of these two - missing inline assembly constraint - wrong exception handling are necessary
2017-08-28arm: dts: sunxi: Revert EMAC changesMaxime Ripard
Since the discussion is not settled yet for the EMAC, and that the release in getting really close, let's revert the changes for now, and we'll reintroduce them later. Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2017-08-28arm64: dts: allwinner: Revert EMAC changesMaxime Ripard
Since the discussion is not settled yet for the EMAC, and that the release in getting really close, let's revert the changes for now, and we'll reintroduce them later. Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2017-08-28Revert "ARM: dts: sun8i: h3: Enable dwmac-sun8i on the Beelink X2"Maxime Ripard
This reverts commit ddb56254ae52acff7bd7fbd8f963e79bffc324d4. The EMAC bindings have not stabilized yet, so we can't commit to keeping them stable. Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2017-08-28Merge 4.13-rc7 into char-misc-nextGreg Kroah-Hartman
We want the binder fix in here as well for testing and merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28m68knommu: remove dead codeAlexandre Belloni
As CONFIG_RTC_DRV_M5441x doesn't exist because the driver never made it upstream, there is no device to register. Remove code that is never compiled and init_BSP() as it doesn't do anything. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2017-08-28m68k: allow NULL clock for clk_get_rateJonas Gorski
Make the behaviour of clk_get_rate consistent with common clk's clk_get_rate by accepting NULL clocks as parameter. Some device drivers rely on this, and will cause an OOPS otherwise. Fixes: facdf0ed4f59 ("m68knommu: introduce basic clk infrastructure") Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-m68k@lists.linux-m68k.org Cc: linux-kernel@vger.kernel.org Reported-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
2017-08-27ARM: dts: rk3228-evb: Fix the compiling errorDavid Wu
This patch solves the following error: arch/arm/boot/dts/rk3228-evb.dtb: ERROR (phandle_references): Reference to non-existent node or label "phy0" Fixess db40f15b53e4 ("ARM: dts: rk3228-evb: Enable the integrated PHY for gmac") Signed-off-by: David Wu <david.wu@rock-chips.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-27blackfin: merge the two TWI header filesWolfram Sang
There seems to be no need for separate ones since all users include both files anyhow. Merge them because include/linux/i2c is to be deprecated. Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-08-27Merge branch 'i2c-mux/for-next' of https://github.com/peda-r/i2c-mux into ↵Wolfram Sang
i2c/for-4.14
2017-08-26media: arm: dts: omap3: N9/N950: Add AS3645A camera flashSakari Ailus
Add the as3645a flash controller to the DT source. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-08-26Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: one for an ldt_struct handling bug and a cherry-picked objtool fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix use-after-free of ldt_struct objtool: Fix '-mtune=atom' decoding support in objtool 2.0
2017-08-26kvm/x86: Avoid clearing the C-bit in rsvd_bits()Brijesh Singh
The following commit: d0ec49d4de90 ("kvm/x86/svm: Support Secure Memory Encryption within KVM") uses __sme_clr() to remove the C-bit in rsvd_bits(). rsvd_bits() is just a simple function to return some 1 bits. Applying a mask based on properties of the host MMU is incorrect. Additionally, the masks computed by __reset_rsvds_bits_mask also apply to guest page tables, where the C bit is reserved since we don't emulate SME. The fix is to clear the C-bit from rsvd_bits_mask array after it has been populated from __reset_rsvds_bits_mask() Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: paolo.bonzini@gmail.com Fixes: d0ec49d ("kvm/x86/svm: Support Secure Memory Encryption within KVM") Link: http://lkml.kernel.org/r/20170825205540.123531-1-brijesh.singh@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-26efi: Move efi_mem_type() to common codeJan Beulich
This follows efi_mem_attributes(), as it's similarly generic. Drop __weak from that one though (and don't introduce it for efi_mem_type() in the first place) to make clear that other overrides to these functions are really not intended. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Jan Beulich <JBeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170825155019.6740-5-ard.biesheuvel@linaro.org [ Resolved conflict with: f99afd08a45f: (efi: Update efi_mem_type() to return an error rather than 0) ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-26efi/libstub: Enable reset attack mitigationMatthew Garrett
If a machine is reset while secrets are present in RAM, it may be possible for code executed after the reboot to extract those secrets from untouched memory. The Trusted Computing Group specified a mechanism for requesting that the firmware clear all RAM on reset before booting another OS. This is done by setting the MemoryOverwriteRequestControl variable at startup. If userspace can ensure that all secrets are removed as part of a controlled shutdown, it can reset this variable to 0 before triggering a hardware reboot. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170825155019.6740-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-26Merge branch 'x86/mm' into efi/core, to pick up dependenciesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-26Merge branch 'linus' into x86/mm to pick up fixes and to fix conflictsIngo Molnar
Conflicts: arch/x86/kernel/head64.c arch/x86/mm/mmap.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull Paolo Bonzini: "Bugfixes for x86, PPC and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state KVM: x86: simplify handling of PKRU KVM: x86: block guest protection keys unless the host has them enabled KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9 KVM: s390: sthyi: fix specification exception detection KVM: s390: sthyi: fix sthyi inline assembly
2017-08-25Merge tag 'powerpc-4.13-8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix, to add a barrier in the switch_mm() code to make sure the mm cpumask update is ordered vs the MMU starting to load translations. As far as we know no one's actually hit the bug, but that's just luck. Thanks to Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Ensure cpumask update is ordered
2017-08-25futex: Remove duplicated code and fix undefined behaviourJiri Slaby
There is code duplicated over all architecture's headers for futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr, and comparison of the result. Remove this duplication and leave up to the arches only the needed assembly which is now in arch_futex_atomic_op_inuser. This effectively distributes the Will Deacon's arm64 fix for undefined behaviour reported by UBSAN to all architectures. The fix was done in commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump. And as suggested by Thomas, check for negative oparg too, because it was also reported to cause undefined behaviour report. Note that s390 removed access_ok check in d12a29703 ("s390/uaccess: remove pointless access_ok() checks") as access_ok there returns true. We introduce it back to the helper for the sake of simplicity (it gets optimized away anyway). Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
2017-08-25x86/intel_rdt: Turn off most RDT features on SkylakeTony Luck
Errata list is included in this document: https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/6th-gen-x-series-spec-update.pdf with more details in: https://www.intel.com/content/www/us/en/processors/xeon/scalable/xeon-scalable-spec-update.html But the tl;dr summary (using tags from first of the documents) is: SKZ4 MBM does not accurately track write bandwidth SKZ17 CMT counters may not count accurately SKZ18 CAT may not restrict cacheline allocation under certain conditions SKZ19 MBM counters may undercount Disable all these features on Skylake models. Users who understand the errata may re-enable using boot command line options. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua" <fenghua.yu@intel.com> Cc: Ravi V" <ravi.v.shankar@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Andi Kleen" <ak@linux.intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/3aea0a3bae219062c812668bd9b7b8f1a25003ba.1503512900.git.tony.luck@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-08-25x86/intel_rdt: Add command line options for resource director technologyTony Luck
Command line options allow us to ignore features that we don't want. Also we can re-enable options that have been disabled on a platform (so long as the underlying h/w actually supports the option). [ tglx: Marked the option array __initdata and the helper function __init ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua" <fenghua.yu@intel.com> Cc: Ravi V" <ravi.v.shankar@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Andi Kleen" <ak@linux.intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/0c37b0d4dbc30977a3c1cee08b66420f83662694.1503512900.git.tony.luck@intel.com
2017-08-25x86/intel_rdt: Move special case code for Haswell to a quirk functionTony Luck
No functional change, but lay the ground work for other per-model quirks. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua" <fenghua.yu@intel.com> Cc: Ravi V" <ravi.v.shankar@intel.com> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: "Stephane Eranian" <eranian@google.com> Cc: "Andi Kleen" <ak@linux.intel.com> Cc: "David Carrillo-Cisneros" <davidcc@google.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/f195a83751b5f8b1d8a78bd3c1914300c8fa3142.1503512900.git.tony.luck@intel.com
2017-08-26arm64: dts: uniphier: add reset controller node of analog amplifierKatsuhiro Suzuki
This patch adds reset controller node of analog signal amplifier core (ADAMV) for UniPhier LD11/LD20 SoCs. Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-25kvm: nVMX: Validate the virtual-APIC address on nested VM-entryJim Mattson
According to the SDM, if the "use TPR shadow" VM-execution control is 1, bits 11:0 of the virtual-APIC address must be 0 and the address should set any bits beyond the processor's physical-address width. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()Paul Mackerras
Nixiaoming pointed out that there is a memory leak in kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd() fails; the memory allocated for the kvmppc_spapr_tce_table struct is not freed, and nor are the pages allocated for the iommu tables. In addition, we have already incremented the process's count of locked memory pages, and this doesn't get restored on error. David Hildenbrand pointed out that there is a race in that the function checks early on that there is not already an entry in the stt->iommu_tables list with the same LIOBN, but an entry with the same LIOBN could get added between then and when the new entry is added to the list. This fixes all three problems. To simplify things, we now call anon_inode_getfd() before placing the new entry in the list. The check for an existing entry is done while holding the kvm->lock mutex, immediately before adding the new entry to the list. Finally, on failure we now call kvmppc_account_memlimit to decrement the process's count of locked memory pages. Reported-by: Nixiaoming <nixiaoming@huawei.com> Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25perf/x86: Export some PMU attributes in caps/ directoryAndi Kleen
It can be difficult to figure out for user programs what features the x86 CPU PMU driver actually supports. Currently it requires grepping in dmesg, but dmesg is not always available. This adds a caps directory to /sys/bus/event_source/devices/cpu/, similar to the caps already used on intel_pt, which can be used to discover the available capabilities cleanly. Three capabilities are defined: - pmu_name: Underlying CPU name known to the driver - max_precise: Max precise level supported - branches: Known depth of LBR. Example: % grep . /sys/bus/event_source/devices/cpu/caps/* /sys/bus/event_source/devices/cpu/caps/branches:32 /sys/bus/event_source/devices/cpu/caps/max_precise:3 /sys/bus/event_source/devices/cpu/caps/pmu_name:skylake Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170822185201.9261-3-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25perf/x86: Only show format attributes when supportedAndi Kleen
Only show the Intel format attributes in sysfs when the feature is actually supported with the current model numbers. This allows programs to probe what format attributes are available, and give a sensible error message to users if they are not. This handles near all cases for intel attributes since Nehalem, except the (obscure) case when the model number if known, but PEBS is disabled in PERF_CAPABILITIES. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170822185201.9261-2-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25perf/x86: Fix data source decoding for SkylakeAndi Kleen
Skylake changed the encoding of the PEBS data source field. Some combinations are not available anymore, but some new cases e.g. for L4 cache hit are added. Fix up the conversion table for Skylake, similar as had been done for Nehalem. On Skylake server the encoding for L4 actually means persistent memory. Handle this case too. To properly describe it in the abstracted perf format I had to add some new fields. Since a hit can have only one level add a new field that is an enumeration, not a bit field to describe the level. It can describe any level. Some numbers are also used to describe PMEM and LFB. Also add a new generic remote flag that can be combined with the generic level to signify a remote cache. And there is an extension field for the snoop indication to handle the Forward state. I didn't add a generic flag for hops because it's not needed for Skylake. I changed the existing encodings for older CPUs to also fill in the new level and remote fields. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/20170816222156.19953-3-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25perf/x86: Move Nehalem PEBS code to flagAndi Kleen
Minor cleanup: use an explicit x86_pmu flag to handle the missing Lock / TLB information on Nehalem, instead of always checking the model number for each PEBS sample. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/20170816222156.19953-2-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25x86/mm: Fix use-after-free of ldt_structEric Biggers
The following commit: 39a0526fb3f7 ("x86/mm: Factor out LDT init from context init") renamed init_new_context() to init_new_context_ldt() and added a new init_new_context() which calls init_new_context_ldt(). However, the error code of init_new_context_ldt() was ignored. Consequently, if a memory allocation in alloc_ldt_struct() failed during a fork(), the ->context.ldt of the new task remained the same as that of the old task (due to the memcpy() in dup_mm()). ldt_struct's are not intended to be shared, so a use-after-free occurred after one task exited. Fix the bug by making init_new_context() pass through the error code of init_new_context_ldt(). This bug was found by syzkaller, which encountered the following splat: BUG: KASAN: use-after-free in free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116 Read of size 4 at addr ffff88006d2cb7c8 by task kworker/u9:0/3710 CPU: 1 PID: 3710 Comm: kworker/u9:0 Not tainted 4.13.0-rc4-next-20170811 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x24e/0x340 mm/kasan/report.c:409 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429 free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116 free_ldt_struct arch/x86/kernel/ldt.c:173 [inline] destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171 destroy_context arch/x86/include/asm/mmu_context.h:157 [inline] __mmdrop+0xe9/0x530 kernel/fork.c:889 mmdrop include/linux/sched/mm.h:42 [inline] exec_mmap fs/exec.c:1061 [inline] flush_old_exec+0x173c/0x1ff0 fs/exec.c:1291 load_elf_binary+0x81f/0x4ba0 fs/binfmt_elf.c:855 search_binary_handler+0x142/0x6b0 fs/exec.c:1652 exec_binprm fs/exec.c:1694 [inline] do_execveat_common.isra.33+0x1746/0x22e0 fs/exec.c:1816 do_execve+0x31/0x40 fs/exec.c:1860 call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Allocated by task 3700: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551 kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3627 kmalloc include/linux/slab.h:493 [inline] alloc_ldt_struct+0x52/0x140 arch/x86/kernel/ldt.c:67 write_ldt+0x7b7/0xab0 arch/x86/kernel/ldt.c:277 sys_modify_ldt+0x1ef/0x240 arch/x86/kernel/ldt.c:307 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed by task 3700: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:447 set_track mm/kasan/kasan.c:459 [inline] kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524 __cache_free mm/slab.c:3503 [inline] kfree+0xca/0x250 mm/slab.c:3820 free_ldt_struct.part.2+0xdd/0x150 arch/x86/kernel/ldt.c:121 free_ldt_struct arch/x86/kernel/ldt.c:173 [inline] destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171 destroy_context arch/x86/include/asm/mmu_context.h:157 [inline] __mmdrop+0xe9/0x530 kernel/fork.c:889 mmdrop include/linux/sched/mm.h:42 [inline] __mmput kernel/fork.c:916 [inline] mmput+0x541/0x6e0 kernel/fork.c:927 copy_process.part.36+0x22e1/0x4af0 kernel/fork.c:1931 copy_process kernel/fork.c:1546 [inline] _do_fork+0x1ef/0xfb0 kernel/fork.c:2025 SYSC_clone kernel/fork.c:2135 [inline] SyS_clone+0x37/0x50 kernel/fork.c:2129 do_syscall_64+0x26c/0x8c0 arch/x86/entry/common.c:287 return_from_SYSCALL_64+0x0/0x7a Here is a C reproducer: #include <asm/ldt.h> #include <pthread.h> #include <signal.h> #include <stdlib.h> #include <sys/syscall.h> #include <sys/wait.h> #include <unistd.h> static void *fork_thread(void *_arg) { fork(); } int main(void) { struct user_desc desc = { .entry_number = 8191 }; syscall(__NR_modify_ldt, 1, &desc, sizeof(desc)); for (;;) { if (fork() == 0) { pthread_t t; srand(getpid()); pthread_create(&t, NULL, fork_thread, NULL); usleep(rand() % 10000); syscall(__NR_exit_group, 0); } wait(NULL); } } Note: the reproducer takes advantage of the fact that alloc_ldt_struct() may use vmalloc() to allocate a large ->entries array, and after commit: 5d17a73a2ebe ("vmalloc: back off when the current task is killed") it is possible for userspace to fail a task's vmalloc() by sending a fatal signal, e.g. via exit_group(). It would be more difficult to reproduce this bug on kernels without that commit. This bug only affected kernels with CONFIG_MODIFY_LDT_SYSCALL=y. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> [v4.6+] Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Fixes: 39a0526fb3f7 ("x86/mm: Factor out LDT init from context init") Link: http://lkml.kernel.org/r/20170824175029.76040-1-ebiggers3@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.statePaolo Bonzini
The host pkru is restored right after vcpu exit (commit 1be0e61), so KVM_GET_XSAVE will return the host PKRU value instead. Fix this by using the guest PKRU explicitly in fill_xsave and load_xsave. This part is based on a patch by Junkang Fu. The host PKRU data may also not match the value in vcpu->arch.guest_fpu.state, because it could have been changed by userspace since the last time it was saved, so skip loading it in kvm_load_guest_fpu. Reported-by: Junkang Fu <junkang.fjk@alibaba-inc.com> Cc: Yang Zhang <zy107165@alibaba-inc.com> Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870 Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25KVM: x86: simplify handling of PKRUPaolo Bonzini
Move it to struct kvm_arch_vcpu, replacing guest_pkru_valid with a simple comparison against the host value of the register. The write of PKRU in addition can be skipped if the guest has not enabled the feature. Once we do this, we need not test OSPKE in the host anymore, because guest_CR4.PKE=1 implies host_CR4.PKE=1. The static PKU test is kept to elide the code on older CPUs. Suggested-by: Yang Zhang <zy107165@alibaba-inc.com> Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870 Cc: stable@vger.kernel.org Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25KVM: x86: block guest protection keys unless the host has them enabledPaolo Bonzini
If the host has protection keys disabled, we cannot read and write the guest PKRU---RDPKRU and WRPKRU fail with #GP(0) if CR4.PKE=0. Block the PKU cpuid bit in that case. This ensures that guest_CR4.PKE=1 implies host_CR4.PKE=1. Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870 Cc: stable@vger.kernel.org Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-25Merge branch 'x86/asm' into x86/apicThomas Gleixner
Pick up dependent changes to avoid merge conflicts