summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2016-07-23x86/insn: remove pcommitDan Williams
The pcommit instruction is being deprecated in favor of either ADR (asynchronous DRAM refresh: flush-on-power-fail) at the platform level, or posted-write-queue flush addresses as defined by the ACPI 6.x NFIT (NVDIMM Firmware Interface Table). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Acked-by: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-23Revert "KVM: x86: add pcommit support"Dan Williams
This reverts commit 8b3e34e46aca9b6d349b331cd9cf71ccbdc91b2e. Given the deprecation of the pcommit instruction, the relevant VMX features and CPUID bits are not going to be rolled into the SDM. Remove their usage from KVM. Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-12pmem: kill __pmem address spaceDan Williams
The __pmem address space was meant to annotate codepaths that touch persistent memory and need to coordinate a call to wmb_pmem(). Now that wmb_pmem() is gone, there is little need to keep this annotation. Cc: Christoph Hellwig <hch@lst.de> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-12pmem: kill wmb_pmem()Dan Williams
All users have been replaced with flushing in the pmem driver. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-10Merge tag 'powerpc-4.7-3Michael Ellerman:' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from - ptrace: Fix out of bounds array access warning from Khem Raj - pseries: Fix PCI config address for DDW from Gavin Shan - pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added from Michael Ellerman - of: fix autoloading due to broken modalias with no 'compatible' from Wolfram Sang - radix: Fix always false comparison against MMU_NO_CONTEXT from Aneesh Kumar K.V - hash: Compute the segment size correctly for ISA 3.0 from Aneesh Kumar K.V - nohash: Fix build break with 64K pages from Michael Ellerman * tag 'powerpc-4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/nohash: Fix build break with 64K pages powerpc/mm/hash: Compute the segment size correctly for ISA 3.0 powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT of: fix autoloading due to broken modalias with no 'compatible' powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added powerpc/pseries: Fix PCI config address for DDW powerpc/ptrace: Fix out of bounds array access warning
2016-06-10Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Will Deacon: "A fix for an issue that Alex saw whilst swapping with hardware access/dirty bit support enabled in the kernel: Fix a failure to fault in old pages on a write when CONFIG_ARM64_HW_AFDBM is enabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: always take dirty state from new pte in ptep_set_access_flags
2016-06-10Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes from all around the map, plus a commit that introduces a new header of Intel model name symbols (unused) that will make the next merge window easier" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Fix incorrect pointers in ioapic_setup_resources() x86/entry/traps: Don't force in_interrupt() to return true in IST handlers x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer models x86/cpu/intel: Introduce macros for Intel family numbers x86, build: copy ldlinux.c32 to image.iso x86/msr: Use the proper trace point conditional for writes
2016-06-10Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A handful of tooling fixes, two PMU driver fixes and a cleanup of redundant code that addresses a security analyzer false positive" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Remove a redundant check perf/x86/intel/uncore: Remove SBOX support for Broadwell server perf ctf: Convert invalid chars in a string before set value perf record: Fix crash when kptr is restricted perf symbols: Check kptr_restrict for root perf/x86/intel/rapl: Fix pmus free during cleanup
2016-06-10x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()Rui Wang
On a 4-socket Brickland system, hot-removing one ioapic is fine. Hot-removing the 2nd one causes panic in mp_unregister_ioapic() while calling release_resource(). It is because the iomem_res pointer has already been released when removing the first ioapic. To explain the use of &res[num] here: res is assigned to ioapic_resources, and later in ioapic_insert_resources() we do: struct resource *r = ioapic_resources; for_each_ioapic(i) { insert_resource(&iomem_resource, r); r++; } Here 'r' is treated as an arry of 'struct resource', and the r++ ensures that each element of the array is inserted separately. Thus we should call release_resouce() on each element at &res[num]. Fix it by assigning the correct pointers to ioapics[i].iomem_res in ioapic_setup_resources(). Signed-off-by: Rui Wang <rui.y.wang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: tony.luck@intel.com Cc: linux-pci@vger.kernel.org Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1465369193-4816-3-git-send-email-rui.y.wang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-10x86/entry/traps: Don't force in_interrupt() to return true in IST handlersAndy Lutomirski
Forcing in_interrupt() to return true if we're not in a bona fide interrupt confuses the softirq code. This fixes warnings like: NOHZ: local_softirq_pending 282 ... which can happen when running things like selftests/x86. This will change perf's static percpu buffer usage in IST context. I think this is okay, and it's changing the behavior to match historical (pre-4.0) behavior. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 959274753857 ("x86, traps: Track entry into and exit from IST context") Link: http://lkml.kernel.org/r/cdc215f94d118d691d73df35275022331156fb45.1464130360.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-10powerpc/nohash: Fix build break with 64K pagesMichael Ellerman
Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are allocating fragments" renamed page_table_free() to pte_fragment_free(). One occurrence was mistyped as pte_fragment_fre(). This only breaks the nohash 64K page build, which is not the default or enabled in any defconfig. Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are allocating fragments") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-09Merge tag 'arc-4.7-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - Revert of ll-sc backoff retry workaround in atomics/spinlocks as hardware is now proven to work just fine - Typo fixes (Thanks Andrea Gelmini) - Removal of obsolete DT property (Alexey) - Other minor fixes * tag 'arc-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff" Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle" Revert "ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff" ARC: don't enable DISCONTIGMEM unconditionally ARC: [intc-compact] simplify code for 2 priority levels arc: Get rid of root core-frequency property Fix typos
2016-06-08x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer modelsBorislav Petkov
We need to reenable the topology extensions CPUID leafs on newer models too, if BIOS has disabled them, as we rely on them to get proper compute unit topology. Make the printk a once thing, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rui Huang <ray.huang@amd.com> Cc: Sherry Hurwitz <sherry.hurwitz@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-hwmon@vger.kernel.org Link: http://lkml.kernel.org/r/1464775468-23355-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08x86/cpu/intel: Introduce macros for Intel family numbersDave Hansen
Problem: We have a boatload of open-coded family-6 model numbers. Half of them have these model numbers in hex and the other half in decimal. This makes grepping for them tons of fun, if you were to try. Solution: Consolidate all the magic numbers. Put all the definitions in one header. The names here are closely derived from the comments describing the models from arch/x86/events/intel/core.c. We could easily make them shorter by doing things like s/SANDYBRIDGE/SNB/, but they seemed fine even with the longer versions to me. Do not take any of these names too literally, like "DESKTOP" or "MOBILE". These are all colloquial names and not precise descriptions of everywhere a given model will show up. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: jacob.jun.pan@intel.com Cc: linux-acpi@vger.kernel.org Cc: linux-edac@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-08arm64: mm: always take dirty state from new pte in ptep_set_access_flagsWill Deacon
Commit 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM") ensured that pte flags are updated atomically in the face of potential concurrent, hardware-assisted updates. However, Alex reports that: | This patch breaks swapping for me. | In the broken case, you'll see either systemd cpu time spike (because | it's stuck in a page fault loop) or the system hang (because the | application owning the screen is stuck in a page fault loop). It turns out that this is because the 'dirty' argument to ptep_set_access_flags is always 0 for read faults, and so we can't use it to set PTE_RDONLY. The failing sequence is: 1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte 2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF 3. A read faults due to the missing access flag 4. ptep_set_access_flags is called with dirty = 0, due to the read fault 5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!) 6. A write faults, but pte_write is true so we get stuck The solution is to check the new page table entry (as would be done by the generic, non-atomic definition of ptep_set_access_flags that just calls set_pte_at) to establish the dirty state. Cc: <stable@vger.kernel.org> # 4.3+ Fixes: 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Alexander Graf <agraf@suse.de> Tested-by: Alexander Graf <agraf@suse.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-08powerpc/mm/hash: Compute the segment size correctly for ISA 3.0Aneesh Kumar K.V
PowerISA 3.0 encodes the segment size in the second half of hash page table entry. Update hpte_decode() accordingly. Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-08powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXTAneesh Kumar K.V
In some of the radix TLB flush routines, we use a local to store the mm->context.id, AKA the PID. Currently we use an int, but the PID is unsigned long, so large values of PID will be truncated. In particular MMU_NO_CONTEXT is -1, which means all our comparisons against that value can never be true. This means we'll issue TLB flushes when we shouldn't on radix enabled machines. Fix it by using an unsigned long for the local. Discovered by Coverity. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> [mpe: Write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Fixes for crap of assorted ages: EOPENSTALE one is 4.2+, autofs one is 4.6, d_walk - 3.2+. The atomic_open() and coredump ones are regressions from this window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coredump: fix dumping through pipes fix a regression in atomic_open() fix d_walk()/non-delayed __d_free() race autofs braino fix for do_last() fix EOPENSTALE bug in do_last()
2016-06-07coredump: fix dumping through pipesMateusz Guzik
The offset in the core file used to be tracked with ->written field of the coredump_params structure. The field was retired in favour of file->f_pos. However, ->f_pos is not maintained for pipes which leads to breakage. Restore explicit tracking of the offset in coredump_params. Introduce ->pos field for this purpose since ->written was already reused. Fixes: a00839395103 ("get rid of coredump_params->written"). Reported-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-08powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was addedMichael Ellerman
The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") added a new PVR mask & value to the start of the ibm_architecture_vec[] array. However it missed the fact that further down in the array, we hard code the offset of one of the fields, and then at boot use that value to patch the value in the array. This means every update to the array must also update the #define, ugh. This means that on pseries machines we will misreport to firmware the number of cores we support, by a factor of threads_per_core. Fix it for now by updating the #define. Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-07Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "This finally removes the CLK_IS_ROOT flag by picking up the last few stragglers that didn't get merged by anyone this time around. Better to do it now than wait for another one to pop up. There's also a minor maintainers update and a Kconfig fix" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: nxp: Select MFD_SYSCON for creg driver MAINTAINERS: Add file patterns for clock device tree bindings clk: Remove CLK_IS_ROOT flag clk: microchip: Remove CLK_IS_ROOT powerpc/512x: clk: Remove CLK_IS_ROOT vexpress/spc: Remove CLK_IS_ROOT
2016-06-07x86, build: copy ldlinux.c32 to image.isoH. Peter Anvin
For newer versions of Syslinux, we need ldlinux.c32 in addition to isolinux.bin to reside on the boot disk, so if the latter is found, copy it, too, to the isoimage tree. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Linux Stable Tree <stable@vger.kernel.org>
2016-06-06x86/msr: Use the proper trace point conditional for writesDr. David Alan Gilbert
The msr tracing for writes is incorrectly conditional on the read trace. Fixes: 7f47d8cc039f "x86, tracing, perf: Add trace point for MSR accesses" Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: stable@vger.kernel.org Cc: ak@linux.intel.com Link: http://lkml.kernel.org/r/1464976859-21850-1-git-send-email-dgilbert@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-06-06powerpc/pseries: Fix PCI config address for DDWGavin Shan
In commit 8445a87f7092 "powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism", the PE address was replaced with the PCI config address in order to remove dependency on EEH. According to PAPR spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus number and "SS" is the combination of slot and function number. This fixes the PCI address passed to DDW RTAS calls. Fixes: 8445a87f7092 ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism") Cc: stable@vger.kernel.org # v3.4+ Reported-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-06powerpc/ptrace: Fix out of bounds array access warningKhem Raj
gcc-6 correctly warns about a out of bounds access arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Warray-bounds] offsetof(struct thread_fp_state, fpr[32][0])); ^ check the end of array instead of beginning of next element to fix this Signed-off-by: Khem Raj <raj.khem@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Segher Boessenkool <segher@kernel.crashing.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-05parisc: Move die_if_kernel() prototype into traps.h headerHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2016-06-05parisc: Fix pagefault crash in unaligned __get_user() callHelge Deller
One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
2016-06-05parisc: Fix printk time during bootHelge Deller
Avoid showing invalid printk time stamps during boot. Signed-off-by: Helge Deller <deller@gmx.de> Reviewed-by: Aaro Koskinen <aaro.koskinen@iki.fi>
2016-06-04parisc: Fix backtrace on PA-RISCMikulas Patocka
This patch fixes backtrace on PA-RISC There were several problems: 1) The code that decodes instructions handles instructions that subtract from the stack pointer incorrectly. If the instruction subtracts the number X from the stack pointer the code increases the frame size by (0x100000000-X). This results in invalid accesses to memory and recursive page faults. 2) Because gcc reorders blocks, handling instructions that subtract from the frame pointer is incorrect. For example, this function int f(int a) { if (__builtin_expect(a, 1)) return a; g(); return a; } is compiled in such a way, that the code that decreases the stack pointer for the first "return a" is placed before the code for "g" call. If we recognize this decrement, we mistakenly believe that the frame size for the "g" call is zero. To fix problems 1) and 2), the patch doesn't recognize instructions that decrease the stack pointer at all. To further safeguard the unwind code against nonsense values, we don't allow frame size larger than Total_frame_size. 3) The backtrace is not locked. If stack dump races with module unload, invalid table can be accessed. This patch adds a spinlock when processing module tables. Note, that for correct backtrace, you need recent binutils. Binutils 2.18 from Debian 5 produce garbage unwind tables. Binutils 2.21 work better (it sometimes forgets function frames, but at least it doesn't generate garbage). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de>
2016-06-03Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - a few simple fixes for fallout from the recent gic-v3 changes - a workaround for a Cavium thunderX erratum - a bugfix for the pic32 irqchip to make external interrupts work proper - a missing return value in the generic IPI management code * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-pic32-evic: Fix bug with external interrupts. irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 irqchip/gic-v3: Fix quiescence check in gic_enable_redist irqchip/gic-v3: Fix copy+paste mistakes in defines irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding mask genirq: Fix missing return value in irq_destroy_ipi()
2016-06-03Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fix from Russell King: "Just one fix to the ptrace code, spotted by Simon Marchi, where if a thread migrates to a different CPU and the VFP registers are changed through ptrace, the application doesn't see the updated VFP registers" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: fix PTRACE_SETVFPREGS on SMP systems
2016-06-03Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main thing here is reviving hugetlb support using contiguous ptes, which we ended up reverting at the last minute in 4.5 pending a fix which went into the core mm/ code during the recent merge window. - Revert a previous revert and get hugetlb going with contiguous hints - Wire up missing compat syscalls - Enable CONFIG_SET_MODULE_RONX by default - Add missing line to our compat /proc/cpuinfo output - Clarify levels in our page table dumps - Fix booting with RANDOMIZE_TEXT_OFFSET enabled - Misc fixes to the ARM CPU PMU driver (refcounting, probe failure) - Remove some dead code and update a comment" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled arm64: move {PAGE,CONT}_SHIFT into Kconfig arm64: mm: dump: log span level arm64: update stale PAGE_OFFSET comment drivers/perf: arm_pmu: Avoid leaking pmu->irq_affinity on error drivers/perf: arm_pmu: Defer the setting of __oprofile_cpu_pmu drivers/perf: arm_pmu: Fix reference count of a device_node in of_pmu_irq_cfg arm64: report CPU number in bad_mode arm64: unistd32.h: wire up missing syscalls for compat tasks arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks arm64: enable CONFIG_SET_MODULE_RONX by default arm64: Remove orphaned __addr_ok() definition Revert "arm64: hugetlb: partial revert of 66b3923a1a0f"
2016-06-03Merge tag 'powerpc-4.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Handle RTAS delay requests in configure_bridge from Russell Currey - Refactor the configure_bridge RTAS tokens from Russell Currey - Fix definition of SIAR and SDAR registers from Thomas Huth - Use privileged SPR number for MMCR2 from Thomas Huth - Update LPCR only if it is powernv from Aneesh Kumar K.V - Fix the reference bit update when handling hash fault from Aneesh Kumar K.V - Add missing tlb flush from Aneesh Kumar K.V - Add POWER8NVL support to ibm,client-architecture-support call from Thomas Huth * tag 'powerpc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call powerpc/mm/radix: Add missing tlb flush powerpc/mm/hash: Fix the reference bit update when handling hash fault powerpc/mm/radix: Update LPCR only if it is powernv powerpc: Use privileged SPR number for MMCR2 powerpc: Fix definition of SIAR and SDAR registers powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge
2016-06-03arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabledMark Rutland
With ARM64_64K_PAGES and RANDOMIZE_TEXT_OFFSET enabled, we hit the following issue on the boot: kernel BUG at arch/arm64/mm/mmu.c:480! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.6.0 #310 Hardware name: ARM Juno development board (r2) (DT) task: ffff000008d58a80 ti: ffff000008d30000 task.ti: ffff000008d30000 PC is at map_kernel_segment+0x44/0xb0 LR is at paging_init+0x84/0x5b0 pc : [<ffff000008c450b4>] lr : [<ffff000008c451a4>] pstate: 600002c5 Call trace: [<ffff000008c450b4>] map_kernel_segment+0x44/0xb0 [<ffff000008c451a4>] paging_init+0x84/0x5b0 [<ffff000008c42728>] setup_arch+0x198/0x534 [<ffff000008c40848>] start_kernel+0x70/0x388 [<ffff000008c401bc>] __primary_switched+0x30/0x74 Commit 7eb90f2ff7e3 ("arm64: cover the .head.text section in the .text segment mapping") removed the alignment between the .head.text and .text sections, and used the _text rather than the _stext interval for mapping the .text segment. Prior to this commit _stext was always section aligned and didn't cause any issue even when RANDOMIZE_TEXT_OFFSET was enabled. Since that alignment has been removed and _text is used to map the .text segment, we need ensure _text is always page aligned when RANDOMIZE_TEXT_OFFSET is enabled. This patch adds logic to TEXT_OFFSET fuzzing to ensure that the offset is always aligned to the kernel page size. To ensure this, we rely on the PAGE_SHIFT being available via Kconfig. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 7eb90f2ff7e3 ("arm64: cover the .head.text section in the .text segment mapping") Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03arm64: move {PAGE,CONT}_SHIFT into KconfigMark Rutland
In some cases (e.g. the awk for CONFIG_RANDOMIZE_TEXT_OFFSET) we would like to make use of PAGE_SHIFT outside of code that can include the usual header files. Add a new CONFIG_ARM64_PAGE_SHIFT for this, likewise with ARM64_CONT_SHIFT for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03arm64: mm: dump: log span levelMark Rutland
The page table dump code logs spans of entries at the same level (pgd/pud/pmd/pte) which have the same attributes. While we log the (decoded) attributes, we don't log the level, which leaves the output ambiguous and/or confusing in some cases. For example: 0xffff800800000000-0xffff800980000000 6G RW NX SHD AF BLK UXN MEM/NORMAL If using 4K pages, this may describe a span of 6 1G block entries at the PGD/PUD level, or 3072 2M block entries at the PMD level. This patch adds the page table level to each output line, removing this ambiguity. For the example above, this will produce: 0xffffffc800000000-0xffffffc980000000 6G PUD RW NX SHD AF BLK UXN MEM/NORMAL When 3 level tables are in use, and we use the asm-generic/nopud.h definitions, the dump code treats each entry in the PGD as a 1 element table at the PUD level, and logs spans as being PUDs, which can be confusing. To counteract this, the "PUD" mnemonic is replaced with "PGD" when CONFIG_PGTABLE_LEVELS <= 3. Likewise for "PMD" when CONFIG_PGTABLE_LEVELS <= 2. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Huang Shijie <shijie.huang@arm.com> Cc: Laura Abbott <labbott@fedoraproject.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03arm64: update stale PAGE_OFFSET commentMark Rutland
Commit ab893fb9f1b17f02 ("arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region") logically split KIMAGE_VADDR from PAGE_OFFSET, and since commit f9040773b7bbbd9e ("arm64: move kernel image to base of vmalloc area") the two have been distinct values. Unfortunately, neither commit updated the comment above these definitions, which now erroneously states that PAGE_OFFSET is the start of the kernel image rather than the start of the linear mapping. This patch fixes said comment, and introduces an explanation of KIMAGE_VADDR. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03arm64: report CPU number in bad_modeMark Rutland
If we take an exception we don't expect (e.g. SError), we report this in the bad_mode handler with pr_crit. Depending on the configured log level, we may or may not log additional information in functions called subsequently. Notably, the messages in dump_stack (including the CPU number) are printed with KERN_DEFAULT and may not appear. Some exceptions have an IMPLEMENTATION DEFINED ESR_ELx.ISS encoding, and knowing the CPU number is crucial to correctly decode them. To ensure that this is always possible, we should log the CPU number along with the ESR_ELx value, so we are not reliant on subsequent logs or additional printk configuration options. This patch logs the CPU number in bad_mode such that it is possible for a developer to decode these exceptions, provided access to sufficient documentation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Al Grant <Al.Grant@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Martin <dave.martin@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-03perf/x86/intel/uncore: Remove SBOX support for Broadwell serverKan Liang
There was a report that on certain Broadwell-EP systems writing any bit of the SBOX PMU initialization MSR would #GP at boot. This did not happen on all systems. My test systems booted fine. Considering both DE and EP may have such issues, this patch removes SBOX support for all Broadwell platforms for now. Reported-and-tested-by: Mark van Dijk <mark@voidzero.net> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1464347540-5763-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "ARM: - two fixes for 4.6 vgic [Christoffer] (cc stable) - six fixes for 4.7 vgic [Marc] x86: - six fixes from syzkaller reports [Paolo] (two of them cc stable) - allow OS X to boot [Dmitry] - don't trust compilers [Nadav]" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR KVM: Handle MSR_IA32_PERF_CTL KVM: x86: avoid write-tearing of TDP KVM: arm/arm64: vgic-new: Removel harmful BUG_ON arm64: KVM: vgic-v3: Relax synchronization when SRE==1 arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1 arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value KVM: arm/arm64: vgic-v3: Always resample level interrupts KVM: arm/arm64: vgic-v2: Always resample level interrupts KVM: arm/arm64: vgic-v3: Clear all dirty LRs KVM: arm/arm64: vgic-v2: Clear all dirty LRs
2016-06-02irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144Ganapatrao Kulkarni
The erratum fixes the hang of ITS SYNC command by avoiding inter node io and collections/cpu mapping on thunderx dual-socket platform. This fix is only applicable for Cavium's ThunderX dual-socket platform. Reviewed-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-02KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGSPaolo Bonzini
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS time, and the next KVM_RUN oopses: general protection fault: 0000 [#1] SMP CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 [...] Call Trace: [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm] [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71 Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40 RSP <ffff88005836bd50> Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[8]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); memcpy(&dr, "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72" "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8" "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9" "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb", 48); r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr); r[6] = ioctl(r[4], KVM_RUN, 0); } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: fail KVM_SET_VCPU_EVENTS with invalid exception numberPaolo Bonzini
This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return EINVAL. It causes a WARN from exception_type: WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]() CPU: 3 PID: 16732 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e 0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2 ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001 Call Trace: [<ffffffff813b542e>] dump_stack+0x63/0x85 [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm] [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm] [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm] [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71 ---[ end trace b1a0391266848f50 ]--- Testcase (beautified/reduced from syzkaller output): #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/kvm.h> long r[31]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0); struct kvm_vcpu_events ve = { .exception.injected = 1, .exception.nr = 0xd4 }; r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve); r[30] = ioctl(r[7], KVM_RUN, 0); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUIDPaolo Bonzini
This causes an ugly dmesg splat. Beautified syzkaller testcase: #include <unistd.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <fcntl.h> #include <linux/kvm.h> long r[8]; int main() { struct kvm_cpuid2 c = { 0 }; r[2] = open("/dev/kvm", O_RDWR); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8); r[7] = ioctl(r[4], KVM_SET_CPUID, &c); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDRPaolo Bonzini
Found by syzkaller: WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]() CPU: 3 PID: 15175 Comm: a.out Tainted: G W 4.4.6-300.fc23.x86_64 #1 Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e 0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2 00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000 Call Trace: [<ffffffff813b542e>] dump_stack+0x63/0x85 [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm] [<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm] [<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel] [<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm] [<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm] [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71 Testcase: #include <unistd.h> #include <sys/ioctl.h> #include <fcntl.h> #include <string.h> #include <linux/kvm.h> long r[8]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC); r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul); r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul); r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: Handle MSR_IA32_PERF_CTLDmitry Bilunov
Intel CPUs having Turbo Boost feature implement an MSR to provide a control interface via rdmsr/wrmsr instructions. One could detect the presence of this feature by issuing one of these instructions and handling the #GP exception which is generated in case the referenced MSR is not implemented by the CPU. KVM's vCPU model behaves exactly as a real CPU in this case by injecting a fault when MSR_IA32_PERF_CTL is called (which KVM does not support). However, some operating systems use this register during an early boot stage in which their kernel is not capable of handling #GP correctly, causing #DP and finally a triple fault effectively resetting the vCPU. This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the crashes. Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02KVM: x86: avoid write-tearing of TDPNadav Amit
In theory, nothing prevents the compiler from write-tearing PTEs, or split PTE writes. These partially-modified PTEs can be fetched by other cores and cause mayhem. I have not really encountered such case in real-life, but it does seem possible. For example, the compiler may try to do something creative for kvm_set_pte_rmapp() and perform multiple writes to the PTE. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02ARM: fix PTRACE_SETVFPREGS on SMP systemsRussell King
PTRACE_SETVFPREGS fails to properly mark the VFP register set to be reloaded, because it undoes one of the effects of vfp_flush_hwstate(). Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to an invalid CPU number, but vfp_set() overwrites this with the original CPU number, thereby rendering the hardware state as apparently "valid", even though the software state is more recent. Fix this by reverting the previous change. Cc: <stable@vger.kernel.org> Fixes: 8130b9d7b9d8 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers") Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Simon Marchi <simon.marchi@ericsson.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2016-06-02Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with ↵Vineet Gupta
exponential backoff" This reverts commit e78fdfef84be13a5c2b8276e12203cdf24778596. The issue was fixed in hardware in HS2.1C release and there are no known external users of affected RTL so revert the whole delayed retry series ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-06-02Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new ↵Vineet Gupta
spin-wait cycle" This reverts commit b89aa12c177477e34caa722818536fb5d0bffd76. The issue was fixed in hardware in HS2.1C release and there are no known external users of affected RTL so revert the whole delayed retry series ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>