path: root/arch
2016-06-17  Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux  (Linus Torvalds)

Pull arm64 fixes from Will Deacon:
 "The main things are getting kgdb up and running with upstream GDB after a protocol change was reverted, and fixing our spin_unlock_wait and spin_is_locked implementations after doing some similar work with PeterZ on the qspinlock code last week. Whilst we haven't seen any failures in practice, it's still worth getting this fixed.

  Summary:
   - Plug the ongoing spin_unlock_wait/spin_is_locked mess
   - KGDB protocol fix to sync w/ GDB
   - Fix MIDR-based PMU probing for old 32-bit SMP systems (OMAP4/Realview)
   - Minor tweaks to the fault handling path"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kgdb: Match pstate size with gdbserver protocol
  arm64: spinlock: Ensure forward-progress in spin_unlock_wait
  arm64: spinlock: fix spin_unlock_wait for LSE atomics
  arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks
  arm: pmu: Fix non-devicetree probing
  arm64: mm: mark fault_info table const
  arm64: fix dump_instr when PAN and UAO are in use
2016-06-17  ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218  (Dave Gerlach)

Based on the latest timing specifications for the TPS65218 from the data sheet, http://www.ti.com/lit/ds/symlink/tps65218.pdf, document SLDS206 from November 2014, we must change the i2c bus speed to better fit within the minimum high SCL time required for proper i2c transfer.

When running at 400kHz, measurements show that SCL spends 0.8125 uS/1.666 uS high/low, which violates the minimum high period of SCL given in datasheet Table 7.6, which is 1 uS. Switching to 100kHz gives us 5 uS/5 uS high/low, both of which fall above the minimum values for 100kHz, 4.0 uS/4.7 uS high/low.

Without this patch, a voltage set operation from the kernel will occasionally appear to have worked, but the actual voltage reflected on the PMIC will not have updated, causing problems especially with cpufreq, which may move to a higher OPP without actually raising the voltage on DCDC2, leading to a hang.

Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Aparna Balasubramanian <aparnab@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-06-17  powerpc/eeh: Fix invalid cached PE primary bus  (Gavin Shan)

The PE primary bus cannot be obtained from its child devices when full hotplug is used in error recovery, so it is cached, as done in commit 05ba75f84864 ("powerpc/eeh: Fix stale cached primary bus").

In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the PE primary bus in pnv_eeh_reset(), and that eventually crashes the kernel.

This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) just before the PCI hot add instead. With that, the PowerNV EEH reset backend (pnv_eeh_reset()) can get a valid PE primary bus through eeh_pe_bus_get().

Fixes: 67086e32b564 ("powerpc/eeh: Support error recovery for VF PE") Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
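A minimal sketch of the reordering described above (EEH helper names as used by the recovery code; the surrounding retry and error handling are elided):

    /* Sketch only: keep EEH_PE_PRI_BUS set across the reset so the
     * PowerNV backend can still resolve the PE primary bus. */
    if (bus)
        pci_hp_remove_devices(bus);

    rc = eeh_reset_pe(pe);           /* backend may call eeh_pe_bus_get() */

    if (bus) {
        eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);  /* only now, before re-add */
        pci_hp_add_devices(bus);
    }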
2016-06-17  powerpc/mm/radix: Update Radix tree size as per ISA 3.0  (Aneesh Kumar K.V)

ISA 3.0 updated the encoding to Radix tree size = 2^(RTS + 31); we had it encoded as 2^(RTS + 28). Add a helper with the correct encoding and use it instead of open-coding it.

Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines") Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
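A hedged sketch of the kind of helper the message describes, assuming a 52-bit effective address space (RTS = 52 - 31 = 21 = 0b10101) and the ISA's split placement of the RTS field (bit positions here are illustrative):

    /* Sketch only: encode Radix tree size = 2^(RTS + 31). */
    static unsigned long radix__get_tree_size(void)
    {
        unsigned long rts_field;

        /* RTS = 0b10101: the low 3 bits and high 2 bits live in
         * separate fields of the entry, per ISA 3.0. */
        rts_field = 0x5UL << 5;
        rts_field |= 0x2UL << 61;
        return rts_field;
    }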
2016-06-17  powerpc/mm/hash: Don't add memory coherence if cache inhibited is set  (Aneesh Kumar K.V)

H_ENTER hcall handling in qemu had assumptions that a cache-inhibited hpte entry won't have memory coherence set. Also, older kernels mentioned that some versions of pHyp required this; the code removed by the commit below said:

    /* Make pHyp happy */
    if ((rflags & _PAGE_NO_CACHE) && !(rflags & _PAGE_WRITETHRU))
        hpte_r &= ~HPTE_R_M;

But with older kernels we had some inconsistent memory coherence mapping: we always enabled memory coherence in the page fault path, yet removed memory coherence if _PAGE_NO_CACHE was set when we mapped the page via htab_bolt_mapping. The commit mentioned below tried to consolidate that by always enabling memory coherence, but as mentioned above that breaks Qemu's H_ENTER handling.

This patch updates the logic such that we enable memory coherence only if cache inhibited is not set, and brings fault handling, lpar and bolt mapping in sync.

Fixes: 30bda41aba4e ("powerpc/mm: Drop WIMG in favour of new constant") Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
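The rule being restored, as a hedged sketch of the HPTE flag conversion (flag names from the hash MMU code; everything around it elided):

    /* Sketch only: M (coherence) is set only when I (cache inhibited)
     * is not, so the fault, lpar and bolted paths all agree. */
    if (rflags & HPTE_R_I)
        rflags &= ~HPTE_R_M;
    else
        rflags |= HPTE_R_M;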
2016-06-17  ARM: OMAP2+: timer: add probe for clocksources  (Tero Kristo)

A few platforms are currently missing clocksource_probe() completely in their time_init functionality. On OMAP3430 for example, this leaves cpuidle pretty much dead, as the counter32k is never registered and a gptimer is used as the clocksource instead. The gptimer ticks in periodic mode, preventing any deeper idle states.

While here, also drop one unnecessary check for a populated DT before the existing clocksource_probe() call.

Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
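A sketch of the shape of the change, assuming a per-SoC timer init function in mach-omap2/timer.c (the exact set of functions touched varies by platform):

    /* Sketch only: probe registered clocksources (e.g. counter32k)
     * during time_init, not just the clockevent. */
    void __init omap3_sync32k_timer_init(void)
    {
        /* ... clock and clockevent setup as before ... */
        clocksource_probe();  /* the call some init paths were missing */
    }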
2016-06-17  ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ  (Janusz Krzysztofik)

After OMAP1 IRQ definitions were changed by commit 685e2d08c54b ("ARM: OMAP1: Change interrupt numbering for sparse IRQ"), introduced in v4.2, the ams-delta FIQ handler, which depends on them, no longer works as expected. Fix it.

Created and tested on Amstrad Delta against Linux-4.7-rc3.

Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-06-16  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  (Linus Torvalds)

Pull KVM fixes from Paolo Bonzini:
 - miscellaneous fixes for MIPS and s390
 - one new kvm_stat for s390
 - correctly disable VT-d posted interrupts with the rest of posted interrupts
 - "make randconfig" fix for x86 AMD
 - off-by-one in irq route check (the "good" kind that errors out a bit too early!)

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: vmx: check apicv is active before using VT-d posted interrupt
  kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES
  kvm: svm: Do not support AVIC if not CONFIG_X86_LOCAL_APIC
  kvm: svm: Fix implicit declaration for __default_cpu_present_to_apicid()
  MIPS: KVM: Fix CACHE triggered exception emulation
  MIPS: KVM: Don't unwind PC when emulating CACHE
  MIPS: KVM: Include bit 31 in segment matches
  MIPS: KVM: Fix modular KVM under QEMU
  KVM: s390: Add stats for PEI events
  KVM: s390: ignore IBC if zero
2016-06-16  arm64: kgdb: Match pstate size with gdbserver protocol  (Daniel Thompson)

Current versions of gdb do not interoperate cleanly with kgdb on arm64 systems because gdb and kgdb do not use the same register description. This patch modifies kgdb to work with recent releases of gdb (>= 7.8.1).

Compatibility with gdb (after the patch is applied) is as follows:

    gdb-7.6 and earlier   Ok
    gdb-7.7 series        Works if user provides custom target description
    gdb-7.8(.0)           Works if user provides custom target description
    gdb-7.8.1 and later   Ok

When commit 44679a4f142b ("arm64: KGDB: Add step debugging support") was introduced, it was paired with a gdb patch that made an incompatible change to the gdbserver protocol. This patch was eventually merged into the gdb sources: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=a4d9ba85ec5597a6a556afe26b712e878374b9dd

The change to the protocol was mostly made to simplify big-endian support inside the kernel gdb stub. Unfortunately the gdb project released gdb-7.7.x and gdb-7.8.0 before the protocol incompatibility was identified and reversed: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bdc144174bcb11e808b4e73089b850cf9620a7ee

This leaves us in a position where kgdb still uses the no-longer-used protocol; gdb-7.8.1, which restored the original behaviour, was released on 2014-10-29.

I don't believe it is possible to detect/correct the protocol incompatibility, which means the kernel must take a view about which version of the gdb remote protocol is "correct". This patch takes the view that the original/current version of the protocol is correct and that the version found in gdb-7.7.x and gdb-7.8.0 is anomalous.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
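A hedged sketch of the idea, not the exact arm64 diff: advertise pstate to the stub at the 4-byte width gdb expects, and copy only the advertised width out of the 64-bit pt_regs field:

    /* Sketch only */
    struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
        /* ... GPRs, sp, pc ... */
        { "pstate", 4, offsetof(struct pt_regs, pstate) },  /* was 8 */
    };

    char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
    {
        if (regno >= DBG_MAX_REG_NUM || regno < 0)
            return NULL;
        /* honour the per-register size: 4 bytes for pstate */
        memcpy(mem, (void *)regs + dbg_reg_def[regno].offset,
               dbg_reg_def[regno].size);
        return dbg_reg_def[regno].name;
    }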
2016-06-16  ARM: dts: armada-38x: fix MBUS_ID for crypto SRAM on Armada 385 Linksys  (Thomas Petazzoni)
When the support for the Marvell crypto engine was added in the Device Tree of the various Armada 385 Device Tree files in commit d716f2e837ac6 ("ARM: mvebu: define crypto SRAM ranges for all armada-38x boards"), a typo was made in the MBus window attributes for the Armada 385 Linksys board: 0x09/0x05 are used instead of 0x19/0x15. This commit fixes this typo, which makes the CESA engines operational on Armada 385 Linksys boards. Reported-by: Terry Stockert <stockert@inkblotadmirer.me> Cc: Terry Stockert <stockert@inkblotadmirer.me> Cc: Imre Kaloz <kaloz@openwrt.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: <stable@vger.kernel.org> Fixes: d716f2e837ac6 ("ARM: mvebu: define crypto SRAM ranges for all armada-38x boards") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2016-06-16  ARM: mvebu: map PCI I/O regions strongly ordered  (Thomas Petazzoni)

In order for HW I/O coherency to work on Cortex-A9 based Marvell SoCs, all MMIO registers must be mapped strongly ordered. In commit 1c8c3cf0b5239 ("ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory type") we implemented a new function, pci_ioremap_set_mem_type(), that allows sub-architecture code to override the memory type used to map PCI I/O regions.

In the discussion around this patch series [1], Arnd Bergmann made the comment that maybe all PCI I/O regions should be mapped strongly-ordered, which would have made our proposal to add pci_ioremap_set_mem_type() irrelevant. So, we submitted a patch [2] that did what Arnd suggested. However, Russell in the end merged our initial proposal to add pci_ioremap_set_mem_type(), but it was never used anywhere.

Further discussion with Arnd and other folks on IRC led to the conclusion that in fact using strongly-ordered for all platforms was maybe not desirable, and that using pci_ioremap_set_mem_type() was therefore the most appropriate solution. As a consequence, this commit finally adds the pci_ioremap_set_mem_type() call in the mach-mvebu platform code, which was originally part of our initial patch series [3] and is necessary for the whole mechanism to work.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/256565.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/256755.html
[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/256563.html

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
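A minimal sketch of the call this adds, assuming it is made from mach-mvebu init code once HW I/O coherency is known to be enabled (the exact call site shown is illustrative):

    /* Sketch only: force PCI I/O mappings to strongly-ordered memory. */
    if (coherency_available())
        pci_ioremap_set_mem_type(MT_UNCACHED);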
2016-06-16  ARM: mvebu: fix HW I/O coherency related deadlocks  (Thomas Petazzoni)
Until now, our understanding for HW I/O coherency to work on the Cortex-A9 based Marvell SoC was that only the PCIe regions should be mapped strongly-ordered. However, we were still encountering some deadlocks, especially when testing the CESA crypto engine. After checking with the HW designers, it was concluded that all the MMIO registers should be mapped as strongly ordered for the HW I/O coherency mechanism to work properly. This fixes some easy to reproduce deadlocks with the CESA crypto engine driver (dmcrypt on a sufficiently large disk partition). Tested-by: Terry Stockert <stockert@inkblotadmirer.me> Tested-by: Romain Perier <romain.perier@free-electrons.com> Cc: Terry Stockert <stockert@inkblotadmirer.me> Cc: Romain Perier <romain.perier@free-electrons.com> Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2016-06-16  s390/cpum_cf: use perf software context for hardware counters  (Hendrik Brueckner)

On s390, there are two different hardware PMUs for counting and sampling. Previously, both PMUs have shared the perf_hw_context which is not correct and, recently, results in this warning:

    ------------[ cut here ]------------
    WARNING: CPU: 5 PID: 1 at kernel/events/core.c:8485 perf_pmu_register+0x420/0x428
    Modules linked in:
    CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc1+ #2
    task: 00000009c5240000 ti: 00000009c5234000 task.ti: 00000009c5234000
    Krnl PSW : 0704c00180000000 0000000000220c50 (perf_pmu_register+0x420/0x428)
               R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
    Krnl GPRS: ffffffffffffffff 0000000000b15ac6 0000000000000000 00000009cb440000
               000000000022087a 0000000000000000 0000000000b78fa0 0000000000000000
               0000000000a9aa90 0000000000000084 0000000000000005 000000000088a97a
               0000000000000004 0000000000749dd0 000000000022087a 00000009c5237cc0
    Krnl Code: 0000000000220c44: a7f4ff54        brc 15,220aec
               0000000000220c48: 92011000        mvi 0(%r1),1
              #0000000000220c4c: a7f40001        brc 15,220c4e
              >0000000000220c50: a7f4ff12        brc 15,220a74
               0000000000220c54: 0707            bcr 0,%r7
               0000000000220c56: 0707            bcr 0,%r7
               0000000000220c58: ebdff0800024    stmg %r13,%r15,128(%r15)
               0000000000220c5e: a7f13fe0        tmll %r15,16352
    Call Trace:
    ([<000000000022087a>] perf_pmu_register+0x4a/0x428)
    ([<0000000000b2c25c>] init_cpum_sampling_pmu+0x14c/0x1f8)
    ([<0000000000100248>] do_one_initcall+0x48/0x140)
    ([<0000000000b25d26>] kernel_init_freeable+0x1e6/0x2a0)
    ([<000000000072bda4>] kernel_init+0x24/0x138)
    ([<000000000073495e>] kernel_thread_starter+0x6/0xc)
    ([<0000000000734958>] kernel_thread_starter+0x0/0xc)
    Last Breaking-Event-Address:
    [<0000000000220c4c>] perf_pmu_register+0x41c/0x428
    ---[ end trace 0c6ef9f5b771ad97 ]---

Using the perf_sw_context is an option because the cpum_cf PMU does not use interrupts. To make this more clear, initialize the capabilities in the PMU structure.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
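The shape of the fix, sketched against the counter PMU definition (only the fields in question are shown):

    /* Sketch only: move cpum_cf off the shared hardware context and
     * state explicitly that it raises no interrupts. */
    static struct pmu cpumf_pmu = {
        .task_ctx_nr  = perf_sw_context,
        .capabilities = PERF_PMU_CAP_NO_INTERRUPT,
        /* ... remaining callbacks unchanged ... */
    };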
2016-06-16  ARM: sunxi/dt: make the CHIP inherit from allwinner,sun5i-a13  (Boris Brezillon)
The sun4i-timer driver registers its sched_clock only if the machine is compatible with "allwinner,sun5i-a13", "allwinner,sun5i-a10s" or "allwinner,sun4i-a10". Add the missing "allwinner,sun5i-a13" string to the machine compatible. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Fixes: 465a225fb2af ("ARM: sun5i: Add C.H.I.P DTS") Cc: <stable@vger.kernel.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2016-06-16  kvm: vmx: check apicv is active before using VT-d posted interrupt  (Yang Zhang)

VT-d posted interrupt relies on the CPU side's posted interrupt, so we need to check whether the VCPU's APICv is active before enabling VT-d posted interrupt.

Fixes: d62caabb41f33d96333f9ef15e09cd26e1c12760 Cc: stable@vger.kernel.org Signed-off-by: Yang Zhang <yang.zhang.wz@gmail.com> Signed-off-by: Shengge Ding <shengge.dsg@alibaba-inc.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
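A hedged sketch of the guard, in the style of the VT-d posted-interrupt update paths in vmx.c:

    /* Sketch only: skip VT-d posted-interrupt setup when APICv is
     * not active for this VM. */
    if (!kvm_arch_has_assigned_device(kvm) ||
        !irq_remapping_cap(IRQ_POSTING_CAP) ||
        !kvm_vcpu_apicv_active(kvm->vcpus[0]))
        return 0;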
2016-06-16  kvm: svm: Do not support AVIC if not CONFIG_X86_LOCAL_APIC  (Suravee Suthikulpanit)
Add logic to disable AVIC #ifndef CONFIG_X86_LOCAL_APIC. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
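A sketch of the gate, assuming the avic/npt_enabled module-parameter style used by svm.c:

    /* Sketch only: AVIC needs the local APIC code built in. */
    if (avic) {
        if (!npt_enabled ||
            !boot_cpu_has(X86_FEATURE_AVIC) ||
            !IS_ENABLED(CONFIG_X86_LOCAL_APIC))
            avic = false;
        else
            pr_info("AVIC enabled\n");
    }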
2016-06-16  kvm: svm: Fix implicit declaration for __default_cpu_present_to_apicid()  (Suravee Suthikulpanit)
The commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC") introduces a build error due to implicit function declaration when #ifdef CONFIG_X86_32 and #ifndef CONFIG_X86_LOCAL_APIC (as reported by Kbuild test robot i386-randconfig-x0-06121009). So, this patch introduces kvm_cpu_get_apicid() wrapper around __default_cpu_present_to_apicid() with additional handling if CONFIG_X86_LOCAL_APIC is not defined. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: commit 8221c1370056 ("svm: Manage vcpu load/unload when enable AVIC") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
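A sketch of the wrapper the message describes (the fallback return value is an assumption):

    /* Sketch only: hide __default_cpu_present_to_apicid() behind a
     * wrapper that also builds without CONFIG_X86_LOCAL_APIC. */
    static inline int kvm_cpu_get_apicid(int mps_cpu)
    {
    #ifdef CONFIG_X86_LOCAL_APIC
        return __default_cpu_present_to_apicid(mps_cpu);
    #else
        WARN_ON_ONCE(1);   /* assumed fallback, not from the source */
        return BAD_APICID;
    #endif
    }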
2016-06-15  arm64: spinlock: Ensure forward-progress in spin_unlock_wait  (Will Deacon)

Rather than wait until we observe the lock being free (which might never happen), we can also return from spin_unlock_wait if we observe that the lock is now held by somebody else, which implies that it was unlocked but we just missed seeing it in that state.

Furthermore, in such a scenario there is no longer a need to write back the value that we loaded, since we know that there has been a lock hand-off, which is sufficient to publish any stores prior to the unlock_wait, because the ARM architecture ensures that a Store-Release instruction is multi-copy atomic when observed by a Load-Acquire instruction.

The litmus test is something like:

    AArch64
    {
    0:X1=x; 0:X3=y;
    1:X1=y;
    2:X1=y; 2:X3=x;
    }
     P0          | P1           | P2           ;
     MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
     STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
     DMB SY      |              |              ;
     LDR W2,[X3] |              |              ;
    exists
    (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)

where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is doing spin_lock.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-15  arm64: spinlock: fix spin_unlock_wait for LSE atomics  (Will Deacon)
Commit d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under the premise that the LSE atomics (in particular, the LDADDA instruction) are indivisible. Unfortunately, these instructions are only indivisible when used with the -AL (full ordering) suffix and, consequently, the same issue can theoretically be observed with LSE atomics, where a later (in program order) load can be speculated before the write portion of the atomic operation. This patch fixes the issue by performing a CAS of the lock once we've established that it's unlocked, in much the same way as the LL/SC code. Fixes: d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers") Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-15  arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks  (Will Deacon)

spin_is_locked has grown two very different use-cases:

(1) [The sane case] API functions may require a certain lock to be held by the caller and can therefore use spin_is_locked as part of an assert statement in order to verify that the lock is indeed held. For example, usage of assert_spin_locked.

(2) [The insane case] There are two locks, where a CPU takes one of the locks and then checks whether or not the other one is held before accessing some shared state. For example, the "optimized locking" in ipc/sem.c.

In the latter case, the sequence looks like:

    spin_lock(&sem->lock);
    if (!spin_is_locked(&sma->sem_perm.lock))
        /* Access shared state */

and requires that the spin_is_locked check is ordered after taking the sem->lock. Unfortunately, since our spinlocks are implemented using a LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated before the STXR and consequently return a stale value.

Whilst this hasn't been seen to cause issues in practice, PowerPC fixed the same issue in 51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()") and, although we did something similar for spin_unlock_wait in d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers"), that doesn't actually take care of ordering against local acquisition of a different lock.

This patch adds an smp_mb() to the start of our arch_spin_is_locked and arch_spin_unlock_wait routines to ensure that the lock value is always loaded after any other locks have been taken by the current CPU.

Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
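A sketch of the resulting arch_spin_is_locked (arch_spin_unlock_wait gains the same leading barrier):

    /* Sketch only: order the lock-value load after any locks already
     * taken by this CPU, closing the LDAXR/STXR speculation window. */
    static inline int arch_spin_is_locked(arch_spinlock_t *lock)
    {
        smp_mb();  /* make prior lock acquisitions visible first */
        return !arch_spin_value_unlocked(READ_ONCE(*lock));
    }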
2016-06-14  arm: Use _rcuidle for smp_cross_call() tracepoints  (Paul E. McKenney)

Further testing with false negatives suppressed by commit 293e2421fe25 ("rcu: Remove superfluous versions of rcu_read_lock_sched_held()") identified another unprotected use of RCU from the idle loop. Because RCU actively ignores idle-loop code (for energy-efficiency reasons, among other things), using RCU from the idle loop can result in too-short grace periods, in turn resulting in arbitrary misbehavior.

The resulting lockdep-RCU splat is as follows:

    ===============================
    [ INFO: suspicious RCU usage. ]
    4.6.0-rc5-next-20160426+ #1112 Not tainted
    -------------------------------
    include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    RCU used illegally from idle CPU!
    rcu_scheduler_active = 1, debug_locks = 0
    RCU used illegally from extended quiescent state!
    no locks held by swapper/0/0.

    stack backtrace:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
    Hardware name: Generic OMAP4 (Flattened Device Tree)
    [<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
    [<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
    [<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
    [<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
    [<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0x38/0x9c)
    [<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
    [<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
    [<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
    [<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
    [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)

Reported-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <linux-omap@vger.kernel.org> Cc: <linux-arm-kernel@lists.infradead.org>
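The change itself is small; a sketch of smp_cross_call() using the RCU-idle-safe tracepoint variant:

    /* Sketch only: the _rcuidle variant is safe from the idle loop. */
    void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
    {
        trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
        __smp_cross_call(target, ipinr);
    }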
2016-06-14  arm64: mm: mark fault_info table const  (Mark Rutland)
Unlike the debug_fault_info table, we never intentionally alter the fault_info table at runtime, and all derived pointers are treated as const currently. Make the table const so that it can be placed in .rodata and protected from unintentional writes, as we do for the syscall tables. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-06-14  arm64: fix dump_instr when PAN and UAO are in use  (Mark Rutland)
If the kernel is set to show unhandled signals, and a user task does not handle a SIGILL as a result of an instruction abort, we will attempt to log the offending instruction with dump_instr before killing the task. We use dump_instr to log the encoding of the offending userspace instruction. However, dump_instr is also used to dump instructions from kernel space, and internally always switches to KERNEL_DS before dumping the instruction with get_user. When both PAN and UAO are in use, reading a user instruction via get_user while in KERNEL_DS will result in a permission fault, which leads to an Oops. As we have regs corresponding to the context of the original instruction abort, we can inspect this and only flip to KERNEL_DS if the original abort was taken from the kernel, avoiding this issue. At the same time, remove the redundant (and incorrect) comments regarding the order dump_mem and dump_instr are called in. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: <stable@vger.kernel.org> #4.6+ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Fixes: 57f4959bad0a154a ("arm64: kernel: Add support for User Access Override") Signed-off-by: Will Deacon <will.deacon@arm.com>
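A sketch of the fix, assuming the dump itself is factored into a __dump_instr() helper:

    /* Sketch only: flip to KERNEL_DS only when the original abort was
     * taken from the kernel, so PAN/UAO cannot fault the get_user(). */
    static void dump_instr(const char *lvl, struct pt_regs *regs)
    {
        if (!user_mode(regs)) {
            mm_segment_t fs = get_fs();

            set_fs(KERNEL_DS);
            __dump_instr(lvl, regs);
            set_fs(fs);
        } else {
            __dump_instr(lvl, regs);
        }
    }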
2016-06-14  kprobes/x86: Clear TF bit in fault on single-stepping  (Masami Hiramatsu)

Fix kprobe_fault_handler() to clear the TF (trap flag) bit of the flags register in the case of a fault fixup on single-stepping.

If we put a kprobe on an instruction which caused a page fault (e.g. actual mov instructions in copy_user_*), that fault happens on the single-stepping buffer. In this case, kprobes resets the running instance so that the CPU can retry execution at the original ip address. However, the current code forgets to reset the TF bit. Since this fault happens with the TF bit set for enabling single-stepping, when it retries, it causes a debug exception and kprobes cannot handle it because it has already reset itself.

On most x86-64 platforms, it can be easily reproduced by using the kprobe tracer. E.g.

    # cd /sys/kernel/debug/tracing
    # echo p copy_user_enhanced_fast_string+5 > kprobe_events
    # echo 1 > events/kprobes/enable

And you'll see a kernel panic on do_debug(), since the debug trap is not handled by kprobes.

To fix this problem, we just need to clear the TF bit when resetting the running kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: systemtap@sourceware.org Cc: stable@vger.kernel.org # All the way back to ancient kernels Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox [ Updated the comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
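A hedged sketch of the single-step fixup in kprobe_fault_handler(), where cur and kcb stand for the running kprobe and its control block:

    /* Sketch only: rewind to the probed instruction and drop TF so the
     * retried access does not raise an unhandled debug exception. */
    if (kcb->kprobe_status == KPROBE_HIT_SS ||
        kcb->kprobe_status == KPROBE_REENTER) {
        regs->ip = (unsigned long)cur->addr;
        regs->flags &= ~X86_EFLAGS_TF;        /* the missing reset */
        regs->flags |= kcb->kprobe_old_flags;
        /* ... reset the running kprobe instance ... */
    }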
2016-06-14  Merge branch 'kvm-mips-fixes' into HEAD  (Paolo Bonzini)
Merge MIPS patches destined to both 4.7 and kvm/next, to avoid unnecessary conflicts. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14  MIPS: KVM: Fix CACHE triggered exception emulation  (James Hogan)

When emulating TLB miss / invalid exceptions during CACHE instruction emulation, be sure to set up the correct PC and host_cp0_badvaddr state for the kvm_mips_emulate_tlb*_ld() functions to pick up for guest EPC and BadVAddr.

PC needs to be rewound, otherwise the guest EPC will end up pointing at the next instruction after the faulting CACHE instruction. host_cp0_badvaddr must be set because guest CACHE instructions trap with a Coprocessor Unusable exception, which doesn't update the host BadVAddr as a TLB exception would.

This doesn't tend to get hit when dynamic translation of emulated instructions is enabled, since only the first execution of each CACHE instruction actually goes through this code path, with subsequent executions hitting the SYNCI instruction that it gets replaced with.

Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14  MIPS: KVM: Don't unwind PC when emulating CACHE  (James Hogan)
When a CACHE instruction is emulated by kvm_mips_emulate_cache(), the PC is first updated to point to the next instruction, and afterwards it falls through the "dont_update_pc" label, which rewinds the PC back to its original address. This works when dynamic translation of emulated instructions is enabled, since the CACHE instruction is replaced with a SYNCI which works without trapping, however when dynamic translation is disabled the guest hangs on CACHE instructions as they always trap and are never stepped over. Roughly swap the meanings of the "done" and "dont_update_pc" to match kvm_mips_emulate_CP0(), so that "done" will roll back the PC on failure, and "dont_update_pc" won't change PC at all (for the sake of exceptions that have already modified the PC). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14  MIPS: KVM: Include bit 31 in segment matches  (James Hogan)
When faulting guest addresses are matched against guest segments with the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to include bit 31. This is mainly for safety's sake, as it prevents a rogue BadVAddr in the host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from matching the guest kseg0 segment (e.g. 0x4*******), triggering an internal KVM error instead of allowing the corresponding guest kseg0 page to be mapped into the host vmalloc space. Such a rogue BadVAddr was observed to happen with the host MIPS kernel running under QEMU with KVM built as a module, due to a not entirely transparent optimisation in the QEMU TLB handling. This has already been worked around properly in a previous commit. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
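The fix amounts to widening the segment mask by one bit; a sketch with the constant from the message:

    /* Sketch only: with bit 31 included, host kseg2/kseg3 addresses
     * (0xC...) can no longer alias guest kseg0 (0x4...). */
    #define KVM_GUEST_KSEGX(gva)    ((gva) & 0xe0000000)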
2016-06-14  MIPS: KVM: Fix modular KVM under QEMU  (James Hogan)
Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never get a TLB refill exception in it when KVM is built as a module. This was observed to happen with the host MIPS kernel running under QEMU, due to a not entirely transparent optimisation in the QEMU TLB handling where TLB entries replaced with TLBWR are copied to a separate part of the TLB array. Code in those pages continue to be executable, but those mappings persist only until the next ASID switch, even if they are marked global. An ASID switch happens in __kvm_mips_vcpu_run() at exception level after switching to the guest exception base. Subsequent TLB mapped kernel instructions just prior to switching to the guest trigger a TLB refill exception, which enters the guest exception handlers without updating EPC. This appears as a guest triggered TLB refill on a host kernel mapped (host KSeg2) address, which is not handled correctly as user (guest) mode accesses to kernel (host) segments always generate address error exceptions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.10.x- Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14  powerpc/mm/hash: Use the correct PPP mask when updating HPTE  (Aneesh Kumar K.V)

With commit e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") we now use all three PPP bits. The top bit is used to form a PPP value of 0b110, which is mapped to kernel read only. When updating the hpte entry, use the right mask such that we update the 63rd bit (the top 'P' bit) too.

Prior to e58e87adc8bf we didn't support KERNEL_RO at all (it was == KERNEL_RW), so this isn't a regression as such.

Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
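A hedged sketch of the updated masking, with HPTE_R_PPP standing for the full three-bit PP mask including bit 63:

    /* Sketch only: clear all three PP bits, not just the low two,
     * before OR-ing in the new protection value. */
    hptep->r = cpu_to_be64((be64_to_cpu(hptep->r) &
                            ~(HPTE_R_PPP | HPTE_R_N)) | newpp);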
2016-06-13  Merge tag 'samsung-fixes-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into fixes  (Olof Johansson)

Fixes for Exynos-based Snow and Peach Pit boards for regressions introduced in 4.7-rc1, because OF graph logic expects specific names of child nodes.

* tag 'samsung-fixes-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: dts: exynos: Fix port nodes names for Exynos5420 Peach Pit board
  ARM: dts: exynos: Fix port nodes names for Exynos5250 Snow board

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-13  Merge tag 'socfpga_fix_for_v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into fixes  (Olof Johansson)

SoCFPGA fix for v4.7:
 - Add missing PHY phandle for SoCFPGA VINING board

* tag 'socfpga_fix_for_v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  ARM: dts: socfpga: Add missing PHY phandle

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-06-13  KVM: s390/mm: Fix CMMA reset during reboot  (Christian Borntraeger)

commit 1e133ab296f ("s390/mm: split arch/s390/mm/pgtable.c") factored the page table handling code out of __gmap_zap and __s390_reset_cmma into ptep_zap_unused, with a simple flag that selects which of the two behaviours (reset or not) is wanted. This also changed the behaviour: it now also zaps unused page table entries on reset.

Turns out that this is wrong, as s390_reset_cmma uses the page walker, which does NOT take the ptl lock. The simplest fix is to not do the zapping part on reset (which uses the walker).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: 1e133ab296f ("s390/mm: split arch/s390/mm/pgtable.c") Cc: stable@vger.kernel.org # 4.6+ Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-13  ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_mem_ret  (Nishanth Menon)
As per the latest revision F of public TRM for DRA7/AM57xx SoCs SPRUHZ6F[1] (April 2016), with the exception of MPU power domain, all other power domains do not have memories capable of retention since they all operate in either "ON" or "OFF" mode. For these power states, the retention state for memories are basically ignored by PRCM and does not require to be programmed. [1] http://www.ti.com/lit/pdf/spruhz6 Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-06-13  ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_logic_ret  (Nishanth Menon)
As per the latest revision F of public TRM for DRA7/AM57xx SoCs SPRUHZ6F[1] (April 2016), with the exception of MPU power domain (and CPUx sub power domains), all other power domains can either operate in "ON" mode OR in some cases, "OFF" mode. For these power states, the logic retention state is basically ignored by PRCM and does not require to be programmed. [1] http://www.ti.com/lit/pdf/spruhz6 Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-06-13  ARM: OMAP: DRA7: powerdomain data: Set L3init and L4per to ON  (Nishanth Menon)

As per the latest revision F of the public TRM for DRA7/AM57xx SoCs, SPRUHZ6F [1] (April 2016), the L4Per and L3init power domains now operate in always "ON" mode due to asymmetric aging limitations. Update the data accordingly.

[1] http://www.ti.com/lit/pdf/spruhz6

Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2016-06-10  Merge tag 'powerpc-4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds)

Pull powerpc fixes from Michael Ellerman:
 - ptrace: Fix out of bounds array access warning from Khem Raj
 - pseries: Fix PCI config address for DDW from Gavin Shan
 - pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added from Michael Ellerman
 - of: fix autoloading due to broken modalias with no 'compatible' from Wolfram Sang
 - radix: Fix always false comparison against MMU_NO_CONTEXT from Aneesh Kumar K.V
 - hash: Compute the segment size correctly for ISA 3.0 from Aneesh Kumar K.V
 - nohash: Fix build break with 64K pages from Michael Ellerman

* tag 'powerpc-4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/nohash: Fix build break with 64K pages
  powerpc/mm/hash: Compute the segment size correctly for ISA 3.0
  powerpc/mm/radix: Fix always false comparison against MMU_NO_CONTEXT
  of: fix autoloading due to broken modalias with no 'compatible'
  powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added
  powerpc/pseries: Fix PCI config address for DDW
  powerpc/ptrace: Fix out of bounds array access warning
2016-06-10  Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux  (Linus Torvalds)

Pull arm64 fix from Will Deacon:
 "A fix for an issue that Alex saw whilst swapping with hardware access/dirty bit support enabled in the kernel: fix a failure to fault in old pages on a write when CONFIG_ARM64_HW_AFDBM is enabled"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: always take dirty state from new pte in ptep_set_access_flags
2016-06-10  Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull x86 fixes from Ingo Molnar:
 "Misc fixes from all around the map, plus a commit that introduces a new header of Intel model name symbols (unused) that will make the next merge window easier"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
  x86/entry/traps: Don't force in_interrupt() to return true in IST handlers
  x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer models
  x86/cpu/intel: Introduce macros for Intel family numbers
  x86, build: copy ldlinux.c32 to image.iso
  x86/msr: Use the proper trace point conditional for writes
2016-06-10  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull perf fixes from Ingo Molnar:
 "A handful of tooling fixes, two PMU driver fixes and a cleanup of redundant code that addresses a security analyzer false positive"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Remove a redundant check
  perf/x86/intel/uncore: Remove SBOX support for Broadwell server
  perf ctf: Convert invalid chars in a string before set value
  perf record: Fix crash when kptr is restricted
  perf symbols: Check kptr_restrict for root
  perf/x86/intel/rapl: Fix pmus free during cleanup
2016-06-10  x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()  (Rui Wang)

On a 4-socket Brickland system, hot-removing one ioapic is fine. Hot-removing the 2nd one causes a panic in mp_unregister_ioapic() while calling release_resource(). This is because the iomem_res pointer has already been released when removing the first ioapic.

To explain the use of &res[num] here: res is assigned to ioapic_resources, and later in ioapic_insert_resources() we do:

    struct resource *r = ioapic_resources;

    for_each_ioapic(i) {
        insert_resource(&iomem_resource, r);
        r++;
    }

Here 'r' is treated as an array of 'struct resource', and the r++ ensures that each element of the array is inserted separately. Thus we should call release_resource() on each element at &res[num].

Fix it by assigning the correct pointers to ioapics[i].iomem_res in ioapic_setup_resources().

Signed-off-by: Rui Wang <rui.y.wang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: tony.luck@intel.com Cc: linux-pci@vger.kernel.org Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1465369193-4816-3-git-send-email-rui.y.wang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
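A sketch of the corrected assignment inside ioapic_setup_resources(), matching the array walk quoted above:

    /* Sketch only: each ioapic points at its own array element, so
     * release_resource() on hot-remove frees the right entry. */
    for_each_ioapic(i) {
        /* ... initialise res[num] ... */
        ioapics[i].iomem_res = &res[num];
        num++;
    }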
2016-06-10  x86/entry/traps: Don't force in_interrupt() to return true in IST handlers  (Andy Lutomirski)

Forcing in_interrupt() to return true if we're not in a bona fide interrupt confuses the softirq code. This fixes warnings like:

    NOHZ: local_softirq_pending 282

... which can happen when running things like selftests/x86.

This will change perf's static percpu buffer usage in IST context. I think this is okay, and it's changing the behavior to match historical (pre-4.0) behavior.

Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 959274753857 ("x86, traps: Track entry into and exit from IST context") Link: http://lkml.kernel.org/r/cdc215f94d118d691d73df35275022331156fb45.1464130360.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-10  KVM: s390: Add stats for PEI events  (Alexander Yarygin)
Add partial execution intercepted events in kvm_stats_debugfs. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10  KVM: s390: ignore IBC if zero  (David Hildenbrand)
Looks like we forgot about the special IBC value of 0 meaning "no IBC". Let's fix that, otherwise it gets rounded up and suddenly an IBC is active with the lowest possible machine. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Fixes: commit 053dd2308d81 ("KVM: s390: force ibc into valid range") Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10  powerpc/mm/radix: Flush page walk cache when freeing page table  (Aneesh Kumar K.V)

Even though a tlb_flush() does a flush with invalidate-all-cache, we can end up doing an RCU page table free before calling tlb_flush(). That means we can have page walk cache entries even after we free the page table pages, which can result in a wrong page table walk.

Avoid this by doing a PWC flush on every page table free. We can't batch the PWC flush, because the RCU callback function where we free the page table pages doesn't have the information from the mmu gather. Thus we have to do a PWC flush for every page table page freed.

Note: I also removed the dummy tlb_flush_pgtable call functions for hash 32.

Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-10  powerpc/mm/radix: Update to tlb functions ric argument  (Aneesh Kumar K.V)
Radix invalidate control (RIC) is used to control which cache to flush using tlb instructions. When doing a PID flush, we currently flush everything including page walk cache. For address range flush, we flush only the TLB. In the next patch, we add support for flushing only the page walk cache. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
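For reference, a sketch of the RIC encodings the argument carries (values as used by the radix tlbie/tlbiel sequences):

    /* Sketch only: RIC selects what the invalidation targets. */
    #define RIC_FLUSH_TLB 0   /* TLB entries only */
    #define RIC_FLUSH_PWC 1   /* page walk cache only */
    #define RIC_FLUSH_ALL 2   /* everything cached for the PID */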
2016-06-10  powerpc/nohash: Fix build break with 64K pages  (Michael Ellerman)
Commit 74701d5947a6 "powerpc/mm: Rename function to indicate we are allocating fragments" renamed page_table_free() to pte_fragment_free(). One occurrence was mistyped as pte_fragment_fre(). This only breaks the nohash 64K page build, which is not the default or enabled in any defconfig. Fixes: 74701d5947a6 ("powerpc/mm: Rename function to indicate we are allocating fragments") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-09  Merge tag 'arc-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc  (Linus Torvalds)

Pull ARC fixes from Vineet Gupta:
 - Revert of ll-sc backoff retry workaround in atomics/spinlocks as hardware is now proven to work just fine
 - Typo fixes (Thanks Andrea Gelmini)
 - Removal of obsolete DT property (Alexey)
 - Other minor fixes

* tag 'arc-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff"
  Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle"
  Revert "ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff"
  ARC: don't enable DISCONTIGMEM unconditionally
  ARC: [intc-compact] simplify code for 2 priority levels
  arc: Get rid of root core-frequency property
  Fix typos
2016-06-09  ARM: 8579/1: mm: Fix definition of pmd_mknotpresent  (Steve Capper)

Currently pmd_mknotpresent will use a zero entry to represent an invalidated pmd. Unfortunately this definition clashes with pmd_none, thus it is possible for a race condition to occur if zap_pmd_range sees pmd_none whilst __split_huge_pmd_locked is running too, with pmdp_invalidate just called.

This patch fixes the race condition by modifying pmd_mknotpresent to create non-zero faulting entries (as is done in other architectures), removing the ambiguity with pmd_none.

[catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT] Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: <stable@vger.kernel.org> # 3.11+ Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
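A sketch of the fixed definition for the LPAE (3-level) case:

    /* Sketch only: clear just the valid bit, leaving a non-zero,
     * faulting entry that pmd_none() cannot mistake for an empty pmd. */
    static inline pmd_t pmd_mknotpresent(pmd_t pmd)
    {
        return __pmd(pmd_val(pmd) & ~L_PMD_SECT_VALID);
    }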
2016-06-09  ARM: 8578/1: mm: ensure pmd_present only checks the valid bit  (Will Deacon)
In a subsequent patch, pmd_mknotpresent will clear the valid bit of the pmd entry, resulting in a not-present entry from the hardware's perspective. Unfortunately, pmd_present simply checks for a non-zero pmd value and will therefore continue to return true even after a pmd_mknotpresent operation. Since pmd_mknotpresent is only used for managing huge entries, this is only an issue for the 3-level case. This patch fixes the 3-level pmd_present implementation to take into account the valid bit. For bisectability, the change is made before the fix to pmd_mknotpresent. [catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch] Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: <stable@vger.kernel.org> # 3.11+ Cc: Russell King <linux@armlinux.org.uk> Cc: Steve Capper <Steve.Capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
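A sketch of the 3-level pmd_present fix described above:

    /* Sketch only: presence keys off the valid bit rather than
     * "non-zero", so invalidated huge entries read as not present. */
    #define pmd_present(pmd)    (pmd_isset((pmd), L_PMD_SECT_VALID))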