summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2018-03-24powerpc/powernv: Provide a way to force a core into SMT4 modePaul Mackerras
POWER9 processors up to and including "Nimbus" v2.2 have hardware bugs relating to transactional memory and thread reconfiguration. One of these bugs has a workaround which is to get the core into SMT4 state temporarily. This workaround is only needed when running bare-metal. This patch provides a function which gets the core into SMT4 mode by preventing threads from going to a stop state, and waking up those which are already in a stop state. Once at least 3 threads are not in a stop state, the core will be in SMT4 and we can continue. To do this, we add a "dont_stop" flag to the paca to tell the thread not to go into a stop state. If this flag is set, power9_idle_stop() just returns immediately with a return value of 0. The pnv_power9_force_smt4_catch() function does the following: 1. Set the dont_stop flag for each thread in the core, except ourselves (in fact we use an atomic_inc() in case more than one thread is calling this function concurrently). 2. See how many threads are awake, indicated by their requested_psscr field in the paca being 0. If this is at least 3, skip to step 5. 3. Send a doorbell interrupt to each thread that was seen as being in a stop state in step 2. 4. Until at least 3 threads are awake, scan the threads to which we sent a doorbell interrupt and check if they are awake now. This relies on the following properties: - Once dont_stop is non-zero, requested_psccr can't go from zero to non-zero, except transiently (and without the thread doing stop). - requested_psscr being zero guarantees that the thread isn't in a state-losing stop state where thread reconfiguration could occur. - Doing stop with a PSSCR value of 0 won't be a state-losing stop and thus won't allow thread reconfiguration. - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core must be in SMT4 mode, since SMT modes are powers of 2. This does add a sync to power9_idle_stop(), which is necessary to provide the correct ordering between setting requested_psscr and checking dont_stop. The overhead of the sync should be unnoticeable compared to the latency of going into and out of a stop state. Because some objected to incurring this extra latency on systems where the XER[SO] bug is not relevant, I have put the test in power9_idle_stop inside a feature section. This means that pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will probably hang the system. In order to cater for uses where the caller has an operation that has to be done while the core is in SMT4, the core continues to be kept in SMT4 after pnv_power9_force_smt4_catch() function returns, until the pnv_power9_force_smt4_release() function is called. It undoes the effect of step 1 above and allows the other threads to go into a stop state. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-24powerpc: Add CPU feature bits for TM bug workarounds on POWER9 v2.2Paul Mackerras
This adds a CPU feature bit which is set for POWER9 "Nimbus" DD2.2 processors which will be used to enable the hypervisor to assist hardware with the handling of checkpointed register values while the CPU is in suspend state, in order to work around hardware bugs. The hardware assistance for these workarounds introduced a new hardware bug relating to the XER[SO] bit. We add a separate feature bit for this bug in case future chips fix it while still requiring the hypervisor assistance with suspend state. When the dt_cpu_ftrs subsystem is in use, the software assistance can be enabled using a "tm-suspend-hypervisor-assist" node in the device tree, and a "tm-suspend-xer-so-bug" node enables the workarounds for the XER[SO] bug. In the absence of such nodes, a quirk enables both for POWER9 "Nimbus" DD2.2 processors. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-24powerpc: Free up CPU feature bits on 64-bit machinesPaul Mackerras
This moves all the CPU feature bits that are only used on 32-bit machines to the top 20 bits of the CPU feature word and arranges for them to be defined only in 32-bit builds. The features that are common to 32-bit and 64-bit machines are moved to bits 0-11 of the CPU feature word. This means that for 64-bit platforms, bits 44-63 can now be used for new features that only exist on 64-bit machines. (These bit numbers are counting from the right, i.e. the LSB is bit 0.) Because CPU_FTR_L3_DISABLE_NAP moved from the low 16 bits to the high 16 bits, we have to adjust some assembly code. Also, CPU_FTR_EMB_HV moved from the high 16 bits to the low 16 bits. Note that CPU_FTR_REAL_LE only applies to 64-bit chips, because only 64-bit chips (POWER6, 7, 8, 9) have a true little-endian mode that is a CPU execution mode as opposed to being a page attribute. With this we now have 20 free CPU feature bits on 64-bit machines. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-24powerpc: Book E: Remove unused CPU_FTR_L2CSR bitPaul Mackerras
The CPU_FTR_L2CSR bit is never tested anywhere, so let's reclaim the bit. The last usage was removed in 86d63363defc ("powerpc/e500mc: Remove dead L2 flushing code in idle_e500.S") (Jun 2015). Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-24powerpc: Use feature bit for RTC presence rather than timebase presencePaul Mackerras
All PowerPC CPUs other than the original PPC601 have a timebase register rather than the "real-time clock" (RTC) register that the PPC601 (and the original POWER and POWER2 CPUs) had. Currently we have a CPU feature bit to indicate the presence of the timebase, but it makes more sense to use a bit to indicate the unusual situation rather than the common situation. This therefore defines a CPU_FTR_USE_RTC bit in place of the CPU_FTR_USE_TB bit, and arranges for it to be set on PPC601 systems. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm: Fixup tlbie vs store ordering issue on POWER9Aneesh Kumar K.V
On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores from becoming visible after the invalidation. This works around it by doubling up those TLB invalidations which was verified by HW to be sufficient to close the risk window. This will be documented in a yet-to-be-published errata. Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm/radix: Move the functions that does the actual tlbie closerAneesh Kumar K.V
No functionality change. Just code movement to ease code changes later Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm/radix: Remove unused codeAneesh Kumar K.V
These function are not used in the code. Remove them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm: Workaround Nest MMU bug with TLB invalidationsBenjamin Herrenschmidt
On POWER9 the Nest MMU may fail to invalidate some translations when doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only and not the page walk cache. This works around it by forcing such invalidations to escalate to RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in use for the context. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> [balbirs: fixed spelling and coding style to quiesce checkpatch.pl] Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23powerpc/mm: Add tracking of the number of coprocessors using a contextBenjamin Herrenschmidt
Currently, when using coprocessors (which use the Nest MMU), we simply increment the active_cpu count to force all TLB invalidations to be come broadcast. Unfortunately, due to an errata in POWER9, we will need to know more specifically that coprocessors are in use. This maintains a separate copros counter in the MMU context for that purpose. NB. The commit mentioned in the fixes tag below is not at fault for the bug we're fixing in this commit and the next, but this fix applies on top the infrastructure it introduced. Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-23KVM: PPC: Book3S HV: Fix duplication of host SLB entriesPaul Mackerras
Since commit 6964e6a4e489 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded", 2018-01-11), we have been seeing occasional machine check interrupts on POWER8 systems when running KVM guests, due to SLB multihit errors. This turns out to be due to the guest exit code reloading the host SLB entries from the SLB shadow buffer when the SLB was not previously cleared in the guest entry path. This can happen because the path which skips from the guest entry code to the guest exit code without entering the guest now does the skip before the SLB is cleared and loaded with guest values, but the host values are loaded after the point in the guest exit path that we skip to. To fix this, we move the code that reloads the host SLB values up so that it occurs just before the point in the guest exit code (the label guest_bypass:) where we skip to from the guest entry path. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Fixes: 6964e6a4e489 ("KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded") Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, thp: do not cause memcg oom for thp mm/vmscan: wake up flushers for legacy cgroups too Revert "mm: page_alloc: skip over regions of invalid pfns where possible" mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() mm/thp: do not wait for lock_page() in deferred_split_scan() mm/khugepaged.c: convert VM_BUG_ON() to collapse fail x86/mm: implement free pmd/pte page interfaces mm/vmalloc: add interfaces to free unmapped page table h8300: remove extraneous __BIG_ENDIAN definition hugetlbfs: check for pgoff value overflow lockdep: fix fs_reclaim warning MAINTAINERS: update Mark Fasheh's e-mail mm/mempolicy.c: avoid use uninitialized preferred_node
2018-03-22Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "Two regression fixes, two bug fixes for older issues, two fixes for new functionality added this cycle that have userspace ABI concerns, and a small cleanup. These have appeared in a linux-next release and have a build success report from the 0day robot. * The 4.16 rework of altmap handling led to some configurations leaking page table allocations due to freeing from the altmap reservation rather than the page allocator. The impact without the fix is leaked memory and a WARN() message when tearing down libnvdimm namespaces. The rework also missed a place where error handling code needed to be removed that can lead to a crash if devm_memremap_pages() fails. * acpi_map_pxm_to_node() had a latent bug whereby it could misidentify the closest online node to a given proximity domain. * Block integrity handling was reworked several kernels back to allow calling add_disk() after setting up the integrity profile. The nd_btt and nd_blk drivers are just now catching up to fix automatic partition detection at driver load time. * The new peristence_domain attribute, a platform indicator of whether cpu caches are powerfail protected for example, is meant to be a single value enum and not a set of flags. This oversight was caught while reviewing new userspace code in libndctl to communicate the attribute. Fix this new enabling up so that we are not stuck with an unwanted userspace ABI" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm, nfit: fix persistence domain reporting libnvdimm, region: hide persistence_domain when unknown acpi, numa: fix pxm to online numa node associations x86, memremap: fix altmap accounting at free libnvdimm: remove redundant assignment to pointer 'dev' libnvdimm, {btt, blk}: do integrity setup before add_disk() kernel/memremap: Remove stale devres_free() call
2018-03-22x86/mm: implement free pmd/pte page interfacesToshi Kani
Implement pud_free_pmd_page() and pmd_free_pte_page() on x86, which clear a given pud/pmd entry and free up lower level page table(s). The address range associated with the pud/pmd entry must have been purged by INVLPG. Link: http://lkml.kernel.org/r/20180314180155.19492-3-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Reported-by: Lei Li <lious.lilei@hisilicon.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22mm/vmalloc: add interfaces to free unmapped page tableToshi Kani
On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may create pud/pmd mappings. A kernel panic was observed on arm64 systems with Cortex-A75 in the following steps as described by Hanjun Guo. 1. ioremap a 4K size, valid page table will build, 2. iounmap it, pte0 will set to 0; 3. ioremap the same address with 2M size, pgd/pmd is unchanged, then set the a new value for pmd; 4. pte0 is leaked; 5. CPU may meet exception because the old pmd is still in TLB, which will lead to kernel panic. This panic is not reproducible on x86. INVLPG, called from iounmap, purges all levels of entries associated with purged address on x86. x86 still has memory leak. The patch changes the ioremap path to free unmapped page table(s) since doing so in the unmap path has the following issues: - The iounmap() path is shared with vunmap(). Since vmap() only supports pte mappings, making vunmap() to free a pte page is an overhead for regular vmap users as they do not need a pte page freed up. - Checking if all entries in a pte page are cleared in the unmap path is racy, and serializing this check is expensive. - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges. Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB purge. Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which clear a given pud/pmd entry and free up a page for the lower level entries. This patch implements their stub functions on x86 and arm64, which work as workaround. [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub] Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings") Reported-by: Lei Li <lious.lilei@hisilicon.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Wang Xuefeng <wxf.wang@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Chintan Pandya <cpandya@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22h8300: remove extraneous __BIG_ENDIAN definitionArnd Bergmann
A bugfix I did earlier caused a build regression on h8300, which defines the __BIG_ENDIAN macro in a slightly different way than the generic code: arch/h8300/include/asm/byteorder.h:5:0: warning: "__BIG_ENDIAN" redefined We don't need to define it here, as the same macro is already provided by the linux/byteorder/big_endian.h, and that version does not conflict. While this is a v4.16 regression, my earlier patch also got backported to the 4.14 and 4.15 stable kernels, so we need the fixup there as well. Link: http://lkml.kernel.org/r/20180313120752.2645129-1-arnd@arndb.de Fixes: 101110f6271c ("Kbuild: always define endianess in kconfig.h") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-22MIPS: Use the entry point from the ELF file headerMaciej W. Rozycki
In order to fetch the correct entry point with the ISA bit included, for use by non-ELF boot loaders, parse the output of `objdump -f' for the start address recorded in the kernel executable itself, rather than using `nm' to get the value of the `kernel_entry' symbol. Sign-extend the address retrieved if 32-bit, so that execution is correctly started on 64-bit processors as well. The tool always prints the entry point using either 8 or 16 hexadecimal digits, matching the address width (aka class) of the ELF file, even in the presence of leading zeros. Signed-off-by: Maciej W. Rozycki <macro@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18912/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-23powerpc/64s: Fix lost pending interrupt due to race causing lost update to ↵Nicholas Piggin
irq_happened force_external_irq_replay() can be called in the do_IRQ path with interrupts hard enabled and soft disabled if may_hard_irq_enable() set MSR[EE]=1. It updates local_paca->irq_happened with a load, modify, store sequence. If a maskable interrupt hits during this sequence, it will go to the masked handler to be marked pending in irq_happened. This update will be lost when the interrupt returns and the store instruction executes. This can result in unpredictable latencies, timeouts, lockups, etc. Fix this by ensuring hard interrupts are disabled before modifying irq_happened. This could cause any maskable asynchronous interrupt to get lost, but it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE, so very high external interrupt rate and high IPI rate. The hang was bisected down to enabling doorbell interrupts for IPIs. These provided an interrupt type that could run at high rates in the do_IRQ path, stressing the race. Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Carol L. Soto <clsoto@us.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Always validate XFRM esn replay attribute, from Florian Westphal. 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long. 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul Triebitz. 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel Axtens. 5) Fix some interrupt handling issues in e1000e driver, from Benjamin Poitier. 6) Use strlcpy() in several ethtool get_strings methods, from Florian Fainelli. 7) Fix rhlist dup insertion, from Paul Blakey. 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev. 9) Fix driver unload crash when link is up in smsc911x, from Jeremy Linton. 10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric Dumazet. 11) Need to purge the write queue when TCP connections are aborted, otherwise userspace using MSG_ZEROCOPY can't close the fd. From Soheil Hassas Yeganeh. 12) Fix double free in error path of team driver, from Arkadi Sharshevsky. 13) Filter fixes for hv_netvsc driver, from Stephen Hemminger. 14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo Bianconi. 15) Properly filter out unsupported feature flags in macvlan driver, from Shannon Nelson. 16) Don't request loading the diag module for a protocol if the protocol itself is not even registered. From Xin Long. 17) If datagram connect fails in ipv6, make sure the socket state is consistent afterwards. From Paolo Abeni. 18) Use after free in qed driver, from Dan Carpenter. 19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the entry. From Sabrina Dubroca. 20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins. 21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita. 22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel. 23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti. 24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli. 25) Memory leak in gemini driver, from Igor Pylypiv. 26) IDR leaks in error paths of tcf_*_init() functions, from Davide Caratti. 27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun. 28) Missing dev_put() in error path of macsec_newlink(), from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits) macsec: missing dev_put() on error in macsec_newlink() net: dsa: Fix functional dsa-loop dependency on FIXED_PHY hv_netvsc: common detach logic hv_netvsc: change GPAD teardown order on older versions hv_netvsc: use RCU to fix concurrent rx and queue changes hv_netvsc: disable NAPI before channel close net/ipv6: Handle onlink flag with multipath routes ppp: avoid loop in xmit recursion detection code ipv6: sr: fix NULL pointer dereference when setting encap source address ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state net: aquantia: driver version bump net: aquantia: Implement pci shutdown callback net: aquantia: Allow live mac address changes net: aquantia: Add tx clean budget and valid budget handling logic net: aquantia: Change inefficient wait loop on fw data reads net: aquantia: Fix a regression with reset on old firmware net: aquantia: Fix hardware reset when SPI may rarely hangup s390/qeth: on channel error, reject further cmd requests s390/qeth: lock read device while queueing next buffer s390/qeth: when thread completes, wake up all waiters ...
2018-03-22irqchip/gic-v3: Probe for SCR_EL3 being clear before resetting AP0RnMarc Zyngier
We would like to reset the Group-0 Active Priority Registers at boot time if they are available to us. They would be available if SCR_EL3.FIQ was not set, but we cannot directly probe this bit, and short of checking, we may end-up trapping to EL3, and the firmware may not be please to get such an exception. Yes, this is dumb. Instead, let's use PMR to find out if its value gets affected by SCR_EL3.FIQ being set. We use the fact that when SCR_EL3.FIQ is set, the LSB of the priority is lost due to the shifting back and forth of the actual priority. If we read back a 0, we know that Group0 is unavailable. In case we read a non-zero value, we can safely reset the AP0Rn register. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-22MIPS: ralink: Fix booting on MT7621NeilBrown
Since commit 3af5a67c86a3 ("MIPS: Fix early CM probing") the MT7621 has not been able to boot. This commit caused mips_cm_probe() to be called before mt7621.c::proc_soc_init(). prom_soc_init() has a comment explaining that mips_cm_probe() "wipes out the bootloader config" and means that configuration registers are no longer available. It has some code to re-enable this config. Before this re-enable code is run, the sysc register cannot be read, so when SYSC_REG_CHIP_NAME0 is read, a garbage value is returned and panic() is called. If we move the config-repair code to the top of prom_soc_init(), the registers can be read and boot can proceed. Very occasionally, the first register read after the reconfiguration returns garbage, so add a call to __sync(). Fixes: 3af5a67c86a3 ("MIPS: Fix early CM probing") Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Matt Redfearn <matt.redfearn@mips.com> Cc: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.5+ Patchwork: https://patchwork.linux-mips.org/patch/18859/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: ralink: Remove ralink_halt()NeilBrown
ralink_halt() does nothing that machine_halt() doesn't already do, so it adds no value. It actually causes incorrect behaviour due to the "unreachable()" at the end. This tells the compiler that the end of the function will never be reached, which isn't true. The compiler responds by not adding a 'return' instruction, so control simply moves on to whatever bytes come afterwards in memory. In my tested, that was the ralink_restart() function. This means that an attempt to 'halt' the machine would actually cause a reboot. So remove ralink_halt() so that a 'halt' really does halt. Fixes: c06e836ada59 ("MIPS: ralink: adds reset code") Signed-off-by: NeilBrown <neil@brown.name> Cc: John Crispin <john@phrozen.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.9+ Patchwork: https://patchwork.linux-mips.org/patch/18851/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: generic: Add support for Microsemi OcelotAlexandre Belloni
Introduce support for the MIPS based Microsemi Ocelot SoCs. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Allan Nielsen <Allan.Nielsen@microsemi.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18858/ [jhogan@kernel.org: update ocelot_defconfig specification] Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: mscc: Add ocelot PCB123 device treeAlexandre Belloni
Add a device tree for the Microsemi Ocelot PCB123 evaluation board. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Allan Nielsen <Allan.Nielsen@microsemi.com> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/18856/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: mscc: Add ocelot dtsiAlexandre Belloni
Add a device tree include file for the Microsemi Ocelot SoC. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Allan Nielsen <Allan.Nielsen@microsemi.com> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/18855/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: ase: Enable MFD_SYSCONMathias Kresin
Enable syscon to use it for the RCU MFD on Amazon SE as well. The Amazon SE also has similar reset controller system as Danube and XWAY and use their drivers mostly. As these drivers now need syscon also activate the syscon subsystem for for Amazon SE. Fixes: 2b6639d4c794 ("MIPS: lantiq: Enable MFD_SYSCON to be able to use it for the RCU MFD") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18817/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: Enable AHB Bus for USBMathias Kresin
On Danube and AR9 the USB core is connected though a AHB bus to the main system cross bar, hence we need to enable the gating clock of the AHB Bus as well to make the USB controller work. Fixes: dea54fbad332 ("phy: Add an USB PHY driver for the Lantiq SoCs using the RCU module") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18814/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21MIPS: lantiq: Fix Danube USB clockMathias Kresin
On Danube the USB0 controller registers are at 1e101000 and the USB0 PHY register is at 1f203018 similar to all other lantiq SoCs. Activate the USB controller gating clock thorough the USB controller driver and not the PHY. This fixes a problem introduced in a previous commit. Fixes: dea54fbad332 ("phy: Add an USB PHY driver for the Lantiq SoCs using the RCU module") Signed-off-by: Mathias Kresin <dev@kresin.me> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.14+ Patchwork: https://patchwork.linux-mips.org/patch/18816/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-21ARM: dts: at91sam9260: pullup rx on usart0Peter Rosin
For consistency with all other serial pins, add this pullup. It also prevents the signal from floating and so consuming a useless extra amount of power in crowbarred state if nothing is connected to RX. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-03-21ARM: dts: at91rm9200: pullup rx on uart0Peter Rosin
For consistency with all other serial pins, add this pullup. It also prevents the signal from floating and so consuming a useless extra amount of power in crowbarred state if nothing is connected to RX. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-03-21ARM: dts: at91: fixes uart pinctrl, set pullup on rx, clear pullup on txPeter Rosin
Remove pullup on uart TX signals, they are push-pull outputs thus pullups are pointless. Add pullup on uart RX signals, they prevent the RX signals to be left floating and so consuming a useless extra amount of power in crowbarred state if nothing is connected to RX. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-03-21media: arch: sh: ecovec: Use new renesas-ceu camera driverJacopo Mondi
SH4 7724 Ecovec platform uses sh_mobile_ceu camera driver, which is now being replaced by a proper V4L2 camera driver named 'renesas-ceu'. Get rid of soc_camera defined components used to register sensor drivers and of platform specific enable/disable routines. Register GPIOs for sensor drivers and declare memory reserved with memblock APIs as dma capable to be used for CEU buffers. While at there re-order include directives to respect alphabetical ordering. Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-03-21ARM: EXYNOS: Simplify code in coupled CPU idle hot pathMarek Szyprowski
exynos_enter_aftr() is called by coupled CPU idle code every time CPU enters idle state, what can be considered as a hot path. Replace of_machine_is_compatible() call with a simple SoC revision check. of_machine_is_compatible() function performs a dozen of string comparisons during the full device tree walk, while soc_is_exynos4412() is a simple integer check on SoC revision variable. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2018-03-21ARM: EXYNOS: Fix coupled CPU idle freeze on Exynos4210Marek Szyprowski
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regardless of the changes made in the mentioned commits. It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly. Tested on Exynos4210-based Trats and Origen boards. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> CC: <stable@vger.kernel.org> # v4.13+ Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
2018-03-21ARM: dts: at91: at91sam9g25: fix mux-mask pinctrl propertyNicolas Ferre
There are only 19 PIOB pins having primary names PB0-PB18. Not all of them have a 'C' function. So the pinctrl property mask ends up being the same as the other SoC of the at91sam9x5 series. Reported-by: Marek Sieranski <marek.sieranski@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: <stable@vger.kernel.org> # v3.8+ Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2018-03-21ARM: OMAP: Fix SRAM W+X mappingTony Lindgren
We are still using custom SRAM code for some SoCs and are not marking the PM code mapped to SRAM as read-only and executable after we're done. With CONFIG_DEBUG_WX=y, we will get "Found insecure W+X mapping at address" warning. Let's fix this issue the same way as commit 728bbe75c82f ("misc: sram: Introduce support code for protect-exec sram type") is doing for drivers/misc/sram-exec.c. On omap3, we need to restore SRAM when returning from off mode after idle, so init time configuration is not enough. And as we no longer have users for omap_sram_push_address() we can make it static while at it. Note that eventually we should be using sram-exec.c for all SoCs. Cc: stable@vger.kernel.org # v4.12+ Cc: Dave Gerlach <d-gerlach@ti.com> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-03-21KVM: nVMX: fix vmentry failure code when L2 state would require emulationPaolo Bonzini
Commit 2bb8cafea80b ("KVM: vVMX: signal failure for nested VMEntry if emulation_required", 2018-03-12) introduces a new error path which does not set *entry_failure_code. Fix that to avoid a leak of L0 stack to L1. Reported-by: Radim Krčmář <rkrcmar@redhat.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-21KVM: nVMX: Do not load EOI-exitmap while running L2Liran Alon
When L1 IOAPIC redirection-table is written, a request of KVM_REQ_SCAN_IOAPIC is set on all vCPUs. This is done such that all vCPUs will now recalc their IOAPIC handled vectors and load it to their EOI-exitmap. However, it could be that one of the vCPUs is currently running L2. In this case, load_eoi_exitmap() will be called which would write to vmcs02->eoi_exit_bitmap, which is wrong because vmcs02->eoi_exit_bitmap should always be equal to vmcs12->eoi_exit_bitmap. Furthermore, at this point KVM_REQ_SCAN_IOAPIC was already consumed and therefore we will never update vmcs01->eoi_exit_bitmap. This could lead to remote_irr of some IOAPIC level-triggered entry to remain set forever. Fix this issue by delaying the load of EOI-exitmap to when vCPU is running L1. One may wonder why not just delay entire KVM_REQ_SCAN_IOAPIC processing to when vCPU is running L1. This is done in order to handle correctly the case where LAPIC & IO-APIC of L1 is pass-throughed into L2. In this case, vmcs12->virtual_interrupt_delivery should be 0. In current nVMX implementation, that results in vmcs02->virtual_interrupt_delivery to also be 0. Thus, vmcs02->eoi_exit_bitmap is not used. Therefore, every L2 EOI cause a #VMExit into L0 (either on MSR_WRITE to x2APIC MSR or APIC_ACCESS/APIC_WRITE/EPT_MISCONFIG to APIC MMIO page). In order for such L2 EOI to be broadcasted, if needed, from LAPIC to IO-APIC, vcpu->arch.ioapic_handled_vectors must be updated while L2 is running. Therefore, patch makes sure to delay only the loading of EOI-exitmap but not the update of vcpu->arch.ioapic_handled_vectors. Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-21x86/xen: Delay get_cpu_cap until stack canary is establishedJason Andryuk
Commit 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") introduced a call to get_cpu_cap, which is fstack-protected. This is works on x86-64 as commit 4f277295e54c ("x86/xen: init %gs very early to avoid page faults with stack protector") ensures the stack protector is configured, but it it did not cover x86-32. Delay calling get_cpu_cap until after xen_setup_gdt has initialized the stack canary. Without this, a 32bit PV machine crashes early in boot. (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.6.6-xc x86_64 debug=n Tainted: C ]---- (XEN) CPU: 0 (XEN) RIP: e019:[<00000000c10362f8>] And the PV kernel IP corresponds to init_scattered_cpuid_features 0xc10362f8 <+24>: mov %gs:0x14,%eax Fixes 2cc42bac1c79 ("x86-64/Xen: eliminate W+X mappings") Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-03-20kvm/x86: fix icebp instruction handlingLinus Torvalds
The undocumented 'icebp' instruction (aka 'int1') works pretty much like 'int3' in the absense of in-circuit probing equipment (except, obviously, that it raises #DB instead of raising #BP), and is used by some validation test-suites as such. But Andy Lutomirski noticed that his test suite acted differently in kvm than on bare hardware. The reason is that kvm used an inexact test for the icebp instruction: it just assumed that an all-zero VM exit qualification value meant that the VM exit was due to icebp. That is not unlike the guess that do_debug() does for the actual exception handling case, but it's purely a heuristic, not an absolute rule. do_debug() does it because it wants to ascribe _some_ reasons to the #DB that happened, and an empty %dr6 value means that 'icebp' is the most likely casue and we have no better information. But kvm can just do it right, because unlike the do_debug() case, kvm actually sees the real reason for the #DB in the VM-exit interruption information field. So instead of relying on an inexact heuristic, just use the actual VM exit information that says "it was 'icebp'". Right now the 'icebp' instruction isn't technically documented by Intel, but that will hopefully change. The special "privileged software exception" information _is_ actually mentioned in the Intel SDM, even though the cause of it isn't enumerated. Reported-by: Andy Lutomirski <luto@kernel.org> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-20ARM: dts: meson8b: the CBUS GPIO controller only has 83 GPIOsMartin Blumenstingl
Update the "gpio-ranges" property of the CBUS GPIO controller on Meson8b because it only provides 83 GPIOs. The GPIO definitions in include/dt-bindings/gpio/meson8b-gpio.h inherited all GPIOs from Meson8 until recently. However, Meson8b does not support all GPIOs which are supported by Meson8 (Meson8b doesn't have a GPIOZ bank, most of the pins from the GPIODV bank are missing on Meson8b - just to name a few differences). The actual number of GPIOs is only 83, instead of 120 from Meson8 plus the 10 GPIOs from the DIF bank on Meson8b. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2018-03-20ARM: dts: meson8b-odroidc1: add microSD supportLinus Lüssing
The Odroid C1 features a microSD slot. This patch adds the necessary DT bindings to support it. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Kevin Hilman <khilman@baylibre.com>
2018-03-20sparc/PCI: Stop reserving System ROM and Video ROM in PCI spaceBjorn Helgaas
Previously, pci_register_legacy_regions() reserved PCI address space under every PCI host bridge for the System ROM and the Video ROM, but these regions are not part of PCI address space. Previously, pci_register_legacy_regions() reserved the following areas of PCI address space under every PCI host bridge: [bus 0xa0000-0xbffff] Video RAM area (VGA frame buffer) [bus 0xc0000-0xc7fff] Video ROM [bus 0xf0000-0xfffff] System ROM It does need to reserve the [bus 0xa0000-0xbffff] region (at least if there's a possibility of a VGA device below the bridge) because VGA devices can respond to that even if they don't describe it with a BAR. But the Video ROM and System ROM areas don't seem necessary because they are not areas that legacy PCI devices respond to. They appear to be copied from x86, where they describe areas of system memory that depend on BIOS conventions. On x86, BIOS copies the option ROM of the primary VGA device to RAM at 0xc0000, and the 0xf0000-0xfffff region is reserved for the motherboard BIOS. Neither of these things applies to sparc. Stop reserving the System ROM and Video ROM regions in PCI space. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net>
2018-03-20sparc: get rid of asm wrapper for nis_syscall()Al Viro
just use current_pt_regs() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20sparc: switch compat {f,}truncate64() to COMPAT_SYSCALL_DEFINEAl Viro
... and drop the pointless checks - sys_truncate() itself might've lacked the check when that stuff was first written, but it has already grown one by the time that stuff went into mainline. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20sparc: switch compat pread64 and pwrite64 to COMPAT_SYSCALL_DEFINEAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20convert compat sync_file_range() to COMPAT_SYSCALL_DEFINEAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20switch sparc_remap_file_pages() to SYSCALL_DEFINEAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20sparc: get rid of memory_ordering(2) wrapperAl Viro
use current_pt_regs() in it instead Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-03-20sparc: trivial conversions to {COMPAT_,}SYSCALL_DEFINE()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>