summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2017-08-04Merge tag 'qcom-arm64-defconfig-fixes-for-4.13-rc2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into fixes Pull "Qualcomm ARM64 based defconfig Fixes for v4.13-rc2" from Andy Gross: * Enable missing HWSPINLOCK * tag 'qcom-arm64-defconfig-fixes-for-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux: arm64: defconfig: enable missing HWSPINLOCK
2017-08-04ARM: dts: tango4: Request RGMII RX and TX clock delaysMarc Gonzalez
RX and TX clock delays are required. Request them explicitly. Fixes: cad008b8a77e6 ("ARM: dts: tango4: Initial device trees") Cc: stable@vger.kernel.org Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-08-04Merge tag 'renesas-fixes3-for-v4.13' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Pull "Third Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman: Fix deadlock in regulator quirk for R-Car Gen 2 SoCs The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus notifier, and unregisters the notifier when it is no longer needed. However, a notifier must not be unregistered from within the call chain. This bug went unnoticed, as blocking_notifier_chain_unregister() didn't take the semaphore during early boot. This is no longer the case as of upstream commit 1c3c5eab171590f8 ("sched/core: Enable might_sleep() and smp_processor_id() checks early") and a deadlock occurs. * tag 'renesas-fixes3-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: rcar-gen2: Fix deadlock in regulator quirk
2017-08-04Merge tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu into fixesArnd Bergmann
Pull "mvebu fixes for 4.13 (part 2)" from Gregory CLEMENT: All the fixes are for ARM64 mvebu: - Fix the RTC interrupt on A7K/A8K which was missed when switching from GIC to ICU - Mark the A7K/A8K crypto engine as dma coherent - Fix the number of GPIO on south bridge on Armada 3700 * tag 'mvebu-fixes-4.13-2' of git://git.infradead.org/linux-mvebu: ARM64: dts: marvell: armada-37xx: Fix the number of GPIO on south bridge arm64: dts: marvell: mark the cp110 crypto engine as dma coherent arm64: dts: marvell: use ICU for the CP110 slave RTC
2017-08-04Merge tag 'amlogic-fixes' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into fixes Pull "Amlogic fixes for v4.13-rc" from Kevin Hilman: - 2 minor DT fixes * tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: ARM64: dts: meson-gxl-s905x-libretech-cc: fixup board definition ARM64: dts: meson-gx: use specific compatible for the AO pwms
2017-08-04Merge tag 'v4.13-rockchip-dts32fixes-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes Pull "Rockchip dts32 fixes for 4.13" from Heiko Stübner: Fix for the recently added mali dt support. The example showed a wrong value, so fix it before it gets copy-pasted to much. * tag 'v4.13-rockchip-dts32fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: fix mali gpu node on rk3288 dt-bindings: gpu: drop wrong compatible from midgard binding example
2017-08-04ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoCVineet Gupta
PAE40 confiuration in hardware extends some of the address registers for TLB/cache ops to 2 words. So far kernel was NOT setting the higher word if feature was not enabled in software which is wrong. Those need to be set to 0 in such case. Normally this would be done in the cache flush / tlb ops, however since these registers only exist conditionally, this would have to be conditional to a flag being set on boot which is expensive/ugly - specially for the more common case of PAE exists but not in use. Optimize that by zero'ing them once at boot - nobody will write to them afterwards Cc: stable@vger.kernel.org #4.4+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addressesAlexey Brodkin
It is necessary to explicitly set both SLC_AUX_RGN_START1 and SLC_AUX_RGN_END1 which hold MSB bits of the physical address correspondingly of region start and end otherwise SLC region operation is executed in unpredictable manner Without this patch, SLC flushes on HSDK (IOC disabled) were taking seconds. Cc: stable@vger.kernel.org #4.4+ Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: PAR40 regs only written if PAE40 exist]
2017-08-04ARC: dma: implement dma_unmap_page and sg variantVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04ARCv2: SLC: Make sure busy bit is set properly for region opsAlexey Brodkin
c70c473396cb "ARCv2: SLC: Make sure busy bit is set properly on SLC flushing" fixes problem for entire SLC operation where the problem was initially caught. But given a nature of the issue it is perfectly possible for busy bit to be read incorrectly even when region operation was started. So extending initial fix for regional operation as well. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org #4.10 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04ARC: [plat-sim] Include this platform unconditionallyVineet Gupta
Essentially remove CONFIG_ARC_PLAT_SIM There is no need for any platform specific code, just the board DTS match strings which we can include unconditionally Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-04ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103Eugeniy Paltsev
Enable 64bit adressing, where it needed, to make possible enabling PAE40 on axs103. This patch doesn't affect on any functionality. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev-HKixBCOQz3hWk0Htik3J/w@public.gmane.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-03sock: add SOCK_ZEROCOPY sockoptWillem de Bruijn
The send call ignores unknown flags. Legacy applications may already unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a socket opts in to zerocopy. Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing. Processes can also query this socket option to detect kernel support for the feature. Older kernels will return ENOPROTOOPT. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04powerpc/64: Fix __check_irq_replay missing decrementer interruptNicholas Piggin
If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard disable flag, but HMI check was moved ahead of that, which introduced this bug. This can cause a missed decrementer interrupt if we soft-disable interrupts then take an HMI which is recorded in irq_happened, then hard-disable interrupts for > 4s to wrap the decrementer. Fixes: e0e0d6b7390b ("powerpc/64: Replay hypervisor maintenance interrupt first") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-04powerpc/perf: POWER9 PMU stops after idle workaroundNicholas Piggin
POWER9 DD2 PMU can stop after a state-loss idle in some conditions. A solution is to set then clear MMCRA[60] after wake from state-loss idle. MMCRA[60] is a non-architected bit, see the user manual for details. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-04crypto: arm64/aes - avoid expanded lookup tables in the final roundArd Biesheuvel
For the final round, avoid the expanded and padded lookup tables exported by the generic AES driver. Instead, for encryption, we can perform byte loads from the same table we used for the inner rounds, which will still be hot in the caches. For decryption, use the inverse AES Sbox directly, which is 4x smaller than the inverse lookup table exported by the generic driver. This should significantly reduce the Dcache footprint of our code, which makes the code more robust against timing attacks. It does not introduce any additional module dependencies, given that we already rely on the core AES module for the shared key expansion routines. It also frees up register x18, which is not available as a scratch register on all platforms, which and so avoiding it improves shareability of this code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm/aes - avoid expanded lookup tables in the final roundArd Biesheuvel
For the final round, avoid the expanded and padded lookup tables exported by the generic AES driver. Instead, for encryption, we can perform byte loads from the same table we used for the inner rounds, which will still be hot in the caches. For decryption, use the inverse AES Sbox directly, which is 4x smaller than the inverse lookup table exported by the generic driver. This should significantly reduce the Dcache footprint of our code, which makes the code more robust against timing attacks. It does not introduce any additional module dependencies, given that we already rely on the core AES module for the shared key expansion routines. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULLArd Biesheuvel
Implement a NEON fallback for systems that do support NEON but have no support for the optional 64x64->128 polynomial multiplication instruction that is part of the ARMv8 Crypto Extensions. It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked extensively for the AArch64 ISA. On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the NEON based implementation is 4x faster than the table based one, and is time invariant as well, making it less vulnerable to timing attacks. When combined with the bit-sliced NEON implementation of AES-CTR, the AES-GCM performance increases by 2x (from 58 to 29 cycles per byte). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm/ghash - add NEON accelerated fallback for vmull.p64Ard Biesheuvel
Implement a NEON fallback for systems that do support NEON but have no support for the optional 64x64->128 polynomial multiplication instruction that is part of the ARMv8 Crypto Extensions. It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572) On a 32-bit guest executing under KVM on a Cortex-A57, the new code is not only 4x faster than the generic table based GHASH driver, it is also time invariant. (Note that the existing vmull.p64 code is 16x faster on this core). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/gcm - implement native driver using v8 Crypto ExtensionsArd Biesheuvel
Currently, the AES-GCM implementation for arm64 systems that support the ARMv8 Crypto Extensions is based on the generic GCM module, which combines the AES-CTR implementation using AES instructions with the PMULL based GHASH driver. This is suboptimal, given the fact that the input data needs to be loaded twice, once for the encryption and again for the MAC calculation. On Cortex-A57 (r1p2) and other recent cores that implement micro-op fusing for the AES instructions, AES executes at less than 1 cycle per byte, which means that any cycles wasted on loading the data twice hurt even more. So implement a new GCM driver that combines the AES and PMULL instructions at the block level. This improves performance on Cortex-A57 by ~37% (from 3.5 cpb to 2.6 cpb) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTRArd Biesheuvel
Of the various chaining modes implemented by the bit sliced AES driver, only CTR is exposed as a synchronous cipher, and requires a fallback in order to remain usable once we update the kernel mode NEON handling logic to disallow nested use. So wire up the existing CTR fallback C code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/chacha20 - take may_use_simd() into accountArd Biesheuvel
To accommodate systems that disallow the use of kernel mode NEON in some circumstances, take the return value of may_use_simd into account when deciding whether to invoke the C fallback routine. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTRArd Biesheuvel
To accommodate systems that may disallow use of the NEON in kernel mode in some circumstances, introduce a C fallback for synchronous AES in CTR mode, and use it if may_use_simd() returns false. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/aes-ce-ccm: add non-SIMD generic fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON. So honour this in the ARMv8 Crypto Extensions implementation of CCM-AES, and fall back to a scalar implementation using the generic crypto helpers for AES, XOR and incrementing the CTR counter. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/aes-ce-cipher: add non-SIMD generic fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/aes-ce-cipher - match round key endianness with generic codeArd Biesheuvel
In order to be able to reuse the generic AES code as a fallback for situations where the NEON may not be used, update the key handling to match the byte order of the generic code: it stores round keys as sequences of 32-bit quantities rather than streams of bytes, and so our code needs to be updated to reflect that. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/sha2-ce - add non-SIMD scalar fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/sha1-ce - add non-SIMD generic fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/crc32 - add non-SIMD scalar fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/crct10dif - add non-SIMD generic fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: arm64/ghash-ce - add non-SIMD scalar fallbackArd Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-04crypto: algapi - make crypto_xor() take separate dst and src argumentsArd Biesheuvel
There are quite a number of occurrences in the kernel of the pattern if (dst != src) memcpy(dst, src, walk.total % AES_BLOCK_SIZE); crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE); or crypto_xor(keystream, src, nbytes); memcpy(dst, keystream, nbytes); where crypto_xor() is preceded or followed by a memcpy() invocation that is only there because crypto_xor() uses its output parameter as one of the inputs. To avoid having to add new instances of this pattern in the arm64 code, which will be refactored to implement non-SIMD fallbacks, add an alternative implementation called crypto_xor_cpy(), taking separate input and output arguments. This removes the need for the separate memcpy(). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-08-03sparc/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: "David S. Miller" <davem@davemloft.net>
2017-08-03unicore32/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
2017-08-03tile/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initializedmap/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Chris Metcalf <cmetcalf@mellanox.com>
2017-08-03MIPS: PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions, that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@imgtec.com>
2017-08-03m68k/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
2017-08-03treewide: Consolidate Apple DMI checksLukas Wunner
We're about to amend ACPI bus scan with DMI checks whether we're running on a Mac to support Apple device properties in AML. The DMI checks are performed for every single device, adding overhead for everything x86 that isn't Apple, which is the majority. Rafael and Andy therefore request to perform the DMI match only once and cache the result. Outside of ACPI various other Apple DMI checks exist and it seems reasonable to use the cached value there as well. Rafael, Andy and Darren suggest performing the DMI check in arch code and making it available with a header in include/linux/platform_data/x86/. To this end, add early_platform_quirks() to arch/x86/kernel/quirks.c to perform the DMI check and invoke it from setup_arch(). Switch over all existing Apple DMI checks, thereby fixing two deficiencies: * They are now #defined to false on non-x86 arches and can thus be optimized away if they're located in cross-arch code. * Some of them only match "Apple Inc." but not "Apple Computer, Inc.", which is used by BIOSes released between January 2006 (when the first x86 Macs started shipping) and January 2007 (when the company name changed upon introduction of the iPhone). Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Suggested-by: Darren Hart <dvhart@infradead.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-03alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
2017-08-03sh/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooksLorenzo Pieralisi
The pci_fixup_irqs() function allocates IRQs for all PCI devices present in a system; those PCI devices possibly belong to different PCI bus trees (and possibly rooted at different host bridges) and may well be enabled (ie probed and bound to a driver) by the time pci_fixup_irqs() is called when probing a given host bridge driver. Furthermore, current kernel code relying on pci_fixup_irqs() to assign legacy PCI IRQs to devices does not work at all for hotplugged devices in that the code carrying out the IRQ fixup is called at host bridge driver probe time, which just cannot take into account devices hotplugged after the system has booted. The introduction of map/swizzle function hooks in struct pci_host_bridge allows us to define per-bridge map/swizzle functions that can be used at device probe time in PCI core code to allocate IRQs for a given device (through pci_assign_irq()). Convert PCI host bridge initialization code to the pci_scan_root_bus_bridge() API (that allows to pass a struct pci_host_bridge with initialized map/swizzle pointers) and remove the pci_fixup_irqs() call from arch code. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
2017-08-03sh/PCI: Remove __init optimisations from IRQ mapping functions/dataMatthew Minter
Currently many IRQ mapping functions and data structures use the __init and __initdata optimisations. These result in the relevant functions being innaccessible after boot time. However for deferred IRQ assignment it is important to have access to these functions at PCI device enable time. Therefore, remove the optimisation from the relevant data structures and functions to prepare for deferred IRQ assignment. Signed-off-by: Matthew Minter <matt@masarand.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
2017-08-03Merge tag 'pm-4.13-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two cpufreq issues, one introduced recently and one related to recent changes, fix cpufreq documentation, fix up recently added code in the Thunderbolt driver and update runtime PM framework documentation. Specifics: - Fix the handling of the scaling_cur_freq cpufreq policy attribute on x86 systems with the MPERF/APERF registers present to make it behave more as expected after recent changes (Rafael Wysocki). - Drop a leftover callback from the intel_pstate driver which also prevents the cpuinfo_cur_freq cpufreq policy attribute from being incorrectly exposed when intel_pstate works in the active mode (Rafael Wysocki). - Add a missing piece describing the cpuinfo_cur_freq policy attribute to cpufreq documentation (Rafael Wysocki). - Fix up a recently added part of the Thunderbolt driver to avoid aborting system suspends if its mailbox commands time out (Rafael Wysocki). - Update device runtime PM framework documentation to reflect the current behavior of the code (Johan Hovold)" * tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thunderbolt: icm: Ignore mailbox errors in icm_suspend() cpufreq: x86: Make scaling_cur_freq behave more as expected PM / runtime: Document new pm_runtime_set_suspended() constraint cpufreq: docs: Add missing cpuinfo_cur_freq description cpufreq: intel_pstate: Drop ->get from intel_pstate structure
2017-08-03Merge branches 'pm-cpufreq-x86', 'pm-cpufreq-docs' and 'intel_pstate'Rafael J. Wysocki
* pm-cpufreq-x86: cpufreq: x86: Make scaling_cur_freq behave more as expected * pm-cpufreq-docs: cpufreq: docs: Add missing cpuinfo_cur_freq description * intel_pstate: cpufreq: intel_pstate: Drop ->get from intel_pstate structure
2017-08-03Merge tag 'kvm-arm-for-v4.13-rc4' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm KVM/ARM Fixes for v4.13-rc4 - Yet another race with VM destruction plugged - A set of small vgic fixes
2017-08-03KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"Wanpeng Li
------------[ cut here ]------------ WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel] CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7 RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel] Call Trace: vmx_check_nested_events+0x131/0x1f0 [kvm_intel] ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm] ? vmx_vcpu_load+0x1be/0x220 [kvm_intel] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x8f/0x750 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL64_slow_path+0x25/0x25 This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which means that tells the kernel to not make use of any IOAPICs that may be present in the system. Actually external_intr variable in nested_vmx_vmexit() is the req_int_win variable passed from vcpu_enter_guest() which means that the L0's userspace requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) && L0's userspace reqeusts an irq window) is true, so there is no interrupt which L1 requires to inject to L2, we should not attempt to emualte "Acknowledge interrupt on exit" for the irq window requirement in this scenario. This patch fixes it by not attempt to emulate "Acknowledge interrupt on exit" if there is no L1 requirement to inject an interrupt to L2. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> [Added code comment to make it obvious that the behavior is not correct. We should do a userspace exit with open interrupt window instead of the nested VM exit. This patch still improves the behavior, so it was accepted as a (temporary) workaround.] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-08-03ARM: mvebu: enable ARM_GLOBAL_TIMER compilation Armada 38x platformsMarcin Wojtas
Armada 38x SoCs along with legacy timer (time-armada-370-xp.c), comprise generic Cortex-A9 global timer (arm_global_timer.c). Enable its compilation. The system clocksource subsystem will pick one of above two available ones in case the global timer node is present in the device tree. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-08-03ARM: dts: armada-38x: Add arm_global_timer nodeMarcin Wojtas
Since generic Cortex-A9 global timer is available after adding it to compilation, enable its node in armada-38x.dtsi. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-08-03ARM: dts: marvell: fix PCI bus dtc warningsRob Herring
dtc recently added PCI bus checks. Fix these warnings. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Gregory Clement <gregory.clement@free-electrons.com> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-08-03arm64: defconfig: enable fine-grained task level IRQ time accountingMarcin Wojtas
Tests showed, that under certain conditions, the summary number of jiffies spent on softirq/idle, which are counted by system statistics can be even below 10% of expected value, resulting in false load presentation. The issue was observed on the quad-core Marvell Armada 8k SoC, whose two 10G ports were bound into L2 bridge. Load was controlled by bidirectional UDP traffic, produced by a packet generator. Under such condition, the dominant load is softirq. With 100% single CPU occupation or without any activity (all CPUs 100% idle), total number of jiffies is 10000 (2500 per each core) in 10s interval. Also with other kind of load this was true. However below a saturation threshold it was observed, that with CPU which was occupied almost by softirqs only, the statistic were awkward. See the mpstat output: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle all 0.00 0.00 0.13 0.00 0.00 0.55 0.00 0.00 0.00 99.32 0 0.00 0.00 0.00 0.00 0.00 23.08 0.00 0.00 0.00 76.92 1 0.00 0.00 0.40 0.00 0.00 0.00 0.00 0.00 0.00 99.60 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 Above would mean basically no total load, debug CPU0 occupied in 25%. Raw statistics, printed every 10s from /proc/stat unveiled a root cause - summary idle/softirq jiffies on loaded CPU were below 200, i.e. over 90% samples lost. All problems were gone after enabling fine granulity IRQ time accounting. This patch fixes possible wrong statistics processing by enabling CONFIG_IRQ_TIME_ACCOUNTING for arm64 platfroms, which is by default done on other architectures, e.g. x86 and arm. Tests showed no noticeable performance penalty, nor stability impact. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-08-03ARM64: dts: marvell: armada-37xx: Enable uSD on ESPRESSObinMarcin Wojtas
The ESPRESSObin board exposes one of the SDHCI interfaces via J1 uSD slot. This patch enables it. Tested-by: Miquel Raynal <miquel.raynal@free-electrons.com> Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Zbigniew Bodek <zbodek@gmail.com> [gregory.clement@free-electrons.com: removed "no-1-8-v"] Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>