summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2018-01-07Merge branch 'parisc-4.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: - Many small fixes to show the real physical addresses of devices instead of hashed addresses. - One important fix to unbreak 32-bit SMP support: We forgot to 16-byte align the spinlocks in the assembler code. - Qemu support: The host will get a chance to sleep when the parisc guest is idle. We use the same mechanism as the power architecture by overlaying the "or %r10,%r10,%r10" instruction which is simply a nop on real hardware. * 'parisc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: qemu idle sleep support parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel parisc: Show unhashed EISA EEPROM address parisc: Show unhashed HPA of Dino chip parisc: Show initial kernel memory layout unhashed parisc: Show unhashed hardware inventory
2018-01-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "s390: - Two fixes for potential bitmap overruns in the cmma migration code x86: - Clear guest provided GPRs to defeat the Project Zero PoC for CVE 2017-5715" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: vmx: Scrub hardware GPRs at VM-exit KVM: s390: prevent buffer overrun on memory hotplug during migration KVM: s390: fix cmma migration for multiple memory slots
2018-01-06Merge tag 'powerpc-4.15-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix to correctly return SEGV_ACCERR when we take a SEGV on a mapped region. The bug was introduced in the refactoring of the page fault handler we did in the previous release. Thanks to John Sperbeck" * tag 'powerpc-4.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR
2018-01-06Merge tag 'kvm-s390-master-4.15-2' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: fixes for cmma migration Two fixes for potential bitmap overruns in the cmma migration code.
2018-01-06parisc: qemu idle sleep supportHelge Deller
Add qemu idle sleep support when running under qemu with SeaBIOS PDC firmware. Like the power architecture we use the "or" assembler instructions, which translate to nops on real hardware, to indicate that qemu shall idle sleep. Signed-off-by: Helge Deller <deller@gmx.de> Cc: Richard Henderson <rth@twiddle.net> CC: stable@vger.kernel.org # v4.9+
2018-01-05Merge tag 'arc-4.15-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - platform updates for setting up clock correctly - fixes to accomodate newer gcc (__builtin_trap, removed inline asm modifier) - other fixes * tag 'arc-4.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: handle gcc generated __builtin_trap for older compiler ARC: handle gcc generated __builtin_trap() ARC: uaccess: dont use "l" gcc inline asm constraint modifier ARC: [plat-axs103] refactor the quad core DT quirk code ARC: [plat-axs103]: Set initial core pll output frequency ARC: [plat-hsdk]: Get rid of core pll frequency set in platform code ARC: [plat-hsdk]: Set initial core pll output frequency ARC: [plat-hsdk] Switch DisplayLink driver from fbdev to DRM arc: do not use __print_symbol() ARC: Fix detection of dual-issue enabled
2018-01-05Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 pti fixes from Thomas Gleixner: "Another small stash of fixes for fallout from the PTI work: - Fix the modules vs. KASAN breakage which was caused by making MODULES_END depend of the fixmap size. That was done when the cpu entry area moved into the fixmap, but now that we have a separate map space for that this is causing more issues than it solves. - Use the proper cache flush methods for the debugstore buffers as they are mapped/unmapped during runtime and not statically mapped at boot time like the rest of the cpu entry area. - Make the map layout of the cpu_entry_area consistent for 4 and 5 level paging and fix the KASLR vaddr_end wreckage. - Use PER_CPU_EXPORT for per cpu variable and while at it unbreak nvidia gfx drivers by dropping the GPL export. The subject line of the commit tells it the other way around, but I noticed that too late. - Fix the ASM alternative macros so they can be used in the middle of an inline asm block. - Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack vector is properly identified. The Spectre mitigations will come with their own bug bits later" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm x86/tlb: Drop the _GPL from the cpu_tlbstate export x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers x86/kaslr: Fix the vaddr_end mess x86/mm: Map cpu_entry_area at the same place on 4/5 level x86/mm: Set MODULES_END to 0xffffffffff000000
2018-01-05Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Thomas Gleixner: - A fix for a add_efi_memmap parameter regression which ensures that the parameter is parsed before it is used. - Reinstate the virtual capsule mapping as the cached copy turned out to break Quark and other things - Remove Matt Fleming as EFI co-maintainer. He stepped back a few days ago. Thanks Matt for all your great work! * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove Matt Fleming as EFI co-maintainer efi/capsule-loader: Reinstate virtual capsule mapping x86/efi: Fix kernel param add_efi_memmap regression
2018-01-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Four bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: fix wrongly assigned configuration data s390: fix preemption race in disable_sacf_uaccess s390/sclp: disable FORTIFY_SOURCE for early sclp code s390/pci: handle insufficient resources during dma tlb flush
2018-01-05kvm: vmx: Scrub hardware GPRs at VM-exitJim Mattson
Guest GPR values are live in the hardware GPRs at VM-exit. Do not leave any guest values in hardware GPRs after the guest GPR values are saved to the vcpu_vmx structure. This is a partial mitigation for CVE 2017-5715 and CVE 2017-5753. Specifically, it defeats the Project Zero PoC for CVE 2017-5715. Suggested-by: Eric Northup <digitaleric@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Eric Northup <digitaleric@google.com> Reviewed-by: Benjamin Serebrin <serebrin@google.com> Reviewed-by: Andrew Honig <ahonig@google.com> [Paolo: Add AMD bits, Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-01-05x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWNThomas Gleixner
Use the name associated with the particular attack which needs page table isolation for mitigation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jiri Koshina <jikos@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Lutomirski <luto@amacapital.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Greg KH <gregkh@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
2018-01-05x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asmDavid Woodhouse
Where an ALTERNATIVE is used in the middle of an inline asm block, this would otherwise lead to the following instruction being appended directly to the trailing ".popsection", and a failed compile. Fixes: 9cebed423c84 ("x86, alternative: Use .pushsection/.popsection") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: ak@linux.intel.com Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
2018-01-04kernel/exit.c: export abort() to modulesAndrew Morton
gcc -fisolate-erroneous-paths-dereference can generate calls to abort() from modular code too. [arnd@arndb.de: drop duplicate exports of abort()] Link: http://lkml.kernel.org/r/20180102103311.706364-1-arnd@arndb.de Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-05x86/tlb: Drop the _GPL from the cpu_tlbstate exportThomas Gleixner
The recent changes for PTI touch cpu_tlbstate from various tlb_flush inlines. cpu_tlbstate is exported as GPL symbol, so this causes a regression when building out of tree drivers for certain graphics cards. Aside of that the export was wrong since it was introduced as it should have been EXPORT_PER_CPU_SYMBOL_GPL(). Use the correct PER_CPU export and drop the _GPL to restore the previous state which allows users to utilize the cards they payed for. As always I'm really thrilled to make this kind of change to support the #friends (or however the hot hashtag of today is spelled) from that closet sauce graphics corp. Fixes: 1e02ce4cccdc ("x86: Store a per-cpu shadow copy of CR4") Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches") Reported-by: Kees Cook <keescook@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org
2018-01-05x86/events/intel/ds: Use the proper cache flush method for mapping ds buffersPeter Zijlstra
Thomas reported the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498 caller is native_flush_tlb_single+0x57/0xc0 native_flush_tlb_single+0x57/0xc0 __set_pte_vaddr+0x2d/0x40 set_pte_vaddr+0x2f/0x40 cea_set_pte+0x30/0x40 ds_update_cea.constprop.4+0x4d/0x70 reserve_ds_buffers+0x159/0x410 x86_reserve_hardware+0x150/0x160 x86_pmu_event_init+0x3e/0x1f0 perf_try_init_event+0x69/0x80 perf_event_alloc+0x652/0x740 SyS_perf_event_open+0x3f6/0xd60 do_syscall_64+0x5c/0x190 set_pte_vaddr is used to map the ds buffers into the cpu entry area, but there are two problems with that: 1) The resulting flush is not supposed to be called in preemptible context 2) The cpu entry area is supposed to be per CPU, but the debug store buffers are mapped for all CPUs so these mappings need to be flushed globally. Add the necessary preemption protection across the mapping code and flush TLBs globally. Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area") Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
2018-01-05x86/kaslr: Fix the vaddr_end messThomas Gleixner
vaddr_end for KASLR is only documented in the KASLR code itself and is adjusted depending on config options. So it's not surprising that a change of the memory layout causes KASLR to have the wrong vaddr_end. This can map arbitrary stuff into other areas causing hard to understand problems. Remove the whole ifdef magic and define the start of the cpu_entry_area to be the end of the KASLR vaddr range. Add documentation to that effect. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com>, Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-04x86/mm: Map cpu_entry_area at the same place on 4/5 levelThomas Gleixner
There is no reason for 4 and 5 level pagetables to have a different layout. It just makes determining vaddr_end for KASLR harder than necessary. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com>, Cc: Alexander Kuleshov <kuleshovmail@gmail.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
2018-01-04x86/mm: Set MODULES_END to 0xffffffffff000000Andrey Ryabinin
Since f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size") kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary. So passing page unaligned address to kasan_populate_zero_shadow() have two possible effects: 1) It may leave one page hole in supposed to be populated area. After commit 21506525fb8d ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that hole happens to be in the shadow covering fixmap area and leads to crash: BUG: unable to handle kernel paging request at fffffbffffe8ee04 RIP: 0010:check_memory_region+0x5c/0x190 Call Trace: <NMI> memcpy+0x1f/0x50 ghes_copy_tofrom_phys+0xab/0x180 ghes_read_estatus+0xfb/0x280 ghes_notify_nmi+0x2b2/0x410 nmi_handle+0x115/0x2c0 default_do_nmi+0x57/0x110 do_nmi+0xf8/0x150 end_repeat_nmi+0x1a/0x1e Note, the crash likely disappeared after commit 92a0f81d8957, which changed kasan_populate_zero_shadow() call the way it was before commit 21506525fb8d. 2) Attempt to load module near MODULES_END will fail, because __vmalloc_node_range() called from kasan_module_alloc() will hit the WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error. To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned which means that MODULES_END should be 8*PAGE_SIZE aligned. The whole point of commit f06bdd4001c2 was to move MODULES_END down if NR_CPUS is big, so the cpu_entry_area takes a lot of space. But since 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") the cpu_entry_area is no longer in fixmap, so we could just set MODULES_END to a fixed 8*PAGE_SIZE aligned address. Fixes: f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size") Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Garnier <thgarnie@google.com> Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
2018-01-04Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Fixes this time include mostly device tree changes, as usual, the notable ones include: - A number of patches to fix most of the remaining DTC warnings that got introduced when DTC started warning about some obvious mistakes. We still have some remaining warnings that probably may have to wait until 4.16 to get fixed while we try to figure out what the correct contents should be. - On Allwinner A64, Ethernet PHYs need a fix after a mistake in coordination between patches merged through multiple branches. - Various fixes for PMICs on allwinner based boards - Two fixes for ethernet link detection on some Renesas machines - Two stability fixes for rockchip based boards Aside from device-tree, two other areas got fixes for older problems: - For TI Davinci DM365, a couple of fixes were needed to repair the MMC DMA engine support, apparently this has been broken for a while. - One important fix for all Allwinner chips with the PMIC driver as a loadable module" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits) arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property ARM: dts: tango4: remove bogus interrupt-controller property ARM: dts: ls1021a: fix incorrect clock references ARM: dts: aspeed-g4: Correct VUART IRQ number ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine ARM: dts: sun8i: a711: Reinstate the PMIC compatible ARM: davinci: fix mmc entries in dm365's dma_slave_map ARM: dts: da850-lego-ev3: Fix battery voltage gpio ARM: davinci: Add dma_mask to dm365's eDMA device ARM: davinci: Use platform_device_register_full() to create pdev for dm365's eDMA arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now arm64: dts: rockchip: remove vdd_log from rk3399-puma arm64: dts: orange-pi-zero-plus2: fix sdcard detect arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3 ARM: dts: sunxi: Convert to CCU index macros for HDMI controller sunxi-rsb: Include OF based modalias in device uevent ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850 arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts ...
2018-01-04arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoCMasahiro Yamada
This is probably a copy-paste mistake. The gpio-ranges of PXs3 is different from that of LD20. Fixes: 277b51e7050f ("arm64: dts: uniphier: add GPIO controller nodes") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-01-04Merge tag 'sunxi-fixes-for-4.15' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Pull "Allwinner fixes for 4.15" from Chen-Yu Tsai: First, one fix that adds proper regulator references for the EMAC external PHYs on A64 boards. The EMAC bindings were developed for 4.13, but reverted at the last minute. They were finalized and brought back for 4.15. However in the time between, regulator support for the A64 boards was merged. When EMAC device tree changes were reintroduced, this was not taken into account. Second, a patch that adds OF based modalias uevent for RSB slave devices. This has been missing since the introduction of RSB, and recently with PMIC regulator support introduced for the A64, has been seen affecting distributions, which have the all-important PMIC mfd drivers built as modules, which then don't get loaded. Other minor cleanups include final conversion of raw indices to CCU binding macros for sun[4567]i HDMI, cleanup of dummy regulators on the A64 SOPINE, a SD card detection polarity fix for the Orange Pi Zero Plus2, and adding a missing compatible for the PMIC on the TBS A711 tablet. * tag 'sunxi-fixes-for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: ARM: dts: sun8i: a711: Reinstate the PMIC compatible arm64: dts: orange-pi-zero-plus2: fix sdcard detect arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3 ARM: dts: sunxi: Convert to CCU index macros for HDMI controller sunxi-rsb: Include OF based modalias in device uevent arm64: allwinner: a64: add Ethernet PHY regulator for several boards
2018-01-04Merge tag 'renesas-fixes-for-v4.15' of ↵Arnd Bergmann
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Pull "Renesas ARM Based SoC Fixes for v4.15" from Simon Horman: Vladimir Zapolskiy says: The present change is a bug fix for AVB link iteratively up/down. Steps to reproduce: - start AVB TX stream (Using aplay via MSE), - disconnect+reconnect the eth cable, - after a reconnection the eth connection goes iteratively up/down without user interaction, - this may heal after some seconds or even stay for minutes. As the documentation specifies, the "renesas,no-ether-link" option should be used when a board does not provide a proper AVB_LINK signal. There is no need for this option enabled on RCAR H3/M3 Salvator-X/XS and ULCB starter kits since the AVB_LINK is correctly handled by HW. Choosing to keep or remove the "renesas,no-ether-link" option will have impact on the code flow in the following ways: - keeping this option enabled may lead to unexpected behavior since the RX & TX are enabled/disabled directly from adjust_link function without any HW interrogation, - removing this option, the RX & TX will only be enabled/disabled after HW interrogation. The HW check is made through the LMON pin in PSR register which specifies AVB_LINK signal value (0 - at low level; 1 - at high level). In conclusion, the change is also a safety improvement because it removes the "renesas,no-ether-link" option leading to a proper way of detecting the link state based on HW interrogation and not on software heuristic. Note that DTS files for V3M Starter Kit, Draak and Eagle boards contain the same property, the files are untouched due to unavailable schematics to verify if the fix applies to these boards as well. * tag 'renesas-fixes-for-v4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
2018-01-03Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation fixes from Thomas Gleixner: "A couple of urgent fixes for PTI: - Fix a PTE mismatch between user and kernel visible mapping of the cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch MCE on older AMD K8 machines - Fix the misplaced CR3 switch in the SYSCALL compat entry code which causes access to unmapped kernel memory resulting in double faults. - Fix the section mismatch of the cpu_tss_rw percpu storage caused by using a different mechanism for declaration and definition. - Two fixes for dumpstack which help to decode entry stack issues better - Enable PTI by default in Kconfig. We should have done that earlier, but it slipped through the cracks. - Exclude AMD from the PTI enforcement. Not necessarily a fix, but if AMD is so confident that they are not affected, then we should not burden users with the overhead" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/process: Define cpu_tss_rw in same section as declaration x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat() x86/dumpstack: Print registers for first stack frame x86/dumpstack: Fix partial register dumps x86/pti: Make sure the user/kernel PTEs match x86/cpu, x86/pti: Do not enable PTI on AMD processors x86/pti: Enable PTI by default
2018-01-03x86/process: Define cpu_tss_rw in same section as declarationNick Desaulniers
cpu_tss_rw is declared with DECLARE_PER_CPU_PAGE_ALIGNED but then defined with DEFINE_PER_CPU_SHARED_ALIGNED leading to section mismatch warnings. Use DEFINE_PER_CPU_PAGE_ALIGNED consistently. This is necessary because it's mapped to the cpu entry area and must be page aligned. [ tglx: Massaged changelog a bit ] Fixes: 1a935bc3d4ea ("x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: thomas.lendacky@amd.com Cc: Borislav Petkov <bpetkov@suse.de> Cc: tklauser@distanz.ch Cc: minipli@googlemail.com Cc: me@kylehuey.com Cc: namit@vmware.com Cc: luto@kernel.org Cc: jpoimboe@redhat.com Cc: tj@kernel.org Cc: cl@linux.com Cc: bp@suse.de Cc: thgarnie@google.com Cc: kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180103203954.183360-1-ndesaulniers@google.com
2018-01-03x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()Thomas Gleixner
The preparation for PTI which added CR3 switching to the entry code misplaced the CR3 switch in entry_SYSCALL_compat(). With PTI enabled the entry code tries to access a per cpu variable after switching to kernel GS. This fails because that variable is not mapped to user space. This results in a double fault and in the worst case a kernel crash. Move the switch ahead of the access and clobber RSP which has been saved already. Fixes: 8a09317b895f ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching") Reported-by: Lars Wendler <wendler.lars@web.de> Reported-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Betkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org>, Cc: Dave Hansen <dave.hansen@linux.intel.com>, Cc: Peter Zijlstra <peterz@infradead.org>, Cc: Greg KH <gregkh@linuxfoundation.org>, , Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>, Cc: Juergen Gross <jgross@suse.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031949200.1957@nanos
2018-01-03x86/dumpstack: Print registers for first stack frameJosh Poimboeuf
In the stack dump code, if the frame after the starting pt_regs is also a regs frame, the registers don't get printed. Fix that. Reported-by: Andy Lutomirski <luto@amacapital.net> Tested-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@vger.kernel.org Fixes: 3b3fa11bc700 ("x86/dumpstack: Print any pt_regs found on the stack") Link: http://lkml.kernel.org/r/396f84491d2f0ef64eda4217a2165f5712f6a115.1514736742.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/dumpstack: Fix partial register dumpsJosh Poimboeuf
The show_regs_safe() logic is wrong. When there's an iret stack frame, it prints the entire pt_regs -- most of which is random stack data -- instead of just the five registers at the end. show_regs_safe() is also poorly named: the on_stack() checks aren't for safety. Rename the function to show_regs_if_on_stack() and add a comment to explain why the checks are needed. These issues were introduced with the "partial register dump" feature of the following commit: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully") That patch had gone through a few iterations of development, and the above issues were artifacts from a previous iteration of the patch where 'regs' pointed directly to the iret frame rather than to the (partially empty) pt_regs. Tested-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@vger.kernel.org Fixes: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully") Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/pti: Make sure the user/kernel PTEs matchThomas Gleixner
Meelis reported that his K8 Athlon64 emits MCE warnings when PTI is enabled: [Hardware Error]: Error Addr: 0x0000ffff81e000e0 [Hardware Error]: MC1 Error: L1 TLB multimatch. [Hardware Error]: cache level: L1, tx: INSN The address is in the entry area, which is mapped into kernel _AND_ user space. That's special because we switch CR3 while we are executing there. User mapping: 0xffffffff81e00000-0xffffffff82000000 2M ro PSE GLB x pmd Kernel mapping: 0xffffffff81000000-0xffffffff82000000 16M ro PSE x pmd So the K8 is complaining that the TLB entries differ. They differ in the GLB bit. Drop the GLB bit when installing the user shared mapping. Fixes: 6dc72c3cbca0 ("x86/mm/pti: Share entry text PMD") Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Meelis Roos <mroos@linux.ee> Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031407180.1957@nanos
2018-01-03x86/cpu, x86/pti: Do not enable PTI on AMD processorsTom Lendacky
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault. Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdoffice.net
2018-01-03efi/capsule-loader: Reinstate virtual capsule mappingArd Biesheuvel
Commit: 82c3768b8d68 ("efi/capsule-loader: Use a cached copy of the capsule header") ... refactored the capsule loading code that maps the capsule header, to avoid having to map it several times. However, as it turns out, the vmap() call we ended up removing did not just map the header, but the entire capsule image, and dropping this virtual mapping breaks capsules that are processed by the firmware immediately (i.e., without a reboot). Unfortunately, that change was part of a larger refactor that allowed a quirk to be implemented for Quark, which has a non-standard memory layout for capsules, and we have slightly painted ourselves into a corner by allowing quirk code to mangle the capsule header and memory layout. So we need to fix this without breaking Quark. Fortunately, Quark does not appear to care about the virtual mapping, and so we can simply do a partial revert of commit: 2a457fb31df6 ("efi/capsule-loader: Use page addresses rather than struct page pointers") ... and create a vmap() mapping of the entire capsule (including header) based on the reinstated struct page array, unless running on Quark, in which case we pass the capsule header copy as before. Reported-by: Ge Song <ge.song@hxt-semitech.com> Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Tested-by: Ge Song <ge.song@hxt-semitech.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: 82c3768b8d68 ("efi/capsule-loader: Use a cached copy of the capsule header") Link: http://lkml.kernel.org/r/20180102172110.17018-3-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03x86/efi: Fix kernel param add_efi_memmap regressionDave Young
'add_efi_memmap' is an early param, but do_add_efi_memmap() has no chance to run because the code path is before parse_early_param(). I believe it worked when the param was introduced but probably later some other changes caused the wrong order and nobody noticed it. Move efi_memblock_x86_reserve_range() after parse_early_param() to fix it. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: Ge Song <ge.song@hxt-semitech.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20180102172110.17018-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-03ARC: handle gcc generated __builtin_trap for older compilerVineet Gupta
ARC gcc prior to GNU 2018.03 release didn't have a target specific __builtin_trap() implementation, generating default abort() call. Implement the abort() call - emulating what newer gcc does for the same, as suggested by Arnd. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-01-02parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernelHelge Deller
Qemu for PARISC reported on a 32bit SMP parisc kernel strange failures about "Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9." Those opcodes evaluate to the ldcw() assembly instruction which requires (on 32bit) an alignment of 16 bytes to ensure atomicity. As it turns out, qemu is correct and in our assembly code in entry.S and pacache.S we don't pay attention to the required alignment. This patch fixes the problem by aligning the lock offset in assembly code in the same manner as we do in our C-code. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.0+
2018-01-02parisc: Show initial kernel memory layout unhashedHelge Deller
Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p") Signed-off-by: Helge Deller <deller@gmx.de>
2018-01-02parisc: Show unhashed hardware inventoryHelge Deller
Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p") Signed-off-by: Helge Deller <deller@gmx.de>
2018-01-02powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERRJohn Sperbeck
The recent refactoring of the powerpc page fault handler in commit c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused access to protected memory regions to indicate SEGV_MAPERR instead of the traditional SEGV_ACCERR in the si_code field of a user-space signal handler. This can confuse debug libraries that temporarily change the protection of memory regions, and expect to use SEGV_ACCERR as an indication to restore access to a region. This commit restores the previous behavior. The following program exhibits the issue: $ ./repro read || echo "FAILED" $ ./repro write || echo "FAILED" $ ./repro exec || echo "FAILED" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <signal.h> #include <sys/mman.h> #include <assert.h> static void segv_handler(int n, siginfo_t *info, void *arg) { _exit(info->si_code == SEGV_ACCERR ? 0 : 1); } int main(int argc, char **argv) { void *p = NULL; struct sigaction act = { .sa_sigaction = segv_handler, .sa_flags = SA_SIGINFO, }; assert(argc == 2); p = mmap(NULL, getpagesize(), (strcmp(argv[1], "write") == 0) ? PROT_READ : 0, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); assert(p != MAP_FAILED); assert(sigaction(SIGSEGV, &act, NULL) == 0); if (strcmp(argv[1], "read") == 0) printf("%c", *(unsigned char *)p); else if (strcmp(argv[1], "write") == 0) *(unsigned char *)p = 0; else if (strcmp(argv[1], "exec") == 0) ((void (*)(void))p)(); return 1; /* failed to generate SEGV */ } Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: John Sperbeck <jsperbeck@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Add commit references in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-31Merge branch 'x86/urgent' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A couple of fixlets for x86: - Fix the ESPFIX double fault handling for 5-level pagetables - Fix the commandline parsing for 'apic=' on 32bit systems and update documentation - Make zombie stack traces reliable - Fix kexec with stack canary - Fix the delivery mode for APICs which was missed when the x86 vector management was converted to single target delivery. Caused a regression due to the broken hardware which ignores affinity settings in lowest prio delivery mode. - Unbreak modules when AMD memory encryption is enabled - Remove an unused parameter of prepare_switch_to" * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Switch all APICs to Fixed delivery mode x86/apic: Update the 'apic=' description of setting APIC driver x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR) x86: Remove unused parameter of prepare_switch_to x86/stacktrace: Make zombie stack traces reliable x86/mm: Unbreak modules that use the DMA API x86/build: Make isoimage work on Debian x86/espfix/64: Fix espfix double-fault handling on 5-level systems
2017-12-31Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation fixes from Thomas Gleixner: "Four patches addressing the PTI fallout as discussed and debugged yesterday: - Remove stale and pointless TLB flush invocations from the hotplug code - Remove stale preempt_disable/enable from __native_flush_tlb() - Plug the memory leak in the write_ldt() error path" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Make LDT pgtable free conditional x86/ldt: Plug memory leak in error path x86/mm: Remove preempt_disable/enable() from __native_flush_tlb() x86/smpboot: Remove stale TLB flush invocations
2017-12-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: - plug a memory leak in the intel pmu init code - clang fixes - tooling fix to avoid including kernel headers - a fix for jvmti to generate correct debug information for inlined code - replace backtick with a regular shell function - fix the build in hardened environments * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Plug memory leak in intel_pmu_init() x86/asm: Allow again using asm.h when building for the 'bpf' clang target tools arch s390: Do not include header files from the kernel sources perf jvmti: Generate correct debug information for inlined code perf tools: Fix up build in hardened environments perf tools: Use shell function for perl cflags retrieval
2017-12-31Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A rather large update after the kaisered maintainer finally found time to handle regression reports. - The larger part addresses a regression caused by the x86 vector management rework. The reservation based model does not work reliably for MSI interrupts, if they cannot be masked (yes, yet another hw engineering trainwreck). The reason is that the reservation mode assigns a dummy vector when the interrupt is allocated and switches to a real vector when the interrupt is requested. If the MSI entry cannot be masked then the initialization might raise an interrupt before the interrupt is requested, which ends up as spurious interrupt and causes device malfunction and worse. The fix is to exclude MSI interrupts which do not support masking from reservation mode and assign a real vector right away. - Extend the extra lockdep class setup for nested interrupts with a class for the recently added irq_desc::request_mutex so lockdep can differeniate and does not emit false positive warnings. - A ratelimit guard for the bad irq printout so in case a bad irq comes back immediately the system does not drown in dmesg spam" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI genirq/irqdomain: Rename early argument of irq_domain_activate_irq() x86/vector: Use IRQD_CAN_RESERVE flag genirq: Introduce IRQD_CAN_RESERVE flag genirq/msi: Handle reactivation only on success gpio: brcmstb: Make really use of the new lockdep class genirq: Guard handle_bad_irq log messages kernel/irq: Extend lockdep class for request mutex
2017-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc bugfix from David Miller. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: repair calling incorrect hweight function from stubs
2017-12-31x86/ldt: Make LDT pgtable free conditionalThomas Gleixner
Andy prefers to be paranoid about the pagetable free in the error path of write_ldt(). Make it conditional and warn whenever the installment of a secondary LDT fails. Requested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-12-31x86/ldt: Plug memory leak in error pathThomas Gleixner
The error path in write_ldt() tries to free 'old_ldt' instead of the newly allocated 'new_ldt', resulting in a memory leak. It also misses to clean up a half populated LDT pagetable, which is not a leak as it gets cleaned up when the process exits. Free both the potentially half populated LDT pagetable and the newly allocated LDT struct. This can be done unconditionally because once an LDT is mapped subsequent maps will succeed, because the PTE page is already populated and the two LDTs fit into that single page. Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()Thomas Gleixner
The preempt_disable/enable() pair in __native_flush_tlb() was added in commit: 5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write") ... to protect the UP variant of flush_tlb_mm_range(). That preempt_disable/enable() pair should have been added to the UP variant of flush_tlb_mm_range() instead. The UP variant was removed with commit: ce4a4e565f52 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code") ... but the preempt_disable/enable() pair stayed around. The latest change to __native_flush_tlb() in commit: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches") ... added an access to a per CPU variable outside the preempt disabled regions, which makes no sense at all. __native_flush_tlb() must always be called with at least preemption disabled. Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch bad callers independent of the smp_processor_id() debugging. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31x86/smpboot: Remove stale TLB flush invocationsThomas Gleixner
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector() invoke local_flush_tlb() for no obvious reason. Digging in history revealed that the original code in the 2.1 era added those because the code manipulated a swapper_pg_dir pagetable entry. The pagetable manipulation was removed long ago in the 2.3 timeframe, but the TLB flush invocations stayed around forever. Remove them along with the pointless pr_debug()s which come from the same 2.1 change. Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-29Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 page table isolation updates from Thomas Gleixner: "This is the final set of enabling page table isolation on x86: - Infrastructure patches for handling the extra page tables. - Patches which map the various bits and pieces which are required to get in and out of user space into the user space visible page tables. - The required changes to have CR3 switching in the entry/exit code. - Optimizations for the CR3 switching along with documentation how the ASID/PCID mechanism works. - Updates to dump pagetables to cover the user space page tables for W+X scans and extra debugfs files to analyze both the kernel and the user space visible page tables The whole functionality is compile time controlled via a config switch and can be turned on/off on the command line as well" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) x86/ldt: Make the LDT mapping RO x86/mm/dump_pagetables: Allow dumping current pagetables x86/mm/dump_pagetables: Check user space page table for WX pages x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy x86/mm/pti: Add Kconfig x86/dumpstack: Indicate in Oops whether PTI is configured and enabled x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming x86/mm: Use INVPCID for __native_flush_tlb_single() x86/mm: Optimize RESTORE_CR3 x86/mm: Use/Fix PCID to optimize user/kernel switches x86/mm: Abstract switching CR3 x86/mm: Allow flushing for future ASID switches x86/pti: Map the vsyscall page if needed x86/pti: Put the LDT in its own PGD if PTI is on x86/mm/64: Make a full PGD-entry size hole in the memory map x86/events/intel/ds: Map debug buffers in cpu_entry_area x86/cpu_entry_area: Add debugstore entries to cpu_entry_area x86/mm/pti: Map ESPFIX into user space x86/mm/pti: Share entry text PMD x86/entry: Align entry text section to PMD boundary ...
2017-12-29genirq/msi, x86/vector: Prevent reservation mode for non maskable MSIThomas Gleixner
The new reservation mode for interrupts assigns a dummy vector when the interrupt is allocated and assigns a real vector when the interrupt is requested. The reservation mode prevents vector pressure when devices with a large amount of queues/interrupts are initialized, but only a minimal subset of those queues/interrupts is actually used. This mode has an issue with MSI interrupts which cannot be masked. If the driver is not careful or the hardware emits an interrupt before the device irq is requestd by the driver then the interrupt ends up on the dummy vector as a spurious interrupt which can cause malfunction of the device or in the worst case a lockup of the machine. Change the logic for the reservation mode so that the early activation of MSI interrupts checks whether: - the device is a PCI/MSI device - the reservation mode of the underlying irqdomain is activated - PCI/MSI masking is globally enabled - the PCI/MSI device uses either MSI-X, which supports masking, or MSI with the maskbit supported. If one of those conditions is false, then clear the reservation mode flag in the irq data of the interrupt and invoke irq_domain_activate_irq() with the reserve argument cleared. In the x86 vector code, clear the can_reserve flag in the vector allocation data so a subsequent free_irq() won't create the same situation again. The interrupt stays assigned to a real vector until pci_disable_msi() is invoked and all allocations are undone. Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode") Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com> Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
2017-12-29genirq/irqdomain: Rename early argument of irq_domain_activate_irq()Thomas Gleixner
The 'early' argument of irq_domain_activate_irq() is actually used to denote reservation mode. To avoid confusion, rename it before abuse happens. No functional change. Fixes: 72491643469a ("genirq/irqdomain: Update irq_domain_ops.activate() signature") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandru Chirvasitu <achirvasub@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29x86/vector: Use IRQD_CAN_RESERVE flagThomas Gleixner
Set the new CAN_RESERVE flag when the initial reservation for an interrupt happens. The flag is used in a subsequent patch to disable reservation mode for a certain class of MSI devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
2017-12-29x86/apic: Switch all APICs to Fixed delivery modeThomas Gleixner
Some of the APIC incarnations are operating in lowest priority delivery mode. This worked as long as the vector management code allocated the same vector on all possible CPUs for each interrupt. Lowest priority delivery mode does not necessarily respect the affinity setting and may redirect to some other online CPU. This was documented somewhere in the old code and the conversion to single target delivery missed to update the delivery mode of the affected APIC drivers which results in spurious interrupts on some of the affected CPU/Chipset combinations. Switch the APIC drivers over to Fixed delivery mode and remove all leftovers of lowest priority delivery mode. Switching to Fixed delivery mode is not a problem on these CPUs because the kernel already uses Fixed delivery mode for IPIs. The reason for this is that th SDM explicitely forbids lowest prio mode for IPIs. The reason is obvious: If the irq routing does not honor destination targets in lowest prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be a fatal problem in many cases. As a consequence of this change, the apic::irq_delivery_mode field is now pointless, but this needs to be cleaned up in a separate patch. Fixes: fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity") Reported-by: vcaputo@pengaru.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: vcaputo@pengaru.com Cc: Pavel Machek <pavel@ucw.cz> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos