summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2022-01-09riscv: try to allocate crashkern region from 32bit addressible memoryNick Kossifidis
When allocating crash kernel region without explicitly specifying its base address/size, memblock_phys_alloc_range will attempt to allocate memory top to bottom (memblock.bottom_up is false), so the crash kernel region will end up in highmem on 64bit systems. This way swiotlb can't work on the crash kernel, since there won't be any 32bit addressible memory available for the bounce buffers. Try to allocate 32bit addressible memory if available, for the crash kernel by restricting the top search address to be less than SZ_4G. If that fails fallback to the previous behavior. I tested this on HiFive Unmatched where the pci-e controller needs swiotlb to work, with this patch it's possible to access the pci-e controller on crash kernel and mount the rootfs from the nvme. Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Fixes: e53d28180d4d ("RISC-V: Add kdump support") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: use hart id instead of cpu id on machine_kexecNick Kossifidis
raw_smp_processor_id() doesn't return the hart id as stated in arch/riscv/include/asm/smp.h, use smp_processor_id() instead to get the cpu id, and cpuid_to_hartid_map() to pass the hart id to the next kernel. This fixes kexec on HiFive Unleashed/Unmatched where cpu ids and hart ids don't match (on qemu-virt they match). Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: Don't use va_pa_offset on kdumpNick Kossifidis
On kdump instead of using an intermediate step to relocate the kernel, that lives in a "control buffer" outside the current kernel's mapping, we jump to the crash kernel directly by calling riscv_kexec_norelocate(). The current implementation uses va_pa_offset while switching to physical addressing, however since we moved the kernel outside the linear mapping this won't work anymore since riscv_kexec_norelocate() is part of the kernel mapping and we should use kernel_map.va_kernel_pa_offset, and also take XIP kernel into account. We don't really need to use va_pa_offset on riscv_kexec_norelocate, we can just set STVEC to the physical address of the new kernel instead and let the hart jump to the new kernel on the next instruction after setting SATP to zero. This fixes kdump and is also simpler/cleaner. I tested this on the latest qemu and HiFive Unmatched and works as expected. Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: sifive: fu540-c000: Fix PLIC nodeGeert Uytterhoeven
Fix the device node for the Platform-Level Interrupt Controller (PLIC): - Add missing "#address-cells" property, - Sort properties according to DT bindings. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: sifive: fu540-c000: Drop bogus soc node compatible valuesGeert Uytterhoeven
"make dtbs_check": arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dt.yaml: soc: $nodename:0: '/' was expected From schema: Documentation/devicetree/bindings/riscv/sifive.yaml arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dt.yaml: soc: compatible: 'oneOf' conditional failed, one must be fixed: 'sifive,fu540-c000' is not one of ['sifive,hifive-unleashed-a00'] 'sifive,fu540-c000' is not one of ['sifive,hifive-unmatched-a00'] 'sifive,fu540-c000' was expected 'sifive,fu740-c000' was expected 'sifive,fu540' was expected 'sifive,fu740' was expected From schema: Documentation/devicetree/bindings/riscv/sifive.yaml This happens because the "soc" subnode declares compatibility with "sifive,fu540-c000" and "sifive,fu540", while these are only intended for the root node. Fix this by removing the bogus compatible values from the "soc" node. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: sifive: Group tuples in register propertiesGeert Uytterhoeven
To improve human readability and enable automatic validation, the tuples in "reg" properties containing register blocks should be grouped using angle brackets. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: sifive: Group tuples in interrupt propertiesGeert Uytterhoeven
To improve human readability and enable automatic validation, the tuples in the various properties containing interrupt specifiers should be grouped. Fix this by grouping the tuples of "interrupts" and "interrupts-extended" properties using angle brackets. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: microchip: mpfs: Group tuples in interrupt propertiesGeert Uytterhoeven
To improve human readability and enable automatic validation, the tuples in the various properties containing interrupt specifiers should be grouped. Fix this by grouping the tuples of "interrupts" and "interrupts-extended" properties using angle brackets. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: microchip: mpfs: Fix clock controller nodeGeert Uytterhoeven
Fix the device node for the clock controller: - Remove bogus "reg-names" property, - Remove unneeded "clock-output-names" property. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: microchip: mpfs: Fix reference clock nodeGeert Uytterhoeven
"make dtbs_check" reports: arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: soc: refclk: {'compatible': ['fixed-clock'], '#clock-cells': [[0]], 'clock-frequency': [[600000000]], 'clock-output-names': ['msspllclk'], 'phandle': [[7]]} should not be valid under {'type': 'object'} From schema: dtschema/schemas/simple-bus.yaml Fix this by moving the node out of the "soc" subnode. While at it, rename it to "msspllclk", and drop the now superfluous "clock-output-names" property. Move the actual clock-frequency value to the board DTS, since it is not set until bitstream programming time. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: microchip: mpfs: Fix PLIC nodeGeert Uytterhoeven
Fix the device node for the Platform-Level Interrupt Controller (PLIC): - Add missing "#address-cells" property, - Sort properties according to DT bindings. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: microchip: mpfs: Drop empty chosen nodeGeert Uytterhoeven
It does not make sense to have an (empty) chosen node in an SoC-specific .dtsi, as chosen is meant for system-specific configuration. It is already provided in microchip-mpfs-icicle-kit.dts anyway. Fixes: 0fa6107eca4186ad ("RISC-V: Initial DTS for Microchip ICICLE board") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: canaan: Group tuples in interrupt propertiesGeert Uytterhoeven
To improve human readability and enable automatic validation, the tuples in the various properties containing interrupt specifiers should be grouped. Fix this by grouping the tuples of "interrupts" and "interrupts-extended" properties using angle brackets. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09riscv: dts: canaan: Fix SPI FLASH node namesGeert Uytterhoeven
"make dtbs_check": arch/riscv/boot/dts/canaan/sipeed_maix_bit.dt.yaml: spi-flash@0: $nodename:0: 'spi-flash@0' does not match '^flash(@.*)?$' From schema: Documentation/devicetree/bindings/mtd/jedec,spi-nor.yaml Fix this by renaming all SPI FLASH nodes to "flash". Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09MIPS: compressed: Fix build with ZSTD compressionPaul Cercueil
Fix the following build issues: mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `FSE_buildDTable_internal': decompress.c:(.text.FSE_buildDTable_internal+0x2cc): undefined reference to `__clzdi2' mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `BIT_initDStream': decompress.c:(.text.BIT_initDStream+0x7c): undefined reference to `__clzdi2' mips64el-linux-ld: decompress.c:(.text.BIT_initDStream+0x158): undefined reference to `__clzdi2' mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `ZSTD_buildFSETable_body_default.constprop.0': decompress.c:(.text.ZSTD_buildFSETable_body_default.constprop.0+0x2a8): undefined reference to `__clzdi2' mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `FSE_readNCount_body_default': decompress.c:(.text.FSE_readNCount_body_default+0x130): undefined reference to `__ctzdi2' mips64el-linux-ld: decompress.c:(.text.FSE_readNCount_body_default+0x1a4): undefined reference to `__ctzdi2' mips64el-linux-ld: decompress.c:(.text.FSE_readNCount_body_default+0x2e4): undefined reference to `__clzdi2' mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `HUF_readStats_body_default': decompress.c:(.text.HUF_readStats_body_default+0x184): undefined reference to `__clzdi2' mips64el-linux-ld: decompress.c:(.text.HUF_readStats_body_default+0x1b4): undefined reference to `__clzdi2' mips64el-linux-ld: arch/mips/boot/compressed/decompress.o: in function `ZSTD_DCtx_getParameter': decompress.c:(.text.ZSTD_DCtx_getParameter+0x60): undefined reference to `__clzdi2' Fixes: a510b616131f ("MIPS: Add support for ZSTD-compressed kernels") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: BCM47XX: Add support for Netgear WN2500RP v1 & v2Florian Fainelli
Add support for the Netgear WN2500 RP v1 and v2 Wi-Fi range extenders based on the BCM5357 chipset and supporting 802.11n and 802.11ac. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: BCM47XX: Add support for Netgear R6300 v1Florian Fainelli
Add support for the Netgear R6300 v1 Wi-Fi router using a Broadcom BCM4706 chipset and supporting 802.11n and 802.11ac. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: BCM47XX: Add LEDs and buttons for Asus RTN-10UFlorian Fainelli
Add the definitions for the buttons and LEDs used on the Asus RTN-10U router. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: BCM47XX: Add board entry for Linksys WRT320N v1Florian Fainelli
This router is based on a Broadcom BCM4717A1 chipset and supports 802.11n Wi-Fi. Add a board entry for that router and register LEDs and buttons accordingly. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: BCM47XX: Define Linksys WRT310N V2 buttonsFlorian Fainelli
Update the buttons registration code to register the two buttons (WPS, system rester) using the existing BCM47XX_BOARD_LINKSYS_WRT310NV2 board entry. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-09MIPS: Remove duplicated include in local.hYang Li
Fix following includecheck warning: ./arch/mips/include/asm/local.h: asm/asm.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-01-08x86/kbuild: Enable CONFIG_KALLSYMS_ALL=y in the defconfigsIngo Molnar
Most distro kernels have this option enabled, to improve debug output. Lockdep also selects it. Enable this in the defconfig kernel as well, to make it more representative of what people are using on x86. Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/YdTn7gssoMVDMgMw@gmail.com
2022-01-08ARM: dts: gpio-ranges property is now requiredPhil Elwell
Since [1], added in 5.7, the absence of a gpio-ranges property has prevented GPIOs from being restored to inputs when released. Add those properties for BCM283x and BCM2711 devices. [1] commit 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges") Link: https://lore.kernel.org/r/20220104170247.956760-1-linus.walleij@linaro.org Fixes: 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges") Fixes: 266423e60ea1 ("pinctrl: bcm2835: Change init order for gpio hogs") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Reported-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Phil Elwell <phil@raspberrypi.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20211206092237.4105895-3-phil@raspberrypi.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2022-01-08Merge tag 'soc-fixes-5.16-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A few more fixes have come in, nothing overly severe but would be good to get in by final release: - More specific compatible fields on the qspi controller for socfpga, to enable quirks in the driver - A runtime PM fix for Renesas to fix mismatched reference counts on errors" * tag 'soc-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: dts: socfpga: change qspi to "intel,socfpga-qspi" dt-bindings: spi: cadence-quadspi: document "intel,socfpga-qspi" reset: renesas: Fix Runtime PM usage
2022-01-08ptrace/m68k: Stop open coding ptrace_report_syscallEric W. Biederman
The generic function ptrace_report_syscall does a little more than syscall_trace on m68k. The function ptrace_report_syscall stops early if PT_TRACED is not set, it sets ptrace_message, and returns the result of fatal_signal_pending. Setting ptrace_message to a passed in value of 0 is effectively not setting ptrace_message, making that additional work a noop. Returning the result of fatal_signal_pending and letting the caller ignore the result becomes a noop in this change. When a process is ptraced, the flag PT_PTRACED is always set in current->ptrace. Testing for PT_PTRACED in ptrace_report_syscall is just an optimization to fail early if the process is not ptraced. Later on in ptrace_notify, ptrace_stop will test current->ptrace under tasklist_lock and skip performing any work if the task is not ptraced. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lkml.kernel.org/r/20220103213312.9144-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08exit/xtensa: In arch/xtensa/entry.S:Linvalid_mask call make_task_deadEric W. Biederman
There have historically been two big uses of do_exit. The first is it's design use to be the guts of the exit(2) system call. The second use is to terminate a task after something catastrophic has happened like a NULL pointer in kernel code. The function make_task_dead has been added to accomidate the second use. The call to do_exit in Linvalidmask is clearly not a normal userspace exit. As failure handling there are two possible ways to go. If userspace can trigger the issue force_exit_sig should be called. Otherwise make_task_dead probably from the implementation of die is appropriate. Replace the call of do_exit in Linvalidmask with make_task_dead as I don't know xtensa and especially xtensa assembly language well enough to do anything else. Link: https://lkml.kernel.org/r/YdUmN7n4W5YETUhW@zeniv-ca.linux.org.uk Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08csky: Fix function name in csky_alignment() and die()Nathan Chancellor
When building ARCH=csky defconfig: arch/csky/kernel/traps.c: In function 'die': arch/csky/kernel/traps.c:112:17: error: implicit declaration of function 'make_dead_task' [-Werror=implicit-function-declaration] 112 | make_dead_task(SIGSEGV); | ^~~~~~~~~~~~~~ The function's name is make_task_dead(), change it so there is no more build error. Fixes: 0e25498f8cd4 ("exit: Add and use make_task_dead.") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lkml.kernel.org/r/20211227184851.2297759-4-nathan@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-01-08h8300: Fix build errors from do_exit() to make_task_dead() transitionNathan Chancellor
When building ARCH=h8300 defconfig: arch/h8300/kernel/traps.c: In function 'die': arch/h8300/kernel/traps.c:109:2: error: implicit declaration of function 'make_dead_task' [-Werror=implicit-function-declaration] 109 | make_dead_task(SIGSEGV); | ^~~~~~~~~~~~~~ arch/h8300/mm/fault.c: In function 'do_page_fault': arch/h8300/mm/fault.c:54:2: error: implicit declaration of function 'make_dead_task' [-Werror=implicit-function-declaration] 54 | make_dead_task(SIGKILL); | ^~~~~~~~~~~~~~ The function's name is make_task_dead(), change it so there is no more build error. Additionally, include linux/sched/task.h in arch/h8300/kernel/traps.c to avoid the same error because do_exit()'s declaration is in kernel.h but make_task_dead()'s is in task.h, which is not included in traps.c. Fixes: 0e25498f8cd4 ("exit: Add and use make_task_dead.") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lkml.kernel.org/r/20211227184851.2297759-3-nathan@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-01-08hexagon: Fix function name in die()Nathan Chancellor
When building ARCH=hexagon defconfig: arch/hexagon/kernel/traps.c:217:2: error: implicit declaration of function 'make_dead_task' [-Werror,-Wimplicit-function-declaration] make_dead_task(err); ^ The function's name is make_task_dead(), change it so there is no more build error. Fixes: 0e25498f8cd4 ("exit: Add and use make_task_dead.") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lkml.kernel.org/r/20211227184851.2297759-2-nathan@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-01-08microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV}Masahiro Yamada
Use built-in functions instead of shell commands to avoid forking processes. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08kbuild: do not quote string values in include/config/auto.confMasahiro Yamada
The previous commit fixed up all shell scripts to not include include/config/auto.conf. Now that include/config/auto.conf is only included by Makefiles, we can change it into a more Make-friendly form. Previously, Kconfig output string values enclosed with double-quotes (both in the .config and include/config/auto.conf): CONFIG_X="foo bar" Unlike shell, Make handles double-quotes (and single-quotes as well) verbatim. We must rip them off when used. There are some patterns: [1] $(patsubst "%",%,$(CONFIG_X)) [2] $(CONFIG_X:"%"=%) [3] $(subst ",,$(CONFIG_X)) [4] $(shell echo $(CONFIG_X)) These are not only ugly, but also fragile. [1] and [2] do not work if the value contains spaces, like CONFIG_X=" foo bar " [3] does not work correctly if the value contains double-quotes like CONFIG_X="foo\"bar" [4] seems to work better, but has a cost of forking a process. Anyway, quoted strings were always PITA for our Makefiles. This commit changes Kconfig to stop quoting in include/config/auto.conf. These are the string type symbols referenced in Makefiles or scripts: ACPI_CUSTOM_DSDT_FILE ARC_BUILTIN_DTB_NAME ARC_TUNE_MCPU BUILTIN_DTB_SOURCE CC_IMPLICIT_FALLTHROUGH CC_VERSION_TEXT CFG80211_EXTRA_REGDB_KEYDIR EXTRA_FIRMWARE EXTRA_FIRMWARE_DIR EXTRA_TARGETS H8300_BUILTIN_DTB INITRAMFS_SOURCE LOCALVERSION MODULE_SIG_HASH MODULE_SIG_KEY NDS32_BUILTIN_DTB NIOS2_DTB_SOURCE OPENRISC_BUILTIN_DTB SOC_CANAAN_K210_DTB_SOURCE SYSTEM_BLACKLIST_HASH_LIST SYSTEM_REVOCATION_KEYS SYSTEM_TRUSTED_KEYS TARGET_CPU UNUSED_KSYMS_WHITELIST XILINX_MICROBLAZE0_FAMILY XILINX_MICROBLAZE0_HW_VER XTENSA_VARIANT_NAME I checked them one by one, and fixed up the code where necessary. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-07riscv/mm: Enable THP migrationNanyong Sun
Add two THP helpers required to create PMD migration swap entries, and enable THP migration via ARCH_ENABLE_THP_MIGRATION. This can reduce time of THP migration without splitting and guarantee the migrated pages are still contiguous. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-07riscv/mm: Adjust PAGE_PROT_NONE to comply with THP semanticsNanyong Sun
This is a preparation for enabling THP migration. As the commit b65399f6111b("arm64/mm: Change THP helpers to comply with generic MM semantics") mentioned, pmd_present() and pmd_trans_huge() are expected to behave in the following manner: ------------------------------------------------------------------------- | PMD states | pmd_present | pmd_trans_huge | ------------------------------------------------------------------------- | Mapped | Yes | Yes | ------------------------------------------------------------------------- | Splitting | Yes | Yes | ------------------------------------------------------------------------- | Migration/Swap | No | No | ------------------------------------------------------------------------- At present the PROT_NONE bit reuses the READ bit could not comply with above semantics with two problems: 1. When splitting a PMD THP, PMD is first invalidated with pmdp_invalidate()->pmd_mkinvalid(), which clears the PRESENT bit and PROT_NONE bit/READ bit, if the PMD is read-only, then the PAGE_LEAF property is also cleared, which results in pmd_present() return false. 2. When migrating, the swap entry only clear the PRESENT bit and PROT_NONE bit/READ bit, the W/X bit may be set, so _PAGE_LEAF may be true which results in pmd_present() return true. Solution: Adjust PROT_NONE bit from READ to GLOBAL bit can satisfy the above rules: 1. GLOBAL bit has no other meanings, not like the R/W/X bit, which is also relative with _PAGE_LEAF property. 2. GLOBAL bit is at bit 5, making swap entry start from bit 6, bit 0-5 are zero, which means the PRESENT, PROT_NONE, and PAGE_LEAF are all false, then the pmd_present() and pmd_trans_huge() return false when in migration/swap. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-07kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUIDJing Liu
KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in CPUID[0xD] if they have not been requested with prctl. Otherwise a process which directly passes KVM_GET_SUPPORTED_CPUID to KVM_SET_CPUID2 would now fail even if it doesn't intend to use a dynamically enabled feature. Userspace must know that prctl is required and allocate >4K xstate buffer before setting any dynamic bit. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-5-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07kvm: x86: Fix xstate_required_size() to follow XSTATE alignment ruleJing Liu
CPUID.0xD.1.EBX enumerates the size of the XSAVE area (in compacted format) required by XSAVES. If CPUID.0xD.i.ECX[1] is set for a state component (i), this state component should be located on the next 64-bytes boundary following the preceding state component in the compacted layout. Fix xstate_required_size() to follow the alignment rule. AMX is the first state component with 64-bytes alignment to catch this bug. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/fpu: Prepare guest FPU for dynamically enabled FPU featuresThomas Gleixner
To support dynamically enabled FPU features for guests prepare the guest pseudo FPU container to keep track of the currently enabled xfeatures and the guest permissions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-3-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/fpu: Extend fpu_xstate_prctl() with guest permissionsThomas Gleixner
KVM requires a clear separation of host user space and guest permissions for dynamic XSTATE components. Add a guest permissions member to struct fpu and a separate set of prctl() arguments: ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM. The semantics are equivalent to the host user space permission control except for the following constraints: 1) Permissions have to be requested before the first vCPU is created 2) Permissions are frozen when the first vCPU is created to ensure consistency. Any attempt to expand permissions via the prctl() after that point is rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07riscv/head: fix misspelling of guaranteedhasheddan
Fixes misspelling of guaranteed in comment describing why fetching fence is guaranteed to work when switching to kernel page tables. Signed-off-by: hasheddan <georgedanielmangum@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "Two small fixes for x86: - lockdep WARN due to missing lock nesting annotation - NULL pointer dereference when accessing debugfs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Check for rmaps allocation KVM: SEV: Mark nested locking of kvm->lock
2022-01-07KVM: x86: Check for rmaps allocationNikunj A Dadhania
With TDP MMU being the default now, access to mmu_rmaps_stat debugfs file causes following oops: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 3185 Comm: cat Not tainted 5.16.0-rc4+ #204 RIP: 0010:pte_list_count+0x6/0x40 Call Trace: <TASK> ? kvm_mmu_rmaps_stat_show+0x15e/0x320 seq_read_iter+0x126/0x4b0 ? aa_file_perm+0x124/0x490 seq_read+0xf5/0x140 full_proxy_read+0x5c/0x80 vfs_read+0x9f/0x1a0 ksys_read+0x67/0xe0 __x64_sys_read+0x19/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fca6fc13912 Return early when rmaps are not present. Reported-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220105040337.4234-1-nikunj@amd.com> Cc: stable@vger.kernel.org Fixes: 3bcd0662d66f ("KVM: X86: Introduce mmu_rmaps_stat per-vm debugfs file") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: SEV: Mark nested locking of kvm->lockWanpeng Li
Both source and dest vms' kvm->locks are held in sev_lock_two_vms. Mark one with a different subtype to avoid false positives from lockdep. Fixes: c9d61dcb0bc26 (KVM: SEV: accept signals in sev_lock_two_vms) Reported-by: Yiru Xu <xyru1999@gmail.com> Tested-by: Jinrong Liang <cloudliang@tencent.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1641364863-26331-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/sgx: Fix NULL pointer dereference on non-SGX systemsDave Hansen
== Problem == Nathan Chancellor reported an oops when aceessing the 'sgx_total_bytes' sysfs file: https://lore.kernel.org/all/YbzhBrimHGGpddDM@archlinux-ax161/ The sysfs output code accesses the sgx_numa_nodes[] array unconditionally. However, this array is allocated during SGX initialization, which only occurs on systems where SGX is supported. If the sysfs file is accessed on systems without SGX support, sgx_numa_nodes[] is NULL and an oops occurs. == Solution == To fix this, hide the entire nodeX/x86/ attribute group on systems without SGX support using the ->is_visible attribute group callback. Unfortunately, SGX is initialized via a device_initcall() which occurs _after_ the ->is_visible() callback. Instead of moving SGX initialization earlier, call sysfs_update_group() during SGX initialization to update the group visiblility. This update requires moving the SGX sysfs code earlier in sgx/main.c. There are no code changes other than the addition of arch_update_sysfs_visibility() and a minor whitespace fixup to arch_node_attr_is_visible() which checkpatch caught. CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-sgx@vger.kernel.org Cc: x86@kernel.org Fixes: 50468e431335 ("x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20220104171527.5E8416A8@davehans-spike.ostc.intel.com
2022-01-07KVM: SVM: include CR3 in initial VMSA state for SEV-ES guestsMichael Roth
Normally guests will set up CR3 themselves, but some guests, such as kselftests, and potentially CONFIG_PVH guests, rely on being booted with paging enabled and CR3 initialized to a pre-allocated page table. Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this is too late, since it will have switched over to using the VMSA page prior to that point, with the VMSA CR3 copied from the VMCB initial CR3 value: 0. Address this by sync'ing the CR3 value into the VMCB save area immediately when KVM_SET_SREGS* is issued so it will find it's way into the initial VMSA. Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-Id: <20211216171358.61140-10-michael.roth@amd.com> [Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the new hook. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: VMX: Provide vmread version using asm-goto-with-outputsPeter Zijlstra
Use asm-goto-output for smaller fast path code. Message-Id: <YbcbbGW2GcMx6KpD@hirez.programming.kicks-ass.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86: Fix wall clock writes in Xen shared_info not to mark page dirtyDavid Woodhouse
When dirty ring logging is enabled, any dirty logging without an active vCPU context will cause a kernel oops. But we've already declared that the shared_info page doesn't get dirty tracking anyway, since it would be kind of insane to mark it dirty every time we deliver an event channel interrupt. Userspace is supposed to just assume it's always dirty any time a vCPU can run or event channels are routed. So stop using the generic kvm_write_wall_clock() and just write directly through the gfn_to_pfn_cache that we already have set up. We can make kvm_write_wall_clock() static in x86.c again now, but let's not remove the 'sec_hi_ofs' argument even though it's not used yet. At some point we *will* want to use that for KVM guests too. Fixes: 629b5348841a ("KVM: x86/xen: update wallclock region") Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-6-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel deliveryDavid Woodhouse
This adds basic support for delivering 2 level event channels to a guest. Initially, it only supports delivery via the IRQ routing table, triggered by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast() function which will use the pre-mapped shared_info page if it already exists and is still valid, while the slow path through the irqfd_inject workqueue will remap the shared_info page if necessary. It sets the bits in the shared_info page but not the vcpu_info; that is deferred to __kvm_xen_has_interrupt() which raises the vector to the appropriate vCPU. Add a 'verbose' mode to xen_shinfo_test while adding test cases for this. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-5-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Maintain valid mapping of Xen shared_info pageDavid Woodhouse
Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping of the Xen shared_info page so that it can be accessed in atomic context. Note that we do not participate in dirty tracking for the shared info page and we do not explicitly mark it dirty every single tim we deliver an event channel interrupts. We wouldn't want to do that even if we *did* have a valid vCPU context with which to do so. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-4-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: Reinstate gfn_to_pfn_cache with invalidation supportDavid Woodhouse
This can be used in two modes. There is an atomic mode where the cached mapping is accessed while holding the rwlock, and a mode where the physical address is used by a vCPU in guest mode. For the latter case, an invalidation will wake the vCPU with the new KVM_REQ_GPC_INVALIDATE, and the architecture will need to refresh any caches it still needs to access before entering guest mode again. Only one vCPU can be targeted by the wake requests; it's simple enough to make it wake all vCPUs or even a mask but I don't see a use case for that additional complexity right now. Invalidation happens from the invalidate_range_start MMU notifier, which needs to be able to sleep in order to wake the vCPU and wait for it. This means that revalidation potentially needs to "wait" for the MMU operation to complete and the invalidate_range_end notifier to be invoked. Like the vCPU when it takes a page fault in that period, we just spin — fixing that in a future patch by implementing an actual *wait* may be another part of shaving this particularly hirsute yak. As noted in the comments in the function itself, the only case where the invalidate_range_start notifier is expected to be called *without* being able to sleep is when the OOM reaper is killing the process. In that case, we expect the vCPU threads already to have exited, and thus there will be nothing to wake, and no reason to wait. So we clear the KVM_REQUEST_WAIT bit and send the request anyway, then complain loudly if there actually *was* anything to wake up. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/kvm: Silence per-cpu pr_info noise about KVM clocks and steal timeDavid Woodhouse
I made the actual CPU bringup go nice and fast... and then Linux spends half a minute printing stupid nonsense about clocks and steal time for each of 256 vCPUs. Don't do that. Nobody cares. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211209150938.3518-12-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86: Update vPMCs when retiring branch instructionsEric Hankland
When KVM retires a guest branch instruction through emulation, increment any vPMCs that are configured to monitor "branch instructions retired," and update the sample period of those counters so that they will overflow at the right time. Signed-off-by: Eric Hankland <ehankland@google.com> [jmattson: - Split the code to increment "branch instructions retired" into a separate commit. - Moved/consolidated the calls to kvm_pmu_trigger_event() in the emulation of VMLAUNCH/VMRESUME to accommodate the evolution of that code. ] Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests") Signed-off-by: Jim Mattson <jmattson@google.com> Message-Id: <20211130074221.93635-7-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>