summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-03-12cxl/region: Add memory hotplug notifier for cxl regionDave Jiang
When the CXL region is formed, the driver computes the performance data for the region. However this data is not available at the node data collection that has been populated by the HMAT during kernel initialization. Add a memory hotplug notifier to update the access coordinates to the 'struct memory_target' context kept by the HMAT_REPORTING code. Add CXL_CALLBACK_PRI for a memory hotplug callback priority. Set the priority number to be called before HMAT_CALLBACK_PRI. The CXL update must happen before hmat_callback(). A new HMAT_REPORTING helper hmat_update_target_coordinates() is added in order to allow CXL to update the memory_target access coordinates. A new ext_updated member is added to the memory_target to indicate that the access coordinates within the memory_target has been updated by an external agent such as CXL. This prevents data being overwritten by the hmat_update_target_attrs() triggered by hmat_callback(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Huang, Ying <ying.huang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-12-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/region: Add sysfs attribute for locality attributes of CXL regionsDave Jiang
Add read/write latencies and bandwidth sysfs attributes for the enabled CXL region. The bandwidth is the aggregated bandwidth of all devices that contribute to the CXL region. The latency is the worst latency of the device amongst all the devices that contribute to the CXL region. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-11-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/region: Calculate performance data for a regionDave Jiang
Calculate and store the performance data for a CXL region. Find the worst read and write latency for all the included ranges from each of the devices that attributes to the region and designate that as the latency data. Sum all the read and write bandwidth data for each of the device region and that is the total bandwidth for the region. The perf list is expected to be constructed before the endpoint decoders are registered and thus there should be no early reading of the entries from the region assemble action. The calling of the region qos calculate function is under the protection of cxl_dpa_rwsem and will ensure that all DPA associated work has completed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-10-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl: Set cxlmd->endpoint before adding port deviceDave Jiang
Move setting of cxlmd->endpoint to before calling add_device() on the port device. Otherwise when referencing cxlmd->endpoint in region discovery code that is triggered by the port driver probe function, the endpoint port pointer is not valid. Current code does not hit this issue yet since cxlmd->endpoint is not being referenced during region discovery. However follow on code that does performance calculations will. Tested-by: Wonjae Lee <wj28.lee@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-9-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl: Move QoS class to be calculated from the nearest CPUDave Jiang
Retrieve the qos_class (QTG ID) using the access coordinates from the nearest CPU rather than the nearst initiator that may not be a CPU. This may be the more appropriate number that applications care about. For most cases, access0 and access1 have the same values. Link: https://lore.kernel.org/linux-cxl/20240112113023.00006c50@Huawei.com/ Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-8-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl: Split out host bridge access coordinatesDave Jiang
The difference between access class 0 and access class 1 for 'struct access_coordinate', if any, is that class 0 is for the distance from the target to the closest initiator and that class 1 is for the distance from the target to the closest CPU. For CXL memory, the nearest initiator may not necessarily be a CPU node. The performance path from the CXL endpoint to the host bridge should remain the same. However, the numbers extracted and stored from HMAT is the difference for the two access classes. Split out the performance numbers for the host bridge (generic target) from the calculation of the entire path in order to allow calculation of both access classes for a CXL region. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-7-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl: Split out combine_coordinates() for common shared usageDave Jiang
Refactor the common code of combining coordinates in order to reduce code. Create a new function cxl_cooordinates_combine() it combine two 'struct access_coordinate'. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-6-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access ↵Dave Jiang
classes Update acpi_get_genport_coordinates() to allow retrieval of both access classes of the 'struct access_coordinate' for a generic target. The update will allow CXL code to compute access coordinates for both access class. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-5-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12ACPI: HMAT: Introduce 2 levels of generic port access classDave Jiang
In order to compute access0 and access1 classes for CXL memory, 2 levels of generic port information must be stored. Access0 will indicate the generic port access coordinates to the closest initiator and access1 will indicate the generic port access coordinates to the cloest CPU. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-4-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12base/node / ACPI: Enumerate node access class for 'struct access_coordinate'Dave Jiang
Both generic node and HMAT handling code have been using magic numbers to indicate access classes for 'struct access_coordinate'. Introduce enums to enumerate the access0 and access1 classes shared by the two subsystems. Update the function parameters and callers as appropriate to utilize the new enum. Access0 is named to ACCESS_COORDINATE_LOCAL in order to indicate that the access class is for 'struct access_coordinate' between a target node and the nearest initiator node. Access1 is named to ACCESS_COORDINATE_CPU in order to indicate that the access class is for 'struct access_coordinate' between a target node and the nearest CPU node. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-3-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12ACPI: HMAT: Remove register of memory node for generic targetDave Jiang
For generic targets, there's no reason to call register_memory_node_under_compute_node() with the access levels that are only visible to HMAT handling code. Only update the attributes and rename hmat_register_generic_target_initiators() to hmat_update_generic_target(). The original call path ends up triggering register_memory_node_under_compute_node(). Although the access level would be "3" and not impact any current node arrays, it introduces unwanted data into the numa node access_coordinate array. Fixes: a3a3e341f169 ("acpi: numa: Add setting of generic port system locality attributes") Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-2-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12Merge tag 'soc-drivers-6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "This is the usual mix of updates for drivers that are used on (mostly ARM) SoCs with no other top-level subsystem tree, including: - The SCMI firmware subsystem gains support for version 3.2 of the specification and updates to the notification code - Feature updates for Tegra and Qualcomm platforms for added hardware support - A number of platforms get soc_device additions for identifying newly added chips from Renesas, Qualcomm, Mediatek and Google - Trivial improvements for firmware and memory drivers amongst others, in particular 'const' annotations throughout multiple subsystems" * tag 'soc-drivers-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (96 commits) tee: make tee_bus_type const soc: qcom: aoss: add missing kerneldoc for qmp members soc: qcom: geni-se: drop unused kerneldoc struct geni_wrapper param soc: qcom: spm: fix building with CONFIG_REGULATOR=n bus: ti-sysc: constify the struct device_type usage memory: stm32-fmc2-ebi: keep power domain on memory: stm32-fmc2-ebi: add MP25 RIF support memory: stm32-fmc2-ebi: add MP25 support memory: stm32-fmc2-ebi: check regmap_read return value dt-bindings: memory-controller: st,stm32: add MP25 support dt-bindings: bus: imx-weim: convert to YAML watchdog: s3c2410_wdt: use exynos_get_pmu_regmap_by_phandle() for PMU regs soc: samsung: exynos-pmu: Add regmap support for SoCs that protect PMU regs MAINTAINERS: Update SCMI entry with HWMON driver MAINTAINERS: samsung: gs101: match patches touching Google Tensor SoC memory: tegra: Fix indentation memory: tegra: Add BPMP and ICC info for DLA clients memory: tegra: Correct DLA client names dt-bindings: memory: renesas,rpc-if: Document R-Car V4M support firmware: arm_scmi: Update the supported clock protocol version ...
2024-03-12Merge tag 'soc-dt-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull SoC device tree updates from Arnd Bergmann: "There is very little going on with new SoC support this time, all the new chips are variations of others that we already support, and they are all based on ARMv8 cores: - Mediatek MT7981B (Filogic 820) and MT7988A (Filogic 880) are networking SoCs designed to be used in wireless routers, similar to the already supported MT7986A (Filogic 830). - NXP i.MX8DXP is a variant of i.MX8QXP, with two CPU cores less. These are used in many embedded and industrial applications. - Renesas R8A779G2 (R-Car V4H ES2.0) and R8A779H0 (R-Car V4M) are automotive SoCs. - TI J722S is another automotive variant of its K3 family, related to the AM62 series. There are a total of 7 new arm32 machines and 45 arm64 ones, including - Two Android phones based on the old Tegra30 chip - Two machines using Cortex-A53 SoCs from Allwinner, a mini PC and a SoM development board - A set-top box using Amlogic Meson G12A S905X2 - Eight embedded board using NXP i.MX6/8/9 - Three machines using Mediatek network router chips - Ten Chromebooks, all based on Mediatek MT8186 - One development board based on Mediatek MT8395 (Genio 1200) - Seven tablets and phones based on Qualcomm SoCs, most of them from Samsung. - A third development board for Qualcomm SM8550 (Snapdragon 8 Gen 2) - Three variants of the "White Hawk" board for Renesas automotive SoCs - Ten Rockchips RK35xx based machines, including NAS, Tablet, Game console and industrial form factors. - Three evaluation boards for TI K3 based SoCs The other changes are mainly the usual feature additions for existing hardware, cleanups, and dtc compile time fixes. One notable change is the inclusion of PowerVR SGX GPU nodes on TI SoCs" * tag 'soc-dt-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (824 commits) riscv: dts: Move BUILTIN_DTB_SOURCE to common Kconfig riscv: dts: starfive: jh7100: fix root clock names ARM: dts: samsung: exynos4412: decrease memory to account for unusable region arm64: dts: qcom: sm8250-xiaomi-elish: set rotation arm64: dts: qcom: sm8650: Fix SPMI channels size arm64: dts: qcom: sm8550: Fix SPMI channels size arm64: dts: rockchip: Fix name for UART pin header on qnap-ts433 arm: dts: marvell: clearfog-gtr-l8: align port numbers with enclosure arm: dts: marvell: clearfog-gtr-l8: add support for second sfp connector dt-bindings: soc: renesas: renesas-soc: Add pattern for gray-hawk dtc: Enable dtc interrupt_provider check arm64: dts: st: add video encoder support to stm32mp255 arm64: dts: st: add video decoder support to stm32mp255 ARM: dts: stm32: enable crypto accelerator on stm32mp135f-dk ARM: dts: stm32: enable CRC on stm32mp135f-dk ARM: dts: stm32: add CRC on stm32mp131 ARM: dts: add stm32f769-disco-mb1166-reva09 ARM: dts: stm32: add display support on stm32f769-disco ARM: dts: stm32: rename mmc_vcard to vcc-3v3 on stm32f769-disco ARM: dts: stm32: add DSI support on stm32f769 ...
2024-03-12Merge tag 'm68k-for-v6.9-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Make the Zorro bus type constant - defconfig updates * tag 'm68k-for-v6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: defconfig: Update defconfigs for v6.8-rc1 zorro: Make zorro_bus_type const
2024-03-12Merge branch 'pci/controller/qcom'Bjorn Helgaas
- Split dt-binding qcom,pcie.yaml into qcom,pcie-common.yaml and separate files for SA8775p, SC7280, SC8180X, SC8280XP, SM8150, SM8250, SM8350, SM8450, SM8550 for easier reviewing (Krzysztof Kozlowski) - Allow 'required-opps' DT property for SoCs that require a minimum performance level for the power domain (Johan Hovold) - Remove requirement for 'msi-map-mask' DT property since it depends on how MSIs are mapped (Johan Hovold) - Disable ASPM L0s for sc8280xp, sa8540p and sa8295p because their PHY configuration isn't tuned for L0s, which results in many Correctable Errors (Johan Hovold) - Enable BDF to SID translation by disabling bypass mode (Manivannan Sadhasivam) - Add DT binding and driver support for X1E80100 (Abel Vesa) * pci/controller/qcom: PCI: qcom: Add X1E80100 PCIe support dt-bindings: PCI: qcom: Document the X1E80100 PCIe Controller PCI: qcom: Enable BDF to SID translation properly PCI: qcom: Disable ASPM L0s for sc8280xp, sa8540p and sa8295p dt-bindings: PCI: qcom: Do not require 'msi-map-mask' dt-bindings: PCI: qcom: Allow 'required-opps' dt-bindings: PCI: qcom,pcie-sa8775p: Move SA8775p to dedicated schema dt-bindings: PCI: qcom,pcie-sc7280: Move SC7280 to dedicated schema dt-bindings: PCI: qcom,pcie-sc8180x: Move SC8180X to dedicated schema dt-bindings: PCI: qcom,pcie-sc8280xp: Move SC8280XP to dedicated schema dt-bindings: PCI: qcom,pcie-sm8350: Move SM8350 to dedicated schema dt-bindings: PCI: qcom,pcie-sm8150: Move SM8150 to dedicated schema dt-bindings: PCI: qcom,pcie-sm8250: Move SM8250 to dedicated schema dt-bindings: PCI: qcom,pcie-sm8450: Move SM8450 to dedicated schema dt-bindings: PCI: qcom,pcie-sm8550: Move SM8550 to dedicated schema
2024-03-12Merge branch 'pci/controller/imx'Bjorn Helgaas
- Replace variant switches with drvdata clock descriptions and clk_bulk API (Frank Li) - Replace variant switches with drvdata PHY flag for devm_phy_get() (Frank Li) - Replace variant switches with drvdata HAS_RESET flags for handling resets (Frank Li) - Replace variant switches with drvdata for LTSSM control bits (Frank Li) - Replace variant switches with drvdata for controller Root Complex vs Endpoint modes (Frank Li) - Replace variant switches with drvdata .init_phy() callback pointers (Frank Li) - Drop dt-binding redundant duplicate clock check (Frank Li) - reg/reg-name (Frank Li) - Drop addr_space retrieval code since dw_pcie_ep_init() already does it (Frank Li) - Add epc_features to drvdata (Frank Li) - Add iMX95 Root Complex and Endpoint support and dt-binding compatible strings (Frank Li) * pci/controller/imx: PCI: imx6: Add iMX95 Endpoint (EP) support dt-bindings: imx6q-pcie: Add iMX95 pcie endpoint compatible string PCI: imx6: Add epc_features in imx6_pcie_drvdata PCI: imx6: Clean up addr_space retrieval code PCI: imx6: Add iMX95 PCIe Root Complex support dt-bindings: imx6q-pcie: Add imx95 pcie compatible string dt-bindings: imx6q-pcie: Restruct reg and reg-name dt-bindings: imx6q-pcie: Clean up duplicate clocks check PCI: imx6: Simplify switch-case logic by introducing init_phy() callback PCI: imx6: Simplify configure_type() by using mode_off and mode_mask PCI: imx6: Simplify ltssm_enable() by using ltssm_off and ltssm_mask PCI: imx6: Simplify reset handling by using *_FLAG_HAS_*_RESET PCI: imx6: Simplify PHY handling by using IMX6_PCIE_FLAG_HAS_PHYDRV PCI: imx6: Simplify clock handling by using clk_bulk*() function
2024-03-12Merge branch 'pci/controller/hyperv'Bjorn Helgaas
- Fix ring buffer size at 16KB (not 4 pages), which reduces memory usage by 128KBytes with 64KB pages (Michael Kelley) * pci/controller/hyperv: PCI: hv: Fix ring buffer size calculation
2024-03-12Merge branch 'pci/controller/dwc'Bjorn Helgaas
- Fall back to allocating 64-bit MSI DMA address if unable to allocate a 32-bit address (Ajay Agarwal) * pci/controller/dwc: PCI: dwc: endpoint: Fix advertised resizable BAR size PCI: dwc: Strengthen the MSI address allocation logic
2024-03-12Merge branch 'pci/controller/cadence'Bjorn Helgaas
- Clear the ARI Capability Next Function Number of the last function (Jasko-EXT Wojciech) * pci/controller/cadence: PCI: cadence: Clear the ARI Capability Next Function Number of the last function
2024-03-12Merge branch 'pci/controller/broadcom'Bjorn Helgaas
- Fix polling for MDIO write completion, which previously used the wrong access width so it always indicated "completed" (Jonathan Bell) * pci/controller/broadcom: PCI: brcmstb: Fix broken brcm_pcie_mdio_write() polling
2024-03-12Merge branch 'pci/misc'Bjorn Helgaas
- Make pcie_port_bus_type const (Ricardo B. Marliere) * pci/misc: PCI: Make pcie_port_bus_type const
2024-03-12Merge branch 'pci/endpoint'Bjorn Helgaas
- Make pci_epf_bus_type const (Ricardo B. Marliere) - Update pci_epf_alloc_space() interface and move bar_fixed_size[] testing from pci_epf_test_alloc_space() and pci_epf_configure_bar() into it (Niklas Cassel) - Drop redundant size & alignment checking from epf_ntb_db_bar_init() since pci_epf_alloc_space() already does it (Niklas Cassel) - Fix ntb_register_device() name leak in error path (Yang Yingliang) - Return actual error code for pci_vntb_probe() failure (Yang Yingliang) - Prefix sysfs function names with "pci_epf_mhi_", e.g., "/sys/kernel/config/functions/pci_epf_mhi_sdx55", to leave room for other endpoint functions (Manivannan Sadhasivam) - Add EPF MHI support for SA8775P SoC (Mrinmay Sarkar) - Consolidate endpoint BAR hardware description in new struct pci_epc_bar_desc (Niklas Cassel) - Drop only_64bit on reserved BARs (Niklas Cassel) * pci/endpoint: PCI: endpoint: Drop only_64bit on reserved BARs PCI: endpoint: Clean up hardware description for BARs PCI: epf-mhi: Add support for SA8775P SoC PCI: epf-mhi: Add "pci_epf_mhi_" prefix to the function names PCI: epf-vntb: Return actual error code during pci_vntb_probe() failure NTB: fix possible name leak in ntb_register_device() PCI: endpoint: pci-epf-vntb: Remove superfluous checks for pci_epf_alloc_space() API PCI: endpoint: pci-epf-test: Remove superfluous checks for pci_epf_alloc_space() API PCI: endpoint: Improve pci_epf_alloc_space() API PCI: endpoint: Refactor pci_epf_alloc_space() API PCI: endpoint: Make pci_epf_bus_type const
2024-03-12Merge branch 'pci/virtualization'Bjorn Helgaas
- Avoid Secondary Bus Reset on the LSI / Agere FW643, which allows it to be assigned to VMs with VFIO, at the cost of leaking FW643 state between VMs (Edmund Raile) * pci/virtualization: PCI: Mark LSI FW643 to avoid bus reset
2024-03-12Merge branch 'pci/sysfs'Bjorn Helgaas
- Compile pci-sysfs.c only if CONFIG_SYSFS=y, which reduces kernel size by ~120KB when it's disabled (Lukas Wunner) - Remove obsolete pci_cleanup_rom() declaration (Lukas Wunner) - Rework pci_dev_resource_resize_attr(n) macros to call a function instead of duplicating most of the body, which saves about 2.5KB of text (Ilpo Järvinen) * pci/sysfs: PCI/sysfs: Demacrofy pci_dev_resource_resize_attr(n) functions PCI: Remove obsolete pci_cleanup_rom() declaration PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y # Conflicts: # drivers/pci/Makefile
2024-03-12Merge tag 's390-6.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Various virtual vs physical address usage fixes - Fix error handling in Processor Activity Instrumentation device driver, and export number of counters with a sysfs file - Allow for multiple events when Processor Activity Instrumentation counters are monitored in system wide sampling - Change multiplier and shift values of the Time-of-Day clock source to improve steering precision - Remove a couple of unneeded GFP_DMA flags from allocations - Disable mmap alignment if randomize_va_space is also disabled, to avoid a too small heap - Various changes to allow s390 to be compiled with LLVM=1, since ld.lld and llvm-objcopy will have proper s390 support witch clang 19 - Add __uninitialized macro to Compiler Attributes. This is helpful with s390's FPU code where some users have up to 520 byte stack frames. Clearing such stack frames (if INIT_STACK_ALL_PATTERN or INIT_STACK_ALL_ZERO is enabled) before they are used contradicts the intention (performance improvement) of such code sections. - Convert switch_to() to an out-of-line function, and use the generic switch_to header file - Replace the usage of s390's debug feature with pr_debug() calls within the zcrypt device driver - Improve hotplug support of the Adjunct Processor device driver - Improve retry handling in the zcrypt device driver - Various changes to the in-kernel FPU code: - Make in-kernel FPU sections preemptible - Convert various larger inline assemblies and assembler files to C, mainly by using singe instruction inline assemblies. This increases readability, but also allows makes it easier to add proper instrumentation hooks - Cleanup of the header files - Provide fast variants of csum_partial() and csum_partial_copy_nocheck() based on vector instructions - Introduce and use a lock to synchronize accesses to zpci device data structures to avoid inconsistent states caused by concurrent accesses - Compile the kernel without -fPIE. This addresses the following problems if the kernel is compiled with -fPIE: - It uses dynamic symbols (.dynsym), for which the linker refuses to allow more than 64k sections. This can break features which use '-ffunction-sections' and '-fdata-sections', including kpatch-build and function granular KASLR - It unnecessarily uses GOT relocations, adding an extra layer of indirection for many memory accesses - Fix shared_cpu_list for CPU private L2 caches, which incorrectly were reported as globally shared * tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (117 commits) s390/tools: handle rela R_390_GOTPCDBL/R_390_GOTOFF64 s390/cache: prevent rebuild of shared_cpu_list s390/crypto: remove retry loop with sleep from PAES pkey invocation s390/pkey: improve pkey retry behavior s390/zcrypt: improve zcrypt retry behavior s390/zcrypt: introduce retries on in-kernel send CPRB functions s390/ap: introduce mutex to lock the AP bus scan s390/ap: rework ap_scan_bus() to return true on config change s390/ap: clarify AP scan bus related functions and variables s390/ap: rearm APQNs bindings complete completion s390/configs: increase number of LOCKDEP_BITS s390/vfio-ap: handle hardware checkstop state on queue reset operation s390/pai: change sampling event assignment for PMU device driver s390/boot: fix minor comment style damages s390/boot: do not check for zero-termination relocation entry s390/boot: make type of __vmlinux_relocs_64_start|end consistent s390/boot: sanitize kaslr_adjust_relocs() function prototype s390/boot: simplify GOT handling s390: vmlinux.lds.S: fix .got.plt assertion s390/boot: workaround current 'llvm-objdump -t -j ...' behavior ...
2024-03-12Merge branch 'pci/switchtec'Bjorn Helgaas
- Fix error handling path in switchtec_pci_probe() (Christophe JAILLET) * pci/switchtec: PCI: switchtec: Fix an error handling path in switchtec_pci_probe()
2024-03-12Merge branch 'pci/pm'Bjorn Helgaas
- Disable use of D3cold on Asus B1400 PCI-NVMe bridges because some BIOSes can't power them back on, replacing a more general ACPI sleep quirk (Daniel Drake) - Allow runtime PM when the driver enables it but doesn't need any runtime PM callbacks (Raag Jadav) - Drain runtime-idle callbacks before driver removal to avoid races between .remove() and .runtime_idle(), which caused intermittent page faults when the rtsx .runtime_idle() accessed registers that its .remove() had already unmapped (Rafael J. Wysocki) * pci/pm: PCI/PM: Drain runtime-idle callbacks before driver removal PCI/PM: Allow runtime PM with no PM callbacks at all Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default" PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge
2024-03-12Merge branch 'pci/p2pdma'Bjorn Helgaas
- Fix a sleeping issue in a RCU read section (Christophe JAILLET) * pci/p2pdma: PCI/P2PDMA: Fix a sleeping issue in a RCU read section
2024-03-12Merge branch 'pci/enumeration'Bjorn Helgaas
- Collect interrupt-related code in irq.c (Ilpo Järvinen) - Mark 3ware-9650SE Root Port Extended Tags as broken (Jörg Wedekind) * pci/enumeration: PCI: Mark 3ware-9650SE Root Port Extended Tags as broken PCI: Place interrupt related code into irq.c # Conflicts: # drivers/pci/Makefile
2024-03-12Merge branch 'pci/dpc'Bjorn Helgaas
- After a DPC event, print all logged TLP Prefixes instead of printing the first prefix several times (Ilpo Järvinen) - Ignore the expected Surprise Down error that may cause a DPC event when hot-removing a device (Smita Koralahalli) - Add an RP PIO log size quirk for Intel Raptor Lake Root Ports, which still don't advertise the correct log size, which prevented logging of RP PIO Log registers when DPC is triggered (Paul Menzel) * pci/dpc: PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports PCI/DPC: Ignore Surprise Down error on hot removal PCI/DPC: Print all TLP Prefixes, not just the first
2024-03-12Merge branch 'pci/devres'Bjorn Helgaas
- Unmap MMIO mappings in pci_iounmap() to avoid a leak when ARCH_HAS_GENERIC_IOPORT_MAP is defined (Philipp Stanner) - Move pci_iomap.c to drivers/pci/ since it's all PCI-related (Philipp Stanner) - Move other PCI-related devres code from lib/devres.c to drivers/pci/ (Philipp Stanner) - Move other devres code from pci.c to devres.c (Philipp Stanner) * pci/devres: PCI: Move devres code from pci.c to devres.c PCI: Move PCI-specific devres code to drivers/pci/ PCI: Move pci_iomap.c to drivers/pci/ pci_iounmap(): Fix MMIO mapping leak
2024-03-12Merge branch 'pci/aspm'Bjorn Helgaas
- Collect ASPM-related code into aspm.c (David E. Box) - Save and restore ASPM L1 PM Substates configuration so these states continue working after suspend/resume (David E. Box) - Move the ASPM L1.2-related LTR save/restore next to the ASPM save/restore (David E. Box) - Move the required L1 disable before L1 Substate configuration into pci_restore_aspm_l1ss_state() (Bjorn Helgaas) - Update save_save when ASPM config is changed, so a .slot_reset() during error recovery restores the changed config, not the .probe()-time config (Vidya Sagar) * pci/aspm: PCI/ASPM: Update save_state when configuration changes PCI/ASPM: Disable L1 before configuring L1 Substates PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state() PCI/ASPM: Save L1 PM Substates Capability for suspend/resume PCI/ASPM: Move pci_save_ltr_state() to aspm.c PCI/ASPM: Always build aspm.c PCI/ASPM: Move pci_configure_ltr() to aspm.c
2024-03-12PCI/ASPM: Update save_state when configuration changesVidya Sagar
Many PCIe device drivers save the configuration state of their device during probe and restore it when their .slot_reset() hook is called during PCIe error recovery. If the ASPM configuration is changed after the driver's probe is called and before an error event occurs, .slot_reset() restores the ASPM configuration to what it was at the time of probe, not to what it was just before the occurrence of the error event. This leads to a mismatch in ASPM configuration between the device and its upstream device. Update the saved configuration of the device when the ASPM configuration changes. Link: https://lore.kernel.org/r/20240222174436.3565146-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [bhelgaas: commit log, rebase to pci/aspm, rename to pci_update_aspm_saved_state() since it updates only LNKCTL, update only ASPMC and CLKREQ_EN in LNKCTL] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com>
2024-03-12PCI/ASPM: Disable L1 before configuring L1 SubstatesBjorn Helgaas
Per PCIe r6.1, sec 5.5.4, L1 must be disabled while setting ASPM L1 PM Substates enable bits. Previously this was enforced by clearing PCI_EXP_LNKCTL_ASPMC before calling pci_restore_aspm_l1ss_state(). Move the L1 (and L0s, although that doesn't seem required) disable into pci_restore_aspm_l1ss_state() itself so it's closer to the code that depends on it. Link: https://lore.kernel.org/r/20240223213733.GA115410@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-03-12PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state()David E. Box
ASPM state is saved and restored from pci_save/restore_pcie_state(). Since the LTR Capability is linked with ASPM, move the LTR save and restore calls there as well. No functional change intended. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20240128233212.1139663-6-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-6-helgaas@kernel.org Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-03-12Merge tag 'x86-boot-2024-03-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: - Continuing work by Ard Biesheuvel to improve the x86 early startup code, with the long-term goal to make it position independent: - Get rid of early accesses to global objects, either by moving them to the stack, deferring the access until later, or dropping the globals entirely - Move all code that runs early via the 1:1 mapping into .head.text, and move code that does not out of it, so that build time checks can be added later to ensure that no inadvertent absolute references were emitted into code that does not tolerate them - Remove fixup_pointer() and occurrences of __pa_symbol(), which rely on the compiler emitting absolute references, which is not guaranteed - Improve the early console code - Add early console message about ignored NMIs, so that users are at least warned about their existence - even if we cannot do anything about them - Improve the kexec code's kernel load address handling - Enable more X86S (simplified x86) bits - Simplify early boot GDT handling - Micro-optimize the boot code a bit - Misc cleanups * tag 'x86-boot-2024-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) x86/sev: Move early startup code into .head.text section x86/sme: Move early SME kernel encryption handling into .head.text x86/boot: Move mem_encrypt= parsing to the decompressor efi/libstub: Add generic support for parsing mem_encrypt= x86/startup_64: Simplify virtual switch on primary boot x86/startup_64: Simplify calculation of initial page table address x86/startup_64: Defer assignment of 5-level paging global variables x86/startup_64: Simplify CR4 handling in startup code x86/boot: Use 32-bit XOR to clear registers efi/x86: Set the PE/COFF header's NX compat flag unconditionally x86/boot/64: Load the final kernel GDT during early boot directly, remove startup_gdt[] x86/boot/64: Use RIP_REL_REF() to access early_top_pgt[] x86/boot/64: Use RIP_REL_REF() to access early page tables x86/boot/64: Use RIP_REL_REF() to access '__supported_pte_mask' x86/boot/64: Use RIP_REL_REF() to access early_dynamic_pgts[] x86/boot/64: Use RIP_REL_REF() to assign 'phys_base' x86/boot/64: Simplify global variable accesses in GDT/IDT programming x86/trampoline: Bypass compat mode in trampoline_start64() if not needed kexec: Allocate kernel above bzImage's pref_address x86/boot: Add a message about ignored early NMIs ...
2024-03-12PCI/ASPM: Save L1 PM Substates Capability for suspend/resumeDavid E. Box
4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") restored the L1 PM Substates Capability after resume, which reduced power consumption by making the ASPM L1.x states work after resume. a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"") reverted 4ff116d0d5fd because resume failed on some systems, so power consumption after resume increased again. a7152be79b62 mentioned that we restore L1 PM substate configuration even though ASPM L1 may already be enabled. This is due the fact that the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state(). Save and restore the L1 PM Substates Capability, following PCIe r6.1, sec 5.5.4 more closely by: 1) Do not restore ASPM configuration in pci_restore_pcie_state() but do that after PCIe capability is restored in pci_restore_aspm_state() following PCIe r6.1, sec 5.5.4. 2) If BIOS reenables L1SS, particularly L1.2, we need to clear the enables in the right order, downstream before upstream. Defer restoring the L1SS config until we are at the downstream component. Then update the config for both ends of the link in the prescribed order. 3) Program ASPM L1 PM substate configuration before L1 enables. 4) Program ASPM L1 PM substate enables last, after rest of the fields in the capability are programmed. [bhelgaas: commit log, squash L1SS-related patches, do both LNKCTL restores in pci_restore_pcie_state()] Link: https://lore.kernel.org/r/20240128233212.1139663-3-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240128233212.1139663-4-david.e.box@linux.intel.com Link: https://lore.kernel.org/r/20240223205851.114931-5-helgaas@kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Co-developed-by: David E. Box <david.e.box@linux.intel.com> Reported-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Tasev Nikola <tasev.stefanoska@skynet.be> # Asus UX305FA Cc: Mark Enriquez <enriquezmark36@gmail.com> Cc: Thomas Witt <kernel@witt.link> Cc: Werner Sembach <wse@tuxedocomputers.com> Cc: Vidya Sagar <vidyas@nvidia.com>
2024-03-12Merge tag 'rfds-for-linus-2024-03-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RFDS mitigation from Dave Hansen: "RFDS is a CPU vulnerability that may allow a malicious userspace to infer stale register values from kernel space. Kernel registers can have all kinds of secrets in them so the mitigation is basically to wait until the kernel is about to return to userspace and has user values in the registers. At that point there is little chance of kernel secrets ending up in the registers and the microarchitectural state can be cleared. This leverages some recent robustness fixes for the existing MDS vulnerability. Both MDS and RFDS use the VERW instruction for mitigation" * tag 'rfds-for-linus-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests x86/rfds: Mitigate Register File Data Sampling (RFDS) Documentation/hw-vuln: Add documentation for RFDS x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set
2024-03-12auxdisplay: img-ascii-lcd: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-03-12auxdisplay: hd44780: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-03-12auxdisplay: cfag12864bfb: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-03-12of: Move all FDT reserved-memory handling into of_reserved_mem.cRob Herring
The split of /reserved-memory handling between fdt.c and of_reserved_mem.c makes for reading and restructuring the code difficult. As of_reserved_mem.c is only built for CONFIG_OF_EARLY_FLATTREE already, move all the code to one spot. Acked-by: Saravana Kannan <saravanak@google.com> Reviewed-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com> Link: https://lore.kernel.org/r/20240311181303.1516514-2-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2024-03-12perf: RISC-V: Introduce Andes PMU to support perf event samplingYu Chien Peter Lin
Assign riscv_pmu_irq_num the value of (256 + 18) for the custome PMU and add SSCOUNTOVF and SIP alternatives to ALT_SBI_PMU_OVERFLOW() and ALT_SBI_PMU_OVF_CLEAR_PENDING() macros, respectively. To make use of Andes PMU extension, "xandespmu" needs to be appended to the riscv,isa-extensions for each cpu node in device-tree, and make sure CONFIG_ANDES_CUSTOM_PMU is enabled. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com> Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com> Co-developed-by: Locus Wei-Han Chen <locus84@andestech.com> Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-8-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-12perf: RISC-V: Eliminate redundant interrupt enable/disable operationsYu Chien Peter Lin
The interrupt enable/disable operations are already performed by the IRQ chip functions riscv_intc_irq_unmask()/riscv_intc_irq_mask() during enable_percpu_irq()/disable_percpu_irq(). It can be done only once. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240222083946.3977135-7-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-12Merge tag 'irq-for-riscv-02-23-24' of ↵Palmer Dabbelt
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-next INTC changes to consume for RISCV * tag 'irq-for-riscv-02-23-24' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/riscv-intc: Introduce Andes hart-level interrupt controller irqchip/riscv-intc: Allow large non-standard interrupt number
2024-03-12spi: Restore delays for non-GPIO chip selectJanne Grunau
SPI controller with integrated chip select handling still need to adhere to SPI device's CS setup, hold and inactive delays. For controller without set_cs_timing spi core shall handle the delays to avoid duplicated delay handling in each controller driver. Fixes a regression for the out of tree SPI controller and SPI HID transport on Apple M1/M1 Pro/Max notebooks. Fixes: 4d8ff6b0991d ("spi: Add multi-cs memories support in SPI core") Signed-off-by: Janne Grunau <j@jannau.net> Link: https://msgid.link/r/20240311-spi-cs-delays-regression-v1-1-0075020a90b2@jannau.net Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-12spi: lpspi: Avoid potential use-after-free in probe()Alexander Sverdlin
fsl_lpspi_probe() is allocating/disposing memory manually with spi_alloc_host()/spi_alloc_target(), but uses devm_spi_register_controller(). In case of error after the latter call the memory will be explicitly freed in the probe function by spi_controller_put() call, but used afterwards by "devm" management outside probe() (spi_unregister_controller() <- devm_spi_unregister() below). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070 ... Call trace: kernfs_find_ns kernfs_find_and_get_ns sysfs_remove_group sysfs_remove_groups device_remove_attrs device_del spi_unregister_controller devm_spi_unregister release_nodes devres_release_all really_probe driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork Fixes: 5314987de5e5 ("spi: imx: add lpspi bus driver") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://msgid.link/r/20240312112050.2503643-1-alexander.sverdlin@siemens.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-12regulator: core: Propagate the regulator state in case of exclusive getKory Maincent
Previously, performing an exclusive get on an already-enabled regulator resulted in inconsistent state initialization between child and parent regulators. While the child's counts were updated, its parent's counters remained unaffected. Consequently, attempting to disable an already-enabled exclusive regulator triggered unbalanced disables warnings from its parent regulator. This commit addresses the issue by propagating the enable state to the parent regulator using a regulator_enable call. This ensures consistent state management across the regulator hierarchy, preventing warnings! Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://msgid.link/r/20240312091638.1266167-1-kory.maincent@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-12dm: call the resume method on internal suspendMikulas Patocka
There is this reported crash when experimenting with the lvm2 testsuite. The list corruption is caused by the fact that the postsuspend and resume methods were not paired correctly; there were two consecutive calls to the origin_postsuspend function. The second call attempts to remove the "hash_list" entry from a list, while it was already removed by the first call. Fix __dm_internal_resume so that it calls the preresume and resume methods of the table's targets. If a preresume method of some target fails, we are in a tricky situation. We can't return an error because dm_internal_resume isn't supposed to return errors. We can't return success, because then the "resume" and "postsuspend" methods would not be paired correctly. So, we set the DMF_SUSPENDED flag and we fake normal suspend - it may confuse userspace tools, but it won't cause a kernel crash. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 8343 Comm: dmsetup Not tainted 6.8.0-rc6 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x77/0xc0 <snip> RSP: 0018:ffff8881b831bcc0 EFLAGS: 00010282 RAX: 000000000000004e RBX: ffff888143b6eb80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff819053d0 RDI: 00000000ffffffff RBP: ffff8881b83a3400 R08: 00000000fffeffff R09: 0000000000000058 R10: 0000000000000000 R11: ffffffff81a24080 R12: 0000000000000001 R13: ffff88814538e000 R14: ffff888143bc6dc0 R15: ffffffffa02e4bb0 FS: 00000000f7c0f780(0000) GS:ffff8893f0a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000057fb5000 CR3: 0000000143474000 CR4: 00000000000006b0 Call Trace: <TASK> ? die+0x2d/0x80 ? do_trap+0xeb/0xf0 ? __list_del_entry_valid_or_report+0x77/0xc0 ? do_error_trap+0x60/0x80 ? __list_del_entry_valid_or_report+0x77/0xc0 ? exc_invalid_op+0x49/0x60 ? __list_del_entry_valid_or_report+0x77/0xc0 ? asm_exc_invalid_op+0x16/0x20 ? table_deps+0x1b0/0x1b0 [dm_mod] ? __list_del_entry_valid_or_report+0x77/0xc0 origin_postsuspend+0x1a/0x50 [dm_snapshot] dm_table_postsuspend_targets+0x34/0x50 [dm_mod] dm_suspend+0xd8/0xf0 [dm_mod] dev_suspend+0x1f2/0x2f0 [dm_mod] ? table_deps+0x1b0/0x1b0 [dm_mod] ctl_ioctl+0x300/0x5f0 [dm_mod] dm_compat_ctl_ioctl+0x7/0x10 [dm_mod] __x64_compat_sys_ioctl+0x104/0x170 do_syscall_64+0x184/0x1b0 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0xf7e6aead <snip> ---[ end trace 0000000000000000 ]--- Fixes: ffcc39364160 ("dm: enhance internal suspend and resume interface") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-03-12dm raid: fix false positive for requeue needed during reshapeMing Lei
An empty flush doesn't have a payload, so it should never be looked at when considering to possibly requeue a bio for the case when a reshape is in progress. Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Reported-by: Patrick Plenefisch <simonpatp@gmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>