git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message	Author
2025-03-10	EDAC/ie31200: Make the memory controller resources configurable	Qiuxu Zhuo
	The resources such as MMIO, register offset, register mask, memory DIMM information, ECC error log location, etc., of the memory controller, and the number of memory controllers can be device-ID-specific. It requires adding numerous 'if (device_id == new_id)' special handling cases to the code to support a new SoC. Make these kinds of resources configurable and separate them from the code to facilitate the addition of new SoC support. No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-7-qiuxu.zhuo@intel.com
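The idea of moving device-specific resources out of the code path can be sketched as a lookup table. This is a minimal userspace illustration, not the driver's code: `struct res_config`, its fields, the device IDs and all values are hypothetical placeholders.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-SoC resource description; field names are illustrative. */
struct res_config {
	uint64_t dimm_size_mask;     /* which register bits hold the DIMM size */
	unsigned int nr_channels;    /* channels per memory controller */
	unsigned int errlog_offset;  /* placeholder register offset */
};

struct soc_entry {
	uint16_t device_id;          /* made-up IDs, not real PCI device IDs */
	const struct res_config *cfg;
};

static const struct res_config cfg_a = { 0xff, 2, 0x100 };
static const struct res_config cfg_b = { 0x3f, 2, 0x200 };

/* Supporting a new SoC means adding one row here, not a new 'if' branch. */
static const struct soc_entry soc_table[] = {
	{ 0x0001, &cfg_a },
	{ 0x0002, &cfg_b },
};

/* Return the configuration for a device ID, or NULL if it is unknown. */
static const struct res_config *lookup_cfg(uint16_t device_id)
{
	for (size_t i = 0; i < sizeof(soc_table) / sizeof(soc_table[0]); i++)
		if (soc_table[i].device_id == device_id)
			return soc_table[i].cfg;
	return NULL;
}
```

The rest of the code then reads `cfg->dimm_size_mask` and similar fields instead of comparing device IDs.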
2025-03-10	EDAC/ie31200: Simplify the pci_device_id table	Qiuxu Zhuo
	Use PCI_VDEVICE() to simplify the pci_device_id table. No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-6-qiuxu.zhuo@intel.com
2025-03-10	EDAC/ie31200: Fix the 3rd parameter name of *populate_dimm_info()	Qiuxu Zhuo
	The 3rd parameter of *populate_dimm_info() pertains to the DIMM index within a channel, not the channel index. Fix the parameter name to dimm to reflect its actual purpose. No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-5-qiuxu.zhuo@intel.com
2025-03-10	EDAC/ie31200: Fix the error path order of ie31200_init()	Qiuxu Zhuo
	The error path order of ie31200_init() is incorrect, fix it. Fixes: 709ed1bcef12 ("EDAC/ie31200: Fallback if host bridge device is already initialized") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-4-qiuxu.zhuo@intel.com
2025-03-10	EDAC/ie31200: Fix the DIMM size mask for several SoCs	Qiuxu Zhuo
	The DIMM size mask for {Sky, Kaby, Coffee} Lake is not bits{7:0}, but bits{5:0}. Fix it. Fixes: 953dee9bbd24 ("EDAC, ie31200_edac: Add Skylake support") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-3-qiuxu.zhuo@intel.com
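The effect of a too-wide field mask can be shown with a small sketch. The `BITS()` macro is a userspace stand-in for a bit-range mask helper, and the register value is invented for illustration; the real register layout is not given here.

```c
#include <stdint.h>

/* Mask covering bits hi..lo inclusive of a 32-bit value (0 <= lo <= hi <= 31). */
#define BITS(hi, lo) ((~0u >> (31 - (hi))) & (~0u << (lo)))

/* Extract a field that starts at bit 0 using the given mask. */
static uint32_t dimm_size_field(uint32_t reg, uint32_t mask)
{
	return reg & mask;
}
```

With an invented register value of `0xC8`, the mask `BITS(7, 0)` yields `0xC8` while `BITS(5, 0)` yields `0x08`, so any unrelated bits in positions 7:6 would inflate the decoded size.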
2025-03-10	EDAC/ie31200: Fix the size of EDAC_MC_LAYER_CHIP_SELECT layer	Qiuxu Zhuo
	The EDAC_MC_LAYER_CHIP_SELECT layer pertains to the rank, not the DIMM. Fix its size to reflect the number of ranks instead of the number of DIMMs. Also delete the unused macros IE31200_{DIMMS,RANKS}. Fixes: 7ee40b897d18 ("ie31200_edac: Introduce the driver") Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Link: https://lore.kernel.org/r/20250310011411.31685-2-qiuxu.zhuo@intel.com
2025-03-10	nvmet: pci-epf: Do not add an IRQ vector if not needed	Damien Le Moal
	The function nvmet_pci_epf_create_cq() always unconditionally calls nvmet_pci_epf_add_irq_vector() to add an IRQ vector for a completion queue. But this is not correct if the host requested the creation of a completion queue for polling, without an IRQ vector specified (i.e. the flag NVME_CQ_IRQ_ENABLED is not set). Fix this by calling nvmet_pci_epf_add_irq_vector() and setting the queue flag NVMET_PCI_EPF_Q_IRQ_ENABLED for the cq only if NVME_CQ_IRQ_ENABLED is set. While at it, also fix the error path to add the missing removal of the added IRQ vector if nvmet_cq_create() fails. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-10	nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created	Damien Le Moal
	The function nvmet_pci_epf_create_sq() uses test_and_set_bit() to check that a submission queue is not already live and if not, set the NVMET_PCI_EPF_Q_LIVE queue flag to declare the sq live (ready to use). However, this is done on entry to the function, before the submission queue is actually fully initialized and ready to use. This creates a race situation with the function nvmet_pci_epf_poll_sqs_work() which looks at the NVMET_PCI_EPF_Q_LIVE queue flag to poll the submission queue when it is live. This race can lead to invalid DMA transfers if nvmet_pci_epf_poll_sqs_work() runs after the NVMET_PCI_EPF_Q_LIVE flag is set but before setting the sq pci address and doorbell offset. Avoid this race by only testing the NVMET_PCI_EPF_Q_LIVE flag on entry to nvmet_pci_epf_create_sq() and setting it after the submission queue is fully set up before nvmet_pci_epf_create_sq() returns success. Since the function nvmet_pci_epf_create_cq() also has the same racy flag setting pattern, also make a similar change in that function. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
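The "initialize first, publish last" ordering described above can be sketched in userspace C11. This is not the driver's code: `struct queue`, `create_queue()` and `poll_queue()` are hypothetical, and the use of C11 release/acquire atomics in place of the kernel's bit operations is an assumption made for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct queue {
	atomic_bool live;     /* stand-in for the NVMET_PCI_EPF_Q_LIVE flag */
	uint64_t pci_addr;
	uint32_t db_offset;
};

/* Returns 0 on success, -1 if the queue is already live. */
static int create_queue(struct queue *q, uint64_t pci_addr, uint32_t db_offset)
{
	/* Only test the flag on entry; do not set it yet. */
	if (atomic_load(&q->live))
		return -1;

	/* Fully initialize the queue first. */
	q->pci_addr = pci_addr;
	q->db_offset = db_offset;

	/* Publish last, so a poller never observes a half-built queue. */
	atomic_store_explicit(&q->live, true, memory_order_release);
	return 0;
}

/* Poller: only touches the queue fields once the flag is visible. */
static bool poll_queue(struct queue *q, uint64_t *addr)
{
	if (!atomic_load_explicit(&q->live, memory_order_acquire))
		return false;
	*addr = q->pci_addr;
	return true;
}
```

The sketch only models a single creator racing with a poller; concurrent creators of the same queue would need additional serialization.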
2025-03-10	rust: task: fix `SAFETY` comment in `Task::wake_up`	Panagiotis Foliadis
	The `SAFETY` comment inside the `wake_up` method references erroneously the `signal_pending` C function instead of the `wake_up_process` which is actually called. Fix the comment to reference the correct C function. Fixes: fe95f58320e6 ("rust: task: adjust safety comments in Task methods") Signed-off-by: Panagiotis Foliadis <pfoliadis@posteo.net> Reviewed-by: Charalampos Mitrodimas <charmitro@posteo.net> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250308-comment-fix-v1-1-4bba709fd36d@posteo.net [ Slightly reworded. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-03-10	Drivers: hv: vmbus: Don't release fb_mmio resource in vmbus_free_mmio()	Michael Kelley
	The VMBus driver manages the MMIO space it owns via the hyperv_mmio resource tree. Because the synthetic video framebuffer portion of the MMIO space is initially setup by the Hyper-V host for each guest, the VMBus driver does an early reserve of that portion of MMIO space in the hyperv_mmio resource tree. It saves a pointer to that resource in fb_mmio. When a VMBus driver requests MMIO space and passes "true" for the "fb_overlap_ok" argument, the reserved framebuffer space is used if possible. In that case it's not necessary to do another request against the "shadow" hyperv_mmio resource tree because that resource was already requested in the early reserve steps. However, the vmbus_free_mmio() function currently does no special handling for the fb_mmio resource. When a framebuffer device is removed, or the driver is unbound, the current code for vmbus_free_mmio() releases the reserved resource, leaving fb_mmio pointing to memory that has been freed. If the same or another driver is subsequently bound to the device, vmbus_allocate_mmio() checks against fb_mmio, and potentially gets garbage. Furthermore a second unbind operation produces this "nonexistent resource" error because of the unbalanced behavior between vmbus_allocate_mmio() and vmbus_free_mmio(): [ 55.499643] resource: Trying to free nonexistent resource <0x00000000f0000000-0x00000000f07fffff> Fix this by adding logic to vmbus_free_mmio() to recognize when MMIO space in the fb_mmio reserved area would be released, and don't release it. This filtering ensures the fb_mmio resource always exists, and makes vmbus_free_mmio() more parallel with vmbus_allocate_mmio(). 
Fixes: be000f93e5d7 ("drivers:hv: Track allocations of children of hv_vmbus in private resource tree") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20250310035208.275764-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250310035208.275764-1-mhklinux@outlook.com>
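The filtering described in this commit message amounts to a range-containment check before releasing. The following is a simplified sketch under stated assumptions: `struct range`, `in_reserved()` and `should_release_shadow()` are hypothetical names, and the real function operates on kernel resource trees rather than plain ranges.

```c
#include <stdbool.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;   /* inclusive */
};

/* True if [start, start + size - 1] lies entirely inside the reserved range. */
static bool in_reserved(const struct range *reserved, uint64_t start, uint64_t size)
{
	return reserved && size &&
	       start >= reserved->start &&
	       start + size - 1 <= reserved->end;
}

/*
 * Decide whether the shadow resource for a region should be released.
 * Space inside the early-reserved framebuffer area is left alone, so the
 * reservation keeps existing across unbind/rebind cycles.
 */
static bool should_release_shadow(const struct range *fb, uint64_t start, uint64_t size)
{
	return !in_reserved(fb, start, size);
}
```

Using the range from the error message above (0xf0000000-0xf07fffff) as the reserved area, a free of that region is filtered out while a free elsewhere proceeds as before.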
2025-03-10	lib/crc: remove unnecessary prompt for CONFIG_CRC64	Eric Biggers
	All modules that need CONFIG_CRC64 already select it, so there is no need to bother users about the option. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250304230712.167600-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc: remove unnecessary prompt for CONFIG_LIBCRC32C	Eric Biggers
	All modules that need CONFIG_LIBCRC32C already select it, so there is no need to bother users about the option. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250304230712.167600-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc: remove unnecessary prompt for CONFIG_CRC8	Eric Biggers
	All modules that need CONFIG_CRC8 already select it, so there is no need to bother users about the option. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250304230712.167600-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc: remove unnecessary prompt for CONFIG_CRC7	Eric Biggers
	All modules that need CONFIG_CRC7 already select it, so there is no need to bother users about the option. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250304230712.167600-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc: remove unnecessary prompt for CONFIG_CRC4	Eric Biggers
	All modules that need CONFIG_CRC4 already select it, so there is no need to bother users about the option. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250304230712.167600-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc7: unexport crc7_be_syndrome_table	Eric Biggers
	Since neither crc7_be_syndrome_table nor crc7_be_byte() is used outside lib/crc7.c, fold them into lib/crc7.c. Link: https://lore.kernel.org/r/20250304224052.157915-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc_kunit.c: update comment in crc_benchmark()	Eric Biggers
	None of the CRC library functions use __pure anymore, so the comment in crc_benchmark() is outdated. But the comment was not really correct anyway, since the CRC computation could (in principle) be optimized out regardless of __pure. Update the comment to have a proper explanation. Link: https://lore.kernel.org/r/20250305015830.37813-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	lib/crc_kunit.c: add test and benchmark for crc7_be()	Eric Biggers
	Wire up crc7_be() to crc_kunit. Previously it had no test. Link: https://lore.kernel.org/r/20250304223943.157493-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	x86/crc32: optimize tail handling for crc32c short inputs	Eric Biggers
	For handling the 0 <= len < sizeof(unsigned long) bytes left at the end, do a 4-2-1 step-down instead of a byte-at-a-time loop. This allows taking advantage of wider CRC instructions. Note that crc32c-3way.S already uses this same optimization too. crc_kunit shows an improvement of about 25% for len=127. Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250304213216.108925-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
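The 4-2-1 step-down can be modelled in portable C. This is a sketch, not the kernel's implementation: `crc32c_bits()` is a bit-at-a-time software stand-in for the hardware CRC instruction, and all function names are hypothetical. It assumes a tail of fewer than 8 bytes, as on a 64-bit kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY_LE 0x82F63B78u   /* reflected Castagnoli polynomial */

/* Fold 'nbits' bits of little-endian 'data' into the CRC, one bit at a time. */
static uint32_t crc32c_bits(uint32_t crc, uint32_t data, int nbits)
{
	crc ^= data;
	for (int i = 0; i < nbits; i++)
		crc = (crc >> 1) ^ (CRC32C_POLY_LE & -(crc & 1));
	return crc;
}

/* Reference: one 8-bit step per remaining byte. */
static uint32_t crc32c_bytewise(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--)
		crc = crc32c_bits(crc, *p++, 8);
	return crc;
}

/* Tail of fewer than 8 bytes: at most one 4-, one 2- and one 1-byte step. */
static uint32_t crc32c_tail_421(uint32_t crc, const uint8_t *p, size_t len)
{
	if (len & 4) {
		uint32_t v = p[0] | p[1] << 8 | p[2] << 16 | (uint32_t)p[3] << 24;
		crc = crc32c_bits(crc, v, 32);
		p += 4;
	}
	if (len & 2) {
		crc = crc32c_bits(crc, p[0] | p[1] << 8, 16);
		p += 2;
	}
	if (len & 1)
		crc = crc32c_bits(crc, p[0], 8);
	return crc;
}
```

In this software model both variants do the same amount of bit-level work; the benefit in the real code comes from each step mapping to a single, wider CRC instruction, so a 7-byte tail needs three instructions instead of seven.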
2025-03-10	riscv/crc64: add Zbc optimized CRC64 functions	Eric Biggers
	Wire up crc64_be_arch() and crc64_nvme_arch() for 64-bit RISC-V using crc-clmul-template.h. This greatly improves the performance of these CRCs on Zbc-capable CPUs in 64-bit kernels. These optimized CRC64 functions are not yet supported in 32-bit kernels, since crc-clmul-template.h assumes that the CRC fits in an unsigned long. That implementation limitation could be addressed, but it would add a fair bit of complexity, so it has been omitted for now. Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250216225530.306980-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function	Eric Biggers
	Wire up crc_t10dif_arch() for RISC-V using crc-clmul-template.h. This greatly improves CRC-T10DIF performance on Zbc-capable CPUs. Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250216225530.306980-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	riscv/crc32: reimplement the CRC32 functions using new template	Eric Biggers
	Delete the previous Zbc optimized CRC32 code, and re-implement it using the new template. The new implementation is more optimized and shares more code among CRC variants. Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250216225530.306980-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10	riscv/crc: add "template" for Zbc optimized CRC functions	Eric Biggers
	Add a "template" crc-clmul-template.h that can generate RISC-V Zbc optimized CRC functions. Each generated CRC function is parameterized by CRC length and bit order, and it accepts a pointer to the constants struct required for the specific CRC polynomial desired. Update gen-crc-consts.py to support generating the needed constants structs. This makes it possible to easily wire up a Zbc optimized implementation of almost any CRC. The design generally follows what I did for x86, but it is simplified by using RISC-V's scalar carryless multiplication Zbc, which has no equivalent on x86. RISC-V's clmulr instruction is also helpful. A potential switch to Zvbc (or support for Zvbc alongside Zbc) is left for future work. For long messages Zvbc should be fastest, but it would need to be shown to be worthwhile over just using Zbc which is significantly more convenient to use, especially in the kernel context. Compared to the existing Zbc-optimized CRC32 code and the earlier proposed Zbc-optimized CRC-T10DIF code (https://lore.kernel.org/r/20250211071101.181652-1-zhihang.shao.iscas@gmail.com), this submission deduplicates the code among CRC variants and is significantly more optimized. It uses "folding" to take better advantage of instruction-level parallelism (to a more limited extent than x86 for now, but it could be extended to more), it reworks the Barrett reduction to eliminate unnecessary instructions, and it documents all the math used and makes all the constants reproducible. Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250216225530.306980-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
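For readers unfamiliar with carryless multiplication, the operation that Zbc provides in hardware can be modelled in portable C. This is only an explanatory sketch for 32-bit operands; `clmul32()` is a hypothetical helper and is not part of crc-clmul-template.h.

```c
#include <stdint.h>

/*
 * Carryless multiply: schoolbook binary multiplication in which the partial
 * products are combined with XOR instead of addition. This is polynomial
 * multiplication over GF(2), the arithmetic that CRCs are defined in.
 */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
	uint64_t r = 0;

	for (int i = 0; i < 32; i++)
		if ((b >> i) & 1)
			r ^= (uint64_t)a << i;
	return r;
}
```

For example, `clmul32(3, 3)` is 5, since (x + 1)(x + 1) = x^2 + 1 when coefficients are taken modulo 2. The operation is also linear over XOR, i.e. `clmul32(a, b ^ c) == clmul32(a, b) ^ clmul32(a, c)`, which is what allows separate chunks of a message to be multiplied by constants independently and combined afterwards.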
2025-03-10	drm/atomic: Filter out redundant DPMS calls	Ville Syrjälä
	Video players (e.g. mpv) do periodic XResetScreenSaver() calls to keep the screen on while the video is playing. The modesetting ddx plumbs these straight through into the kernel as DPMS setproperty ioctls, without any filtering whatsoever. When implemented via atomic these end up as empty commits on the crtc (which will nonetheless take one full frame), which leads to a dropped frame every time XResetScreenSaver() is called. Let's just filter out redundant DPMS property changes in the kernel to avoid this issue. v2: Explain the resulting commits a bit better (Sima) Document the behaviour in uapi docs (Sima) Cc: stable@vger.kernel.org Testcase: igt/kms_flip/flip-vs-dpms-on-nop Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250219160239.17502-1-ville.syrjala@linux.intel.com
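The filtering itself is conceptually an early return when the requested state equals the current one. A minimal sketch, assuming a hypothetical `struct connector` with a commit counter standing in for the atomic commit machinery:

```c
#include <stdbool.h>

enum dpms_mode { DPMS_ON, DPMS_OFF };   /* illustrative values only */

struct connector {
	enum dpms_mode dpms;
	unsigned int commits;   /* counts commits that actually went through */
};

/* Returns true if a commit was performed, false if the request was redundant. */
static bool set_dpms(struct connector *c, enum dpms_mode mode)
{
	if (c->dpms == mode)
		return false;   /* nothing changes, so skip the empty commit */

	c->dpms = mode;
	c->commits++;
	return true;
}
```

Repeated "on" requests while the display is already on then cost nothing, instead of one frame each.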
2025-03-10	nvme-pci: fix stuck reset on concurrent DPC and HP	Keith Busch
	The PCIe error handling has the nvme driver quiesce the device, attempt to restart it, then wait for that restart to complete. A PCIe DPC event also toggles the PCIe link. If the slot doesn't have out-of-band presence detection, this will trigger a pciehp re-enumeration. The error handling that calls nvme_error_resume is holding the device lock while this happens. This lock blocks pciehp's request to disconnect the driver from proceeding. Meanwhile the nvme's reset can't make forward progress because its device isn't there anymore with outstanding IO, and the timeout handler won't do anything to fix it because the device is undergoing error handling. End result: deadlocked. Fix this by having the timeout handler short cut the disabling for a disconnected PCIe device. The downside is that we're relying on an IO timeout to clean up this mess, which could be a minute by default. Tested-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-10	dt-bindings: pwm: imx: Add i.MX93, i.MX94 and i.MX95 support	Frank Li
	Add compatible strings "fsl,imx93-pwm", "fsl,imx94-pwm" and "fsl,imx95-pwm", which are backward compatible with i.MX7ULP. Set them to fall back to "fsl,imx7ulp-pwm". Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20250306170845.240555-1-Frank.Li@nxp.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2025-03-10	Merge patch series "auxdisplay: charlcd: Refactor memory allocation"	Andy Shevchenko
	Andy Shevchenko says: The users of charlcd_alloc() call for additional memory allocation. We may do it at the time of the main call as many other APIs do. For this partially revert the change that brought us to the current state of affairs, and refactor the code based on the original implementation. Link: https://lore.kernel.org/r/20250224173010.219024-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2025-03-10	auxdisplay: hd44780: Rename hd to hdc in hd44780_common_alloc()	Andy Shevchenko
	The hd44780_common_alloc() uses hd for local variable while the respective header uses hdc, rename to make it consistent and avoid potential confusion with the drivers that use both for different reasons. No functional changes intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2025-03-10	auxdisplay: hd44780: Call charlcd_alloc() from hd44780_common_alloc()	Andy Shevchenko
	HD44780 APIs all operate on struct charlcd objects. Moreover, the current users always call charlcd_alloc() and hd44780_common_alloc(). Make the latter call the former, which eliminates the additional allocation, makes it consistent with the rest of the API and avoids duplication. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-03-10	auxdisplay: panel: Make use of hd44780_common_free()	Andy Shevchenko
	Use the symmetrical API to free the common resources. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-03-10	auxdisplay: hd44780: Make use of hd44780_common_free()	Andy Shevchenko
	Use the symmetrical API to free the common resources. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-03-10	auxdisplay: hd44780: Introduce hd44780_common_free()	Andy Shevchenko
	Introduce hd44780_common_free() for symmetrical operation to hd44780_common_alloc(). This allows both to be modified in the future without touching the users. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-03-10	auxdisplay: lcd2s: Allocate memory for custom data in charlcd_alloc()	Andy Shevchenko
	Allocate memory for custom data in charlcd_alloc() instead of doing that explicitly in the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2025-03-10	auxdisplay: charlcd: Partially revert "Move hwidth and bwidth to struct hd44780_common"	Andy Shevchenko
	Commit 2545c1c948a6 ("auxdisplay: Move hwidth and bwidth to struct hd44780_common") makes charlcd_alloc() argument-less effectively dropping the single allocation for the struct charlcd_priv object along with the driver specific one. Restore that behaviour here. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
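The "single allocation for the private object along with the driver specific one" pattern can be sketched in userspace C. The structure and function names below (`struct lcd`, `struct lcd_priv`, `lcd_alloc()`, `lcd_free()`) are hypothetical; the actual charlcd_alloc() signature is not shown in this log.

```c
#include <stddef.h>
#include <stdlib.h>

struct lcd {
	int width;
	void *drvdata;   /* points into the same allocation as the struct */
};

struct lcd_priv {
	struct lcd lcd;          /* must stay the first member */
	max_align_t drvdata[];   /* driver-specific data, suitably aligned */
};

/* One allocation covers the core object and 'drvdata_size' bytes of driver data. */
static struct lcd *lcd_alloc(size_t drvdata_size)
{
	struct lcd_priv *priv = calloc(1, sizeof(*priv) + drvdata_size);

	if (!priv)
		return NULL;
	priv->lcd.drvdata = priv->drvdata;
	return &priv->lcd;
}

/* 'lcd' is the first member of struct lcd_priv, so it shares its address. */
static void lcd_free(struct lcd *lcd)
{
	free((struct lcd_priv *)lcd);
}
```

A driver would call `lcd_alloc(sizeof(struct my_driver_data))` and use `lcd->drvdata` directly, with a single `lcd_free()` releasing both parts.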
2025-03-10	drm/xe/guc_pc: Retry and wait longer for GuC PC start	Rodrigo Vivi
	In a rare situation of thermal limit during resume, GuC can be slow and run into delays like this: xe 0000:00:02.0: [drm] GT1: excessive init time: 667ms! \ [status = 0x8002F034, timeouts = 0] xe 0000:00:02.0: [drm] GT1: excessive init time: \ [freq = 100MHz (req = 800MHz), before = 100MHz, \ perf_limit_reasons = 0x1C001000] xe 0000:00:02.0: [drm] ERROR GT1: GuC PC Start failed ------------[ cut here ]------------ xe 0000:00:02.0: [drm] GT1: Failed to start GuC PC: -EIO When this happens, the GPU cannot be used at all. So, let's retry with a huge timeout in the hope it comes back. Also, let's collect some information on how long it usually takes in situations like this, so perhaps the time can be tuned later. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250307160307.1093391-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit b4b05e53b550a886b4754b87fd0dd2b304579e85) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-10	drm/xe/pm: Temporarily disable D3Cold on BMG	Rodrigo Vivi
	Currently, many instability cases related to D3Cold -> D0 transition on BMG are under investigation. Among them are some bad cases where the device is lost after 1 to 3 transitions from D3Cold to D0 on the runtime pm, with pcieport upstream bridge port link retrain failure. In other cases, it works fine, but with some sudden random memory corruptions after D3cold, that could be 0xffff missed ack on GT forcewake or GuC reload related failures. In some other cases though, D3Cold -> D0 works pretty reliably. It looks like it is a combination of GPU cards and Host boards at this point. So, there is no possible/available quirk at this time. This patch disables the D3Cold by default on BMG by reducing the vram_d3cold_threshold to 0. Users and developers who want to enable it are still able to via $ echo 300 > /sys/bus/pci/devices/<addr>/vram_d3cold_threshold Fixes: 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4037 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4395 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4396 Cc: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250308005636.1475420-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit d945cc876277851053c0cf37927c8d7bd9d0e880) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-10	drm/i915/cdclk: Do cdclk post plane programming later	Ville Syrjälä
	We currently call intel_set_cdclk_post_plane_update() far too early. When pipes are active during the reprogramming the current spot only works for the cd2x divider update case, as that is synchronized to the pipe's vblank. Squashing and crawling are not synchronized in any way, so doing the programming while the pipes/planes are potentially still using the old hardware state could lead to underruns. Move the post plane reprogramming to a spot where we know that the pipes/planes have switched over to the new hardware state. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250218211913.27867-2-ville.syrjala@linux.intel.com Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> (cherry picked from commit fb64f5568c0e0b5730733d70a012ae26b1a55815) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-10	drm/xe/userptr: Fix an incorrect assert	Thomas Hellström
	The assert incorrectly checks the total length processed which can in fact be greater than the number of pages. Fix. Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250307100109.21397-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 70e5043ba85eae199b232e39921abd706b5c1fa4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-10	drm/xe: Release guc ids before cancelling work	Tejas Upadhyay
	A GT reset can occur in parallel while cancelling work in an async call, which can requeue these workers. To avoid that, let's first release guc ids and then cancel work so they don't get requeued. Fixes: 8ae8a2e8dd21 ("drm/xe: Long running job update") Fixes: 12c2f962fe71 ("drm/xe: cancel pending job timer before freeing scheduler") Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250306131211.975503-1-tejas.upadhyay@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 8e8d76f62329127b31c64a034b052fb9e30e92af) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-10	ASoC: qcom: sm8250: explicitly set format in sm8250_be_hw_params_fixup()	Alexey Klimov
	Setting format to s16le is required for compressed playback on compatible soundcards. Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20250228161430.373961-1-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-10	ASoC: cs35l41: check the return value from spi_setup()	Vitaliy Shevtsov
	Currently the return value from spi_setup() is not checked for a failure. It is unlikely it will ever fail in this particular case but it is still better to add this check for the sake of completeness and correctness. This is cheap since it is performed once when the device is being probed. Handle spi_setup() return value. Found by Linux Verification Center (linuxtesting.org) with Svace. Fixes: 872fc0b6bde8 ("ASoC: cs35l41: Set the max SPI speed for the whole device") Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru> Link: https://patch.msgid.link/20250304115643.2748-1-v.shevtsov@mt-integration.ru Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-10	selftests: ublk: add --foreground command line	Ming Lei
	Add --foreground command for helping to debug. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	selftests: ublk: fix build failure	Ming Lei
	Fixes the following build failure: ublk//file_backed.c: In function ‘backing_file_tgt_init’: ublk//file_backed.c:28:42: error: ‘O_DIRECT’ undeclared (first use in this function); did you mean ‘O_DIRECTORY’? 28 | fd = open(file, O_RDWR | O_DIRECT); | ^~~~~~~~ | O_DIRECTORY when trying to reuse this same utility for liburing test. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	selftests: ublk: make ublk_stop_io_daemon() more reliable	Ming Lei
	Improve ublk_stop_io_daemon() in the following ways: - don't wait if ->ublksrv_pid becomes -1, which means that the disk has been stopped - don't wait if the ublk char device doesn't exist any more, so we can avoid relying on inotify to wait until the char device is closed. This may also considerably reduce the time the delete command takes. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250303124324.3563605-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes	Florent Revest
	Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves their CPU masks and unconditionally accesses per-CPU data for the first CPU of each mask. According to Documentation/admin-guide/mm/numaperf.rst: "Some memory may share the same node as a CPU, and others are provided as memory only nodes." Therefore, some node CPU masks may be empty and wouldn't have a "first CPU". On a machine with far memory (and therefore CPU-less NUMA nodes): - cpumask_of_node(nid) is 0 - cpumask_first(0) is CONFIG_NR_CPUS - cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an index that is 1 out of bounds This does not have any security implications since flashing microcode is a privileged operation but I believe this has reliability implications by potentially corrupting memory while flashing a microcode update. When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes a microcode update, I get the following splat: UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y index 512 is out of range for type 'unsigned long[512]' [...] Call Trace: dump_stack __ubsan_handle_out_of_bounds load_microcode_amd request_microcode_amd reload_store kernfs_fop_write_iter vfs_write ksys_write do_syscall_64 entry_SYSCALL_64_after_hwframe Change the loop to go over only NUMA nodes which have CPUs before determining whether the first CPU on the respective node needs microcode update. [ bp: Massage commit message, fix typo. ] Fixes: 7ff6edf4fef3 ("x86/microcode/AMD: Fix mixed steppings support") Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250310144243.861978-1-revest@chromium.org
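The failure mode and the fix can be modelled with a plain bitmask. In this sketch `NR_CPUS` is a small illustrative constant and `first_cpu()` is a hypothetical stand-in that follows the convention described above, returning `NR_CPUS` for an empty mask; none of it is the kernel's cpumask code.

```c
#include <stdint.h>

#define NR_CPUS 8   /* illustrative; the splat above shows 512 */

/* Index of the lowest set bit, or NR_CPUS if the mask is empty. */
static unsigned int first_cpu(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Visit the first CPU of every node and sum its per-CPU value.
 * CPU-less nodes are skipped; without that check, cpu_data[NR_CPUS]
 * would be read, one element past the end of the array.
 */
static int sum_first_cpu_data(const uint32_t *node_masks, int nr_nodes,
			      const int *cpu_data)
{
	int sum = 0;

	for (int nid = 0; nid < nr_nodes; nid++) {
		unsigned int cpu = first_cpu(node_masks[nid]);

		if (cpu >= NR_CPUS)
			continue;   /* memory-only node: no first CPU */
		sum += cpu_data[cpu];
	}
	return sum;
}
```

With node masks `{0x0f, 0x00, 0xf0}`, the middle node is memory-only and is skipped, so only CPUs 0 and 4 are consulted.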
2025-03-10	badblocks: Fix a nonsense WARN_ON() which checks whether a u64 variable < 0	Coly Li
	In _badblocks_check(), there are lines of code like this, 1246 sectors -= len; [snipped] 1251 WARN_ON(sectors < 0); The WARN_ON() at line 1251 doesn't make sense because sectors is of unsigned long long type and can never be < 0. Fix it by directly checking whether sectors is less than len. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Coly Li <colyli@kernel.org> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250309160556.42854-1-colyli@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
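The underlying issue is that an unsigned subtraction wraps around instead of going negative, so the check has to happen before the subtraction. A minimal sketch, with `consume()` as a hypothetical helper rather than the actual badblocks code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Subtract 'len' from '*sectors' only if that cannot wrap around.
 * Returns false, leaving '*sectors' untouched, when len exceeds it.
 */
static bool consume(uint64_t *sectors, uint64_t len)
{
	if (*sectors < len)
		return false;   /* 'sectors - len' would wrap to a huge value */

	*sectors -= len;
	return true;
}
```

A check of the form `sectors < 0` after the subtraction is always false for an unsigned type, which is why it can never fire.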
2025-03-10	block: make sure ->nr_integrity_segments is cloned in blk_rq_prep_clone	Ming Lei
	Make sure ->nr_integrity_segments is cloned in blk_rq_prep_clone(), otherwise requests cloned by device-mapper multipath will not have the proper nr_integrity_segments values set, then BUG() is hit from sg_alloc_table_chained(). Fixes: b0fd271d5fba ("block: add request clone interface (v2)") Cc: stable@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250310115453.2271109-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	block: protect hctx attributes/params using q->elevator_lock	Nilay Shroff
	Currently, hctx attributes (nr_tags, nr_reserved_tags, and cpu_list) are protected using `q->sysfs_lock`. However, these attributes can be updated in multiple scenarios: - During the driver's probe method. - When updating nr_hw_queues. - When writing to the sysfs attribute nr_requests, which can modify nr_tags. The nr_requests attribute is already protected using q->elevator_lock, but none of the update paths actually use q->sysfs_lock to protect hctx attributes. So to ensure proper synchronization, replace q->sysfs_lock with q->elevator_lock when reading hctx attributes through sysfs. Additionally, blk_mq_update_nr_hw_queues allocates and updates hctx. The allocation of hctx is protected using q->elevator_lock, however, updating hctx params happens without any protection, so safeguard hctx param update path by also using q->elevator_lock. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250306093956.2818808-1-nilay@linux.ibm.com [axboe: wrap comment at 80 chars] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	block: protect read_ahead_kb using q->limits_lock	Nilay Shroff
	The bdi->ra_pages could be updated under q->limits_lock because it's usually calculated from the queue limits by queue_limits_commit_update. So protect reading/writing the sysfs attribute read_ahead_kb using q->limits_lock instead of q->sysfs_lock. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250304102551.2533767-8-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-10	block: protect wbt_lat_usec using q->elevator_lock	Nilay Shroff
	The wbt latency and state could be updated while initializing the elevator or exiting the elevator. It could be also updated while configuring IO latency QoS parameters using cgroup. The elevator code path is now protected with q->elevator_lock. So we should protect the access to sysfs attribute wbt_lat_usec using q->elevator_lock instead of q->sysfs_lock. While we're at it, also protect ioc_qos_write(), which configures wbt parameters via cgroup, using q->elevator_lock. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Link: https://lore.kernel.org/r/20250304102551.2533767-7-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>