linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-05-20	drm/i915/dpll: Rename intel_unreference_dpll_crtc	Suraj Kandpal
	Rename intel_unreference_dpll_crtc to intel_dpll_crtc_put in an effort to keep names of exported functions start with the filename. --v2 -Make the new name more sensible [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-11-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_[enable/disable]_dpll	Suraj Kandpal
	Rename intel_[enable/disable]_dpll to intel_dpll_[enable/disable] in an effort to make sure all functions that are exported start with the filename. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-10-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename crtc_get_shared_dpll	Suraj Kandpal
	Rename crtc_get_shared_dpll to take into the individual PLL framework which came in at DISPLAY_VER >= 14. Also having shared dpll stuff also in intel_dpll.c is just confusing. --v2 -Change naming to dpll_global to keep consistency with rest of the naming --v3 -Just use intel_dpll [Jani] --v4 -Modify commit message [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-9-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Move away from using shared dpll	Suraj Kandpal
	Rename functions to move away from using shared dpll in the dpll framework as much as possible since dpll may not always be shared. --v2 -Use intel_dpll_global instead of global_dpll [Jani] --v3 -Just use intel_dpll [Jani] --v4 -Drop the global from comments [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-8-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_shared_dpll	Suraj Kandpal
	Rename intel_shared_dpll to intel_dpll to represent both shared and individual dplls. Since from MTL each PHY has it's own PLL making the shared PLL naming a little outdated. In an effort to make this framework accepting of future changes this needs to be done. --v2 -Use intel_dpll_global to make sure names start with the filename [Jani/Ville] -Explain the need of this rename [Jani] --v3 -Just keep it intel_dpll [Jani] --v4 -Fix comment [Jani] -Use just num_dpll and dplls [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-7-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_shared_dpll_funcs	Suraj Kandpal
	Rename intel_shared_dpll_funcs to intel_dpll_funcs since it needs to represent both shared and individual dplls. --v2 -Change intel_global_dpll to intel_dpll_global to be more in line with the naming standard where the name should start with the file name [Jani] --v3 -Drop shared and global altogether [Jani] --v4 -Keep declarations sorted [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-6-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename macro for_each_shared_dpll	Suraj Kandpal
	Rename the macro for_each_shared_dpll to for_each_dpll since this loop will not necessarily be used for only shared dpll in future. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-5-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_shared_dpll_state	Suraj Kandpal
	Rename intel_shared_dpll_state to just intel_dpll_state since it may not necessarily store share dpll state info specially since DISPLAY_VER >= 14 PLL's are not shared. Also change the name of variables which may have been associated as a shared_dpll. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-4-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_dpll_funcs	Suraj Kandpal
	Rename intel_dpll_funcs to intel_dpll_global_funcs so that later on intel_shared_dpll_funcs can be renamed to intel_dpll_funcs. This is done to move away from the shared naming convention since starting MTL dpll's are not shared among PHYs. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-3-suraj.kandpal@intel.com
2025-05-20	drm/i915/dpll: Rename intel_dpll	Suraj Kandpal
	Rename intel_dpll to intel_dpll_global so that intel_shared_dpll can be renamed to intel_dpll in an effort to move away from the shared naming convention. Also intel_dpll according to it's comment tracks global dpll rather than individual hence making more sense this gets changed. Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250515071801.2221120-2-suraj.kandpal@intel.com
2025-05-20	PCI: Remove hybrid-devres usage warnings from kernel-doc	Philipp Stanner
	pci/iomap.c still contains warnings about those functions not behaving in a managed manner if pcim_enable_device() was called. Since all hybrid behavior that users could know about has been removed by now, those explicit warnings are no longer necessary. Remove the hybrid-devres usage warnings from the kernel-doc. Signed-off-by: Philipp Stanner <phasta@kernel.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20250519112959.25487-8-phasta@kernel.org
2025-05-20	PCI: Remove redundant set of request functions	Philipp Stanner
	When the demangling of the hybrid devres functions within PCI was implemented, it was necessary to implement several PCI functions a second time to avoid cyclic calls, since the hybrid functions in pci.c call the managed functions in devres.c, which in turn can be directly used outside of PCI and needed request infrastructure, too. Therefore, __pcim_request_region_range(), __pci_release_region_range() and wrappers around them were implemented. The hybrid nature has recently been removed from all functions in pci.c. Therefore, the functions in devres.c can now directly use their counterparts in pci.c without causing a call-cycle. Remove __pcim_request_region_range(), __pcim_request_region_range() and the wrappers. Use the corresponding request functions from pci.c in devres.c Signed-off-by: Philipp Stanner <phasta@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20250519112959.25487-7-phasta@kernel.org
2025-05-20	PCI: Remove exclusive requests flags from _pcim_request_region()	Philipp Stanner
	pcim_request_region_exclusive(), the only user in PCI devres that needed exclusive region requests, has been removed. All features related to exclusive requests can, therefore, be removed, too. Remove them. Signed-off-by: Philipp Stanner <phasta@kernel.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250519112959.25487-6-phasta@kernel.org
2025-05-20	arm64: dts: renesas: white-hawk-ard-audio: Fix TPU0 groups	Thuan Nguyen
	White Hawk ARD audio uses a clock generated by the TPU, but commit 3d144ef10a44 ("pinctrl: renesas: r8a779g0: Fix TPU suffixes") renamed pin group "tpu_to0_a" to "tpu_to0_b". Update DTS accordingly otherwise the sound driver does not receive a clock signal. Fixes: 3d144ef10a448f89 ("pinctrl: renesas: r8a779g0: Fix TPU suffixes") Signed-off-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn> Signed-off-by: Duy Nguyen <duy.nguyen.rh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/TYCPR01MB8740608B675365215ADB0374B49CA@TYCPR01MB8740.jpnprd01.prod.outlook.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-05-20	MAINTAINERS: Add entry for newly added EcoNet platform.	Caleb James DeLisle
	Add a MAINTAINERS entry as part of integration of the EcoNet MIPS platform. Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	mips: dts: Add EcoNet DTS with EN751221 and SmartFiber XP8421-B board	Caleb James DeLisle
	Add DTS files in support of EcoNet platform, including SmartFiber XP8421-B, a low cost commercially available board based on EN751221. Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	dt-bindings: vendor-prefixes: Add SmartFiber	Caleb James DeLisle
	Add "smartfiber" vendor prefix for manufactorer of EcoNet based boards. Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	mips: Add EcoNet MIPS platform support	Caleb James DeLisle
	Add platform support for EcoNet MIPS SoCs. Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	dt-bindings: mips: Add EcoNet platform binding	Caleb James DeLisle
	Document the top-level device tree binding for EcoNet MIPS-based SoCs. Signed-off-by: Caleb James DeLisle <cjd@cjdns.fr> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	gpiolib: remove unneeded #ifdef	Bartosz Golaszewski
	We are already within another `#ifdef CONFIG_GPIOLIB_IRQCHIP` in gpiochip_to_irq() so there's no need for another guard. Remove it. Acked-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-3-fe6ba1c6116d@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-20	gpio: mpc8xxx: select GPIOLIB_IRQCHIP	Bartosz Golaszewski
	This driver uses gpiochip_irq_reqres() and gpiochip_irq_relres() which are only built with GPIOLIB_IRQCHIP=y. Add the missing Kconfig select. Fixes: 7688a54d5b53 ("gpio: mpc8xxx: Make irq_chip immutable") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505180309.1nosQMkI-lkp@intel.com/ Acked-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-2-fe6ba1c6116d@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-20	gpio: pxa: select GPIOLIB_IRQCHIP	Bartosz Golaszewski
	This driver uses gpiochip_irq_reqres() and gpiochip_irq_relres() which are only built with GPIOLIB_IRQCHIP=y. Add the missing Kconfig select. Fixes: 20117cf426b6 ("gpio: pxa: Make irq_chip immutable") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505181429.mzyIatOU-lkp@intel.com/ Acked-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20250519-gpio-irq-kconfig-fixes-v1-1-fe6ba1c6116d@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-20	MIPS: bcm63xx: nvram: avoid inefficient use of crc32_le_combine()	Eric Biggers
	bcm963xx_nvram_checksum() was using crc32_le_combine() to update a CRC with four zero bytes. However, this is about 5x slower than just CRC'ing four zero bytes in the normal way. Just do that instead. (We could instead make crc32_le_combine() faster on short lengths. But all its callers do just fine without it, so I'd like to just remove it.) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	mips: dts: pic32: pic32mzda: Rename the sdhci nodename to match with common ↵	Charan Pedumuru
	mmc-controller binding Rename the sdhci nodename from "sdhci@" to "mmc@" to align with linux common mmc-controller binding. Signed-off-by: Charan Pedumuru <charan.pedumuru@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	MIPS: SMP: Move the AP sync point before the non-parallel aware functions	Gregory CLEMENT
	When CONFIG_HOTPLUG_PARALLEL is enabled, the code executing before cpuhp_ap_sync_alive() is executed in parallel, while after it is serialized. The functions set_cpu_sibling_map() and set_cpu_core_map() were not designed to be executed in parallel, so by moving the cpuhp_ap_sync_alive() before cpuhp_ap_sync_alive(), we then ensure they will be called serialized. The measurement done on EyeQ5 did not show any relevant boot time increase after applying this patch. Fixes: 76c43eb507bc ("MIPS: SMP: Implement parallel CPU bring up for EyeQ") Reported-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-05-20	ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ASP10	Ed Burcher
	Lenovo Yoga Pro 7 (gen 10) with Realtek ALC3306 and combined CS35L56 amplifiers need quirk ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN to enable bass Signed-off-by: Ed Burcher <git@edburcher.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250519224907.31265-2-git@edburcher.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-20	xfrm: use kfree_sensitive() for SA secret zeroization	Zilin Guan
	High-level copy_to_user_* APIs already redact SA secret fields when redaction is enabled, but the state teardown path still freed aead, aalg and ealg structs with plain kfree(), which does not clear memory before deallocation. This can leave SA keys and other confidential data in memory, risking exposure via post-free vulnerabilities. Since this path is outside the packet fast path, the cost of zeroization is acceptable and prevents any residual key material. This patch replaces those kfree() calls unconditionally with kfree_sensitive(), which zeroizes the entire buffer before freeing. Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-05-20	cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs	Mike Tipton
	Currently, all SCMI devices with performance domains attempt to register a cpufreq driver, even if their performance domains aren't used to control the CPUs. The cpufreq framework only supports registering a single driver, so only the first device will succeed. And if that device isn't used for the CPUs, then cpufreq will scale the wrong domains. To avoid this, return early from scmi_cpufreq_probe() if the probing SCMI device isn't referenced by the CPU device phandles. This keeps the existing assumption that all CPUs are controlled by a single SCMI device. Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	Merge branch 'rust/cpufreq-dt' into cpufreq/arm/linux-next	Viresh Kumar

2025-05-20	cpufreq: Add Rust-based cpufreq-dt driver	Viresh Kumar
	Introduce a Rust-based implementation of the cpufreq-dt driver, covering most of the functionality provided by the existing C version. Some features, such as retrieving platform data from `cpufreq-dt-platdev.c`, are still pending. The driver has been tested with QEMU, and frequency scaling works as expected. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: opp: Extend OPP abstractions with cpufreq support	Viresh Kumar
	Extend the OPP abstractions to include support for interacting with the cpufreq core, including the ability to retrieve frequency tables from OPP table. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: cpufreq: Extend abstractions for driver registration	Viresh Kumar
	Extend the cpufreq abstractions to support driver registration from Rust. Reviewed-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: cpufreq: Extend abstractions for policy and driver ops	Viresh Kumar
	Extend the cpufreq abstractions to include support for policy handling and driver operations. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: cpufreq: Add initial abstractions for cpufreq framework	Viresh Kumar
	Introduce initial Rust abstractions for the cpufreq core. This includes basic representations for cpufreq flags, relation types, and the cpufreq table. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-19	hwmon: (isl28022) Fix current reading calculation	Yikai Tsai
	According to the ISL28022 datasheet, bit15 of the current register is representing -32768. Fix the calculation to properly handle this bit, ensuring correct measurements for negative values. Signed-off-by: Yikai Tsai <yikai.tsai.wiwynn@gmail.com> Link: https://lore.kernel.org/r/20250519084055.3787-2-yikai.tsai.wiwynn@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-05-20	rust: opp: Add abstractions for the configuration options	Viresh Kumar
	Introduce Rust abstractions for the OPP core configuration options, enabling safe access to various configurable aspects of the OPP framework. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: opp: Add abstractions for the OPP table	Viresh Kumar
	Introduce Rust abstractions for `struct opp_table`, enabling access to OPP tables from Rust. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: opp: Add initial abstractions for OPP framework	Viresh Kumar
	Introduce initial Rust abstractions for the Operating Performance Points (OPP) framework. This includes bindings for `struct dev_pm_opp` and `struct dev_pm_opp_data`, laying the groundwork for further OPP integration. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: cpu: Add from_cpu()	Viresh Kumar
	This implements cpu::from_cpu(), which returns a reference to Device for a CPU. The C struct is created at initialization time for CPUs and is never freed and so ARef isn't returned from this function. The new helper will be used by Rust based cpufreq drivers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	rust: macros: enable use of hyphens in module names	Anisse Astier
	Some modules might need naming that contains hyphens "-" to match the auto-probing by name in the platform devices that comes from the device tree. But Rust identifier cannot contain hyphens, so replace them with underscores. Signed-off-by: Anisse Astier <anisse@astier.eu> Reviewed-by: Alice Ryhl <aliceryhl@google.com> [ Viresh: Replace "-" with '-', minor commit log fix, rename variable and fix line length checkpatch warnings ] Reviewed-by: Benno Lossin <lossin@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-05-20	nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk	Nilay Shroff
	In the NVMe context, the term "shutdown" has a specific technical meaning. To avoid confusion, this commit renames the nvme_mpath_ shutdown_disk function to nvme_mpath_remove_disk to better reflect its purpose (i.e. removing the disk from the system). However, nvme_mpath_remove_disk was already in use, and its functionality is related to releasing or putting the head node disk. To resolve this naming conflict and improve clarity, the existing nvme_mpath_ remove_disk function is also renamed to nvme_mpath_put_disk. This renaming improves code readability and better aligns function names with their actual roles. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20	nvme: introduce multipath_always_on module param	Nilay Shroff
	Currently, a multipath head disk node is not created for single- ported NVMe adapters or private namespaces with non-unique NSID. However, creating a head node in these cases can help transparently handle transient PCIe link failures. Without a head node, features like delayed removal cannot be leveraged, making it difficult to tolerate such link failures. To address this, this commit introduces nvme_core module parameter multipath_always_on. When multipath_always_on is set to true, it forces the creation of a multipath head node regardless NVMe disk or namespace type. So this option allows the use of delayed removal of head node functionality even for single-ported NVMe disks and private namespaces with a unique NSID and thus helps transparently handle transient PCIe link failures. By default multipath_always_on is set to false, thus preserving the existing behavior. Setting it to true enables improved fault tolerance in PCIe setups. Moreover, please note that enabling this option would also implicitly enable nvme_core.multipath. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20	nvme-multipath: introduce delayed removal of the multipath head node	Nilay Shroff
	Currently, the multipath head node of an NVMe disk is removed immediately as soon as all paths of the disk are removed. However, this can cause issues in scenarios where: - The disk hot-removal followed by re-addition. - Transient PCIe link failures that trigger re-enumeration, temporarily removing and then restoring the disk. In these cases, removing the head node prematurely may lead to a head disk node name change upon re-addition, requiring applications to reopen their handles if they were performing I/O during the failure. To address this, introduce a delayed removal mechanism of head disk node. During transient failure, instead of immediate removal of head disk node, the system waits for a configurable timeout, allowing the disk to recover. During transient disk failure, if application sends any IO then we queue it instead of failing such IO immediately. If the disk comes back online within the timeout, the queued IOs are resubmitted to the disk ensuring seamless operation. In case disk couldn't recover from the failure then queued IOs are failed to its completion and application receives the error. So this way, if disk comes back online within the configured period, the head node remains unchanged, ensuring uninterrupted workloads without requiring applications to reopen device handles. A new sysfs attribute, named "delayed_removal_secs" is added under head disk blkdev for user who wish to configure time for the delayed removal of head disk node. The default value of this attribute is set to zero second ensuring no behavior change unless explicitly configured. Link: https://lore.kernel.org/linux-nvme/Y9oGTKCFlOscbPc2@infradead.org/ Link: https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@kbusch-mbp.dhcp.thefacebook.com/ Suggested-by: Keith Busch <kbusch@kernel.org> Suggested-by: Christoph Hellwig <hch@infradead.org> [nilay: reworked based on the original idea/POC from Christoph and Keith] Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20	nvme-pci: derive and better document max segments limits	Christoph Hellwig
	Redefine the max segments and max integrity limits based on the limiting factors. This keeps exactly the same values for 4k PAGE_SIZE systems, but increases the number of segments for larger page size as it properly derives the scatterlist allocation based limit for them instead of assuming a 4k PAGE_SIZE. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2025-05-20	nvme-pci: use struct_size for allocation struct nvme_dev	Christoph Hellwig
	This avoids open coding the variable size array arithmetics. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20	nvme-pci: add a symolic name for the small pool size	Leon Romanovsky
	Open coding magic numbers in multiple places is never a good idea. Signed-off-by: Leon Romanovsky <leon@kernel.org> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
2025-05-20	nvme-pci: use a better encoding for small prp pool allocations	Christoph Hellwig
	Add a separate flag to encode that the transfer is using the small page sized pool, and use a normal 0..n count for the number of descriptors. Contains improvements and suggestions from Kanchan Joshi <joshi.k@samsung.com> and Leon Romanovsky <leon@kernel.org>. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20	nvme-pci: rename the descriptor pools	Christoph Hellwig
	They are used for both PRPs and SGLs, and we use descriptor elsewhere when referring to their allocations, so use that name here as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20	nvme-pci: remove struct nvme_descriptor	Christoph Hellwig
	There is no real point in having a union of two pointer types here, just use a void pointer as we mix and match types between the arms of the union between the allocation and freeing side already. Also rename the nr_allocations field to nr_descriptors to better describe what it does. Signed-off-by: Christoph Hellwig <hch@lst.de> [leon: ported forward to include metadata SGL support] Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2025-05-20	nvme-pci: store aborted state in flags variable	Leon Romanovsky
	Instead of keeping dedicated "bool aborted" variable, switch to a flags flags that can be used for other flags as well. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>