linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-09-17	crypto: ccp - Add new HV-Fixed page allocation/free API	Ashish Kalra
	When SEV-SNP is active, the TEE extended command header page and all output buffers for TEE extended commands (such as used by Seamless Firmware servicing support) must be in hypervisor-fixed state, assigned to the hypervisor and marked immutable in the RMP entrie(s). Add a new generic SEV API interface to allocate/free hypervisor fixed pages which abstracts hypervisor fixed page allocation/free for PSP sub devices. The API internally uses SNP_INIT_EX to transition pages to HV-Fixed page state. If SNP is not enabled then the allocator is simply a wrapper over alloc_pages() and __free_pages(). When the sub device free the pages, they are put on a free list and future allocation requests will try to re-use the freed pages from this list. But this list is not preserved across PSP driver load/unload hence this free/reuse support is only supported while PSP driver is loaded. As HV_FIXED page state is only changed at reboot, these pages are leaked as they cannot be returned back to the page allocator and then potentially allocated to guests, which will cause SEV-SNP guests to fail to start or terminate when accessing the HV_FIXED page. Suggested-by: Thomas Lendacky <Thomas.Lendacky@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/cover.1758057691.git.ashish.kalra@amd.com
2025-09-17	EDAC/mc_sysfs: Increase legacy channel support to 16	Avadhut Naik
	Newer AMD systems can support up to 16 channels per EDAC "mc" device. These are detected by the EDAC module running on the device, and the current EDAC interface is appropriately enumerated. The legacy EDAC sysfs interface however, provides device attributes for channels 0 through 11 only. Consequently, the last four channels, 12 through 15, will not be enumerated and will not be visible through the legacy sysfs interface. Add additional device attributes to ensure that all 16 channels, if present, are enumerated by and visible through the legacy EDAC sysfs interface. Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-17	EDAC/amd64: Add support for AMD family 1Ah-based newer models	Avadhut Naik
	Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and C0h-C7h. Also, raise the maximum memory controllers number as those machines support that many. Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-17	x86/cpu: Rename and move CPU model entry for Diamond Rapids	Tony Luck
	This model was added as INTEL_PANTHERCOVE_X (based on the name of the core) with a comment that the platform name is Diamond Rapids. It was also placed at the end of the file in a new section for family 19 processors. This is different from previous naming as Andrew Cooper noted. PeterZ agreed and posted a patch[1] to fix the name and move it in sequence with other Xeon servers. But without a commit description or sign-off the patch wasn't ever applied. Patch updated to cover one additional use of the #define by turbostat and to change the "Family 6" comment to also list 18 and 19 since new models in these families are mixed in with family 6. Originally-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/all/20250214130205.GK14028@noisy.programming.kicks-ass.net/ # [1]
2025-09-17	net/mlx5e: Prevent WQE metadata conflicts between timestamping and offloads	Carolina Jubran
	Update the WQE metadata assignment to avoid overriding existing metadata when setting the sysport timestamp ID. Since timestamp IDs are limited to 256 values, they use only the lower 8 bits of the metadata field. To avoid conflicts, move IPsec and MACsec metadata ID to bits 8 and 9, and shift the MACsec fs_id accordingly. This ensures safe coexistence of timestamping and offload features that use the same metadata field. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757574619-604874-4-git-send-email-tariqt@nvidia.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-17	net/mlx5: Refactor MACsec WQE metadata shifts	Carolina Jubran
	Introduce MLX5_ETH_WQE_FT_META_SHIFT as a shared base offset for features that use the lower 8 bits of the WQE flow_table_metadata field, currently used for timestamping, IPsec, and MACsec. Define MLX5_ETH_WQE_FT_META_MACSEC_FS_ID_MASK so that fs_id occupies bits 2–5, making it clear that fs_id occupies bits in the metadata. Set MLX5_ETH_WQE_FT_META_MACSEC_MASK as the OR of the MACsec flag and MLX5_ETH_WQE_FT_META_MACSEC_FS_ID_MASK, corresponding to the original 0x3E mask. Update the fs_id macro to right-shift the MACsec flag by MLX5_ETH_WQE_FT_META_SHIFT and update the RoCE modify-header action to use it. Introduce the helper macro MLX5_MACSEC_TX_METADATA(fs_id) to compose the full shifted MACsec metadata value. These changes make it explicit exactly which metadata bits carry MACsec information, simplifying future feature exclusions when multiple features share the WQE flowtable metadata. In addition, drop the incorrect “RX flow steering” comment, since this applies to TX flow steering. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757574619-604874-3-git-send-email-tariqt@nvidia.com Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-09-17	Merge tag 'drm-rust-next-2025-09-16' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/rust/kernel into drm-next DRM Rust changes for v6.18 Alloc - Add BorrowedPage type and AsPageIter trait - Implement Vmalloc::to_page() and VmallocPageIter - Implement AsPageIter for VBox and VVec DMA & Scatterlist - Add dma::DataDirection and type alias for dma_addr_t - Abstraction for struct scatterlist and struct sg_table DRM - In the DRM GEM module, simplify overall use of generics, add DriverFile type alias and drop Object::SIZE. Nova (Core) - Various register!() macro improvements (paving the way for lifting it to common driver infrastructure) - Minor VBios fixes and refactoring - Minor firmware request refactoring - Advance firmware boot stages; process Booter and patch its signature, process GSP and GSP bootloader - Switch development fimrware version to r570.144 - Add basic firmware bindings for r570.144 - Move GSP boot code to its own module - Clean up and take advantage of pin-init features to store most of the driver's private data within a single allocation - Update ARef import from sync::aref - Add website to MAINTAINERS entry Nova (DRM) - Update ARef import from sync::aref - Add website to MAINTAINERS entry Pin-Init - Merge pin-init PR from Benno - `#[pin_data]` now generates a `Projection` struct similar to the `pin-project` crate. - Add initializer code blocks to `[try_][pin_]init!` macros: make initializer macros accept any number of `_: {/ arbitrary code */},` & make them run the code at that point. - Make the `[try_][pin_]init!` macros expose initialized fields via a `let` binding as `&mut T` or `Pin<&mut T>` for later fields. Rust - Various methods for AsBytes and FromBytes traits Tyr - Initial Rust driver skeleton for ARM Mali GPUs. - It can power up the GPU, query for GPU metatdata through MMIO and provide the metadata to userspace via DRM device IOCTL (struct drm_panthor_dev_query). Signed-off-by: Dave Airlie <airlied@redhat.com> From: "Danilo Krummrich" <dakr@kernel.org> Link: https://lore.kernel.org/r/DCUC4SY6SRBD.1ZLHAIQZOC6KG@kernel.org
2025-09-16	net/mlx5: Lag, add net namespace support	Shay Drory
	Update the LAG implementation to support net namespace isolation. Recent devcom changes added namespace-aware client matching. Align LAG with this model so that hardware LAG forms only between mlx5 interfaces that share the same network namespace. This avoids cross-namespace interference and matches user expectations when devices are placed in different netns. Make LAG netns-aware by storing the device’s namespace in mlx5_lag and registering the devcom client with that namespace. As a result, only peers in the same netns are eligible to form a LAG. Adjust reload handling so LAG teardown/re-evaluation happens in the correct namespace context. Remove the blanket restriction that prevented devlink reload when LAG was active. Remove the reload restriction here allowing devlink reload in LAG mode is part of delivering complete netns aware LAG support: With per-netns devcom registration, reload no longer risks cross-namespace coupling. The devcom client is torn down and re-registered in the device’s current netns, and LAG is re-evaluated within that scope. The change is trivial and self-contained, and keeping it in this patch avoids splitting a feature that is functionally one unit. Only devices in same netns can form hardware LAG. devlink reload no longer fails just because LAG is active. LAG is torn down/re-created as needed within the correct namespace. No change for setups that don’t use namespaces. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757940070-618661-5-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net/mlx5: Add net namespace support to devcom	Shay Drory
	Extend the devcom framework to support namespace-aware components. The existing devcom matching logic was based solely on numeric keys, limiting its use to the global (init_net) scope or requiring clients to ignore namespaces altogether, both of which are incorrect in multi-namespace environments. This patch introduces namespace support by allowing devcom clients to provide a namespace match attribute. The devcom pairing mechanism is updated to compare the namespace, enabling proper isolation and interaction of components across different net namespaces. With this change, components that require namespace aware pairing, such as SD groups or LAG, can now work correctly in multi-namespace scenarios. In particular, this opens the way to support hardware LAG within a net namespace. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757940070-618661-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net/mlx5: Lag, move devcom registration to LAG layer	Shay Drory
	Move the devcom registration for the HCA_PORTS component from the core initialization path into the LAG logic. This better reflects the logical ownership of this component and ensures proper alignment with the LAG lifecycle. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757940070-618661-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net/mlx5: Refactor devcom to use match attributes	Shay Drory
	Refactor the devcom interface to use a match attribute structure instead of passing raw keys. This change lays the groundwork for extending devcom matching logic with additional fields like net namespace, improving its flexibility and robustness. No functional changes. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757940070-618661-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net/mlx5e: Add a miss level for ipsec crypto offload	Lama Kayal
	The cited commit adds a miss table for switchdev mode. But it uses the same level as policy table. Will hit the following error when running command: # ip xfrm state add src 192.168.1.22 dst 192.168.1.21 proto \ esp spi 1001 reqid 10001 aead 'rfc4106(gcm(aes))' \ 0x3a189a7f9374955d3817886c8587f1da3df387ff 128 \ mode tunnel offload dev enp8s0f0 dir in Error: mlx5_core: Device failed to offload this state. The dmesg error is: mlx5_core 0000:03:00.0: ipsec_miss_create:578:(pid 311797): fail to create IPsec miss_rule err=-22 Fix it by adding a new miss level to avoid the error. Fixes: 7d9e292ecd67 ("net/mlx5e: Move IPSec policy check after decryption") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Lama Kayal <lkayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757939074-617281-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net/mlx5e: Harden uplink netdev access against device unbind	Jianbo Liu
	The function mlx5_uplink_netdev_get() gets the uplink netdevice pointer from mdev->mlx5e_res.uplink_netdev. However, the netdevice can be removed and its pointer cleared when unbound from the mlx5_core.eth driver. This results in a NULL pointer, causing a kernel panic. BUG: unable to handle page fault for address: 0000000000001300 at RIP: 0010:mlx5e_vport_rep_load+0x22a/0x270 [mlx5_core] Call Trace: <TASK> mlx5_esw_offloads_rep_load+0x68/0xe0 [mlx5_core] esw_offloads_enable+0x593/0x910 [mlx5_core] mlx5_eswitch_enable_locked+0x341/0x420 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x17e/0x3a0 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x183/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 do_syscall_64+0x53/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Ensure the pointer is valid before use by checking it for NULL. If it is valid, immediately call netdev_hold() to take a reference, and preventing the netdevice from being freed while it is in use. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757939074-617281-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-17	firewire: core: shrink critical section of fw_card spinlock in bm_work	Takashi Sakamoto
	Now fw_core_handle_bus_reset() and bm_work() are serialized. Some members of fw_card are free to access in bm_work() This commit shrinks critical section of fw_card spinlock in bm_work() Link: https://lore.kernel.org/r/20250917000347.52369-4-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2025-09-17	firewire: core: disable bus management work temporarily during updating topology	Takashi Sakamoto
	When processing selfID sequence, bus topology tree is (re)built, and some members of fw_card are determined. Once determined, the members are valid during the bus generation. The above operations are in the critical section of fw_card spin lock. Before building the bus topology, a work item is scheduled for bus manager work. The bm_work() function is invoked by the work item. The function tries to acquire the spin lock, then can be stalled until the bus topology building finishes. The bus manager should work once the members of fw_card are determined. This commit suppresses the above situation by disabling the work item during processing selfID sequence. Link: https://lore.kernel.org/r/20250917000347.52369-3-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2025-09-17	firewire: core: schedule bm_work item outside of spin lock	Takashi Sakamoto
	Before (re)building topology tree, fw_core_handle_bus_reset() schedules a work item under acquiring fw_card spin lock. The work item invokes bm_work() which acquires the spin lock at first, then can be stalled to wait until the building tree finishes. This is inconvenient. This commit moves the timing to schedule the work item after releasing the spin lock. Link: https://lore.kernel.org/r/20250917000347.52369-2-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2025-09-16	net: dsa: mv88e6xxx: clean up PTP clock during setup failure	Russell King (Oracle)
	If an error occurs during mv88e6xxx_setup() and the PTP clock has been registered, the clock will not be unregistered as mv88e6xxx_ptp_free() will not be called. mv88e6xxx_hwtstamp_free() also is not called. As mv88e6xxx_ptp_free() can cope with being called without a successful call to mv88e6xxx_ptp_setup(), and mv88e6xxx_hwtstamp_free() is empty, add both these _free() calls to the error cleanup paths in mv88e6xxx_setup(). Moreover, mv88e6xxx_teardown() should teardown setup done in mv88e6xxx_setup() - see dsa_switch_setup(). However, instead _free() are called from mv88e6xxx_remove() function that is only called when a device is unbound, which omits cleanup should a failure occur later in dsa_switch_setup(). Move the *_free() calls from mv88e6xxx_remove() to mv88e6xxx_teardown(). Note that mv88e6xxx_ptp_setup() must be called holding the reg_lock, but mv88e6xxx_ptp_free() must never be. This is especially true after commit "ptp: rework ptp_clock_unregister() to disable events". This patch does not change this, but adds a comment to that effect. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/E1uy84w-00000005Spi-46iF@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	net: mvpp2: add support for hardware timestamps	Russell King
	Add support for hardware timestamps in (e.g.) the PHY by calling skb_tx_timestamp() as close as reasonably possible to the point that the hardware is instructed to send the queued packets. As this also introduces software timestamping support, report those capabilities via the .get_ts_info() method. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1uy82E-00000005Sll-0SSy@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	libie: fix linking with libie_{adminq,fwlog} when CONFIG_LIBIE=n	Alexander Lobakin
	Initially, libie contained only 1 module and I assumed that new modules in its folder would depend on it. However, Michał did a good job and libie_{adminq,fwlog} are completely independent, but libie/ is still traversed by Kbuild only under CONFIG_LIBIE != n. This results in undefined references with certain kernel configs. Tell Kbuild to always descend to libie/ to be able to build each module regardless of whether the basic one is enabled. If none of CONFIG_LIBIE* is set, Kbuild will just create an empty built-in.a there with no side effects. Fixes: 641585bc978e ("ixgbe: fwlog support for e610") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/202509140606.j8z3rE73-lkp@intel.com Reported-by: Breno Leitao <leitao@debian.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Closes: https://lore.kernel.org/CA+G9fYvH8d6pJRbHpOCMZFjgDCff3zcL_AsXL-nf5eB2smS8SA@mail.gmail.com Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250916160118.2209412-1-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-16	drm/xe: Allow error injection for xe_pxp_exec_queue_add	Daniele Ceraolo Spurio
	This will allow us to simulate this function returning an error like we do for other functions called in the exec_queue_create path. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-4-daniele.ceraolospurio@intel.com
2025-09-16	drm/xe: Fix error handling if PXP fails to start	Daniele Ceraolo Spurio
	Since the PXP start comes after __xe_exec_queue_init() has completed, we need to cleanup what was done in that function in case of a PXP start error. __xe_exec_queue_init calls the submission backend init() function, so we need to introduce an opposite for that. Unfortunately, while we already have a fini() function pointer, it performs other operations in addition to cleaning up what was done by the init(). Therefore, for clarity, the existing fini() has been renamed to destroy(), while a new fini() has been added to only clean up what was done by the init(), with the latter being called by the former (via xe_exec_queue_fini). Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com
2025-09-16	drm/amd: Only restore cached manual clock settings in restore if OD enabled	Mario Limonciello
	If OD is not enabled then restoring cached clock settings doesn't make sense and actually leads to errors in resume. Check if enabled before restoring settings. Fixes: 4e9526924d09 ("drm/amd: Restore cached manual clock settings during resume") Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733 Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1a4dd33cc6e1baaa81efdbe68227a19f51c50f20) Cc: stable@vger.kernel.org
2025-09-16	drm/amdgpu: re-order and document VM code	Christian König
	Re-order fields in the VM structure and try to improve the documentation a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amdgpu: remove check for BO reservation add assert instead	Christian König
	We should leave such checks to lockdep and not implement something manually. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Update pmfw headers for smu_v13_0_12	Asad Kamal
	Update pmfw headers for smu_v13_0_12 to include node power limit Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Rename amdgpu_hwmon_get_sensor_generic	Asad Kamal
	Rename amdgpu_hwmon_get_sensor_generic to use for generic pm interfaces Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd: Only restore cached manual clock settings in restore if OD enabled	Mario Limonciello
	If OD is not enabled then restoring cached clock settings doesn't make sense and actually leads to errors in resume. Check if enabled before restoring settings. Fixes: 4e9526924d09 ("drm/amd: Restore cached manual clock settings during resume") Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733 Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the V14_0_2 smu	Rodrigo Siqueira
	The I2C init for V14_0_2 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that V14_0_2 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the V13_0_6 smu	Rodrigo Siqueira
	The I2C init for V13_0_6 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that V13_0_6 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the V13 smu	Rodrigo Siqueira
	The I2C init for SMU_V13 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that SMU_V13 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the Sienna smu	Rodrigo Siqueira
	The I2C init for Sienna Cichlid uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Sienna Cichlid init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the Navi10 smu	Rodrigo Siqueira
	The I2C init for Navi10 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Navi10 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the Arcturus smu	Rodrigo Siqueira
	The I2C init for Arcturus uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Arcturus init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/pm: Use devm_i2c_add_adapter() in the i2c init	Rodrigo Siqueira
	Instead of using i2c_add_adapter() and i2c_del_adapter(), replace them with devm_i2c_add_adapter() to simplify the i2c logic. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amdgpu: Use devm_i2c_add_adapter() in SMU V11	Rodrigo Siqueira
	Instead of using i2c_add_adapter() and i2c_del_adapter() in the SMU V11, use devm_i2c_add_adapter() to simplify the code path. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amdgpu/amdgpu_i2c: Use devm_i2c_add_adapter instead of i2c_add_adapter	Rodrigo Siqueira
	This commit replaces i2c_add_adapter() with devm_i2c_add_adapter() and removes part of the cleanup logic since the new function handles the i2c removal. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/display: Use devm_i2c_add_adapter to simplify i2c cleanup logic	Rodrigo Siqueira
	This commit replaces the utilization of i2c_add/del_adapter() with devm_i2c_add_adapter() to reduce the amount of boilerplate. Using devm_i2c_add_adapter() has the advantage of removing the manual manipulation of the I2C adapter. Suggested-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amd/display: Use kmalloc_array() instead of kmalloc()	James Flowers
	Documentation/process/deprecated.rst recommends against the use of kmalloc with dynamic size calculations due to the risk of overflow and smaller allocation being made than the caller was expecting. This could lead to buffer overflow in code similar to the memcpy in amdgpu_dm_plane_add_modifier(). Signed-off-by: James Flowers <bold.zone2373@fastmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amdgpu: fix userq VM validation v4	Christian König
	That was actually complete nonsense and not validating the BOs at all. The code just cleared all VM areas were it couldn't grab the lock for a BO. Try to fix this. Only compile tested at the moment. v2: fix fence slot reservation as well as pointed out by Sunil. also validate PDs, PTs, per VM BOs and update PDEs v3: grab the status_lock while working with the done list. v4: rename functions, add some comments, fix waiting for updates to complete. v4: rename amdgpu_vm_lock_done_list(), add some more comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	drm/amdgpu: reject gang submissions under SRIOV	Christian König
	Gang submission means that the kernel driver guarantees that multiple submissions are executed on the HW at the same time on different engines. Background is that those submissions then depend on each other and each can't finish stand alone. SRIOV now uses world switch to preempt submissions on the engines to allow sharing the HW resources between multiple VFs. The problem is now that the SRIOV world switch can't know about such inter dependencies and will cause a timeout if it waits for a partially running gang submission. To conclude SRIOV and gang submissions are fundamentally incompatible at the moment. For now just disable them. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-16	igc: don't fail igc_probe() on LED setup error	Kohei Enju
	When igc_led_setup() fails, igc_probe() fails and triggers kernel panic in free_netdev() since unregister_netdev() is not called. [1] This behavior can be tested using fault-injection framework, especially the failslab feature. [2] Since LED support is not mandatory, treat LED setup failures as non-fatal and continue probe with a warning message, consequently avoiding the kernel panic. [1] kernel BUG at net/core/dev.c:12047! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:free_netdev+0x278/0x2b0 [...] Call Trace: <TASK> igc_probe+0x370/0x910 local_pci_probe+0x3a/0x80 pci_device_probe+0xd1/0x200 [...] [2] #!/bin/bash -ex FAILSLAB_PATH=/sys/kernel/debug/failslab/ DEVICE=0000:00:05.0 START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \ \| awk '{printf("0x%s", $1)}') END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100))) echo $START_ADDR > $FAILSLAB_PATH/require-start echo $END_ADDR > $FAILSLAB_PATH/require-end echo 1 > $FAILSLAB_PATH/times echo 100 > $FAILSLAB_PATH/probability echo N > $FAILSLAB_PATH/ignore-gfp-wait echo $DEVICE > /sys/bus/pci/drivers/igc/bind Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16	ixgbe: destroy aci.lock later within ixgbe_remove path	Jedrzej Jagielski
	There's another issue with aci.lock and previous patch uncovers it. aci.lock is being destroyed during removing ixgbe while some of the ixgbe closing routines are still ongoing. These routines use Admin Command Interface which require taking aci.lock which has been already destroyed what leads to call trace. [ +0.000004] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ +0.000007] WARNING: CPU: 12 PID: 10277 at kernel/locking/mutex.c:155 mutex_lock+0x5f/0x70 [ +0.000002] Call Trace: [ +0.000003] <TASK> [ +0.000006] ixgbe_aci_send_cmd+0xc8/0x220 [ixgbe] [ +0.000049] ? try_to_wake_up+0x29d/0x5d0 [ +0.000009] ixgbe_disable_rx_e610+0xc4/0x110 [ixgbe] [ +0.000032] ixgbe_disable_rx+0x3d/0x200 [ixgbe] [ +0.000027] ixgbe_down+0x102/0x3b0 [ixgbe] [ +0.000031] ixgbe_close_suspend+0x28/0x90 [ixgbe] [ +0.000028] ixgbe_close+0xfb/0x100 [ixgbe] [ +0.000025] __dev_close_many+0xae/0x220 [ +0.000005] dev_close_many+0xc2/0x1a0 [ +0.000004] ? kernfs_should_drain_open_files+0x2a/0x40 [ +0.000005] unregister_netdevice_many_notify+0x204/0xb00 [ +0.000006] ? __kernfs_remove.part.0+0x109/0x210 [ +0.000006] ? kobj_kset_leave+0x4b/0x70 [ +0.000008] unregister_netdevice_queue+0xf6/0x130 [ +0.000006] unregister_netdev+0x1c/0x40 [ +0.000005] ixgbe_remove+0x216/0x290 [ixgbe] [ +0.000021] pci_device_remove+0x42/0xb0 [ +0.000007] device_release_driver_internal+0x19c/0x200 [ +0.000008] driver_detach+0x48/0x90 [ +0.000003] bus_remove_driver+0x6d/0xf0 [ +0.000006] pci_unregister_driver+0x2e/0xb0 [ +0.000005] ixgbe_exit_module+0x1c/0xc80 [ixgbe] Same as for the previous commit, the issue has been highlighted by the commit 337369f8ce9e ("locking/mutex: Add MUTEX_WARN_ON() into fast path"). Move destroying aci.lock to the end of ixgbe_remove(), as this simply fixes the issue. Fixes: 4600cdf9f5ac ("ixgbe: Enable link management in E610 device") Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16	ixgbe: initialize aci.lock before it's used	Jedrzej Jagielski
	Currently aci.lock is initialized too late. A bunch of ACI callbacks using the lock are called prior it's initialized. Commit 337369f8ce9e ("locking/mutex: Add MUTEX_WARN_ON() into fast path") highlights that issue what results in call trace. [ 4.092899] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 4.092910] WARNING: CPU: 0 PID: 578 at kernel/locking/mutex.c:154 mutex_lock+0x6d/0x80 [ 4.098757] Call Trace: [ 4.098847] <TASK> [ 4.098922] ixgbe_aci_send_cmd+0x8c/0x1e0 [ixgbe] [ 4.099108] ? hrtimer_try_to_cancel+0x18/0x110 [ 4.099277] ixgbe_aci_get_fw_ver+0x52/0xa0 [ixgbe] [ 4.099460] ixgbe_check_fw_error+0x1fc/0x2f0 [ixgbe] [ 4.099650] ? usleep_range_state+0x69/0xd0 [ 4.099811] ? usleep_range_state+0x8c/0xd0 [ 4.099964] ixgbe_probe+0x3b0/0x12d0 [ixgbe] [ 4.100132] local_pci_probe+0x43/0xa0 [ 4.100267] work_for_cpu_fn+0x13/0x20 [ 4.101647] </TASK> Move aci.lock mutex initialization to ixgbe_sw_init() before any ACI command is sent. Along with that move also related SWFW semaphore in order to reduce size of ixgbe_probe() and that way all locks are initialized in ixgbe_sw_init(). Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Fixes: 4600cdf9f5ac ("ixgbe: Enable link management in E610 device") Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16	i40e: remove redundant memory barrier when cleaning Tx descs	Maciej Fijalkowski
	i40e has a feature which writes to memory location last descriptor successfully sent. Memory barrier in i40e_clean_tx_irq() was used to avoid forward-reading descriptor fields in case DD bit was not set. Having mentioned feature in place implies that such situation will not happen as we know in advance how many descriptors HW has dealt with. Besides, this barrier placement was wrong. Idea is to have this protection after reading DD bit from HW descriptor, not before. Digging through git history showed me that indeed barrier was before DD bit check, anyways the commit introducing i40e_get_head() should have wiped it out altogether. Also, there was one commit doing s/read_barrier_depends/smp_rmb when get head feature was already in place, but it was only theoretical based on ixgbe experiences, which is different in these terms as that driver has to read DD bit from HW descriptor. Fixes: 1943d8ba9507 ("i40e/i40evf: enable hardware feature head write back") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16	ice: fix Rx page leak on multi-buffer frames	Jacob Keller
	The ice_put_rx_mbuf() function handles calling ice_put_rx_buf() for each buffer in the current frame. This function was introduced as part of handling multi-buffer XDP support in the ice driver. It works by iterating over the buffers from first_desc up to 1 plus the total number of fragments in the frame, cached from before the XDP program was executed. If the hardware posts a descriptor with a size of 0, the logic used in ice_put_rx_mbuf() breaks. Such descriptors get skipped and don't get added as fragments in ice_add_xdp_frag. Since the buffer isn't counted as a fragment, we do not iterate over it in ice_put_rx_mbuf(), and thus we don't call ice_put_rx_buf(). Because we don't call ice_put_rx_buf(), we don't attempt to re-use the page or free it. This leaves a stale page in the ring, as we don't increment next_to_alloc. The ice_reuse_rx_page() assumes that the next_to_alloc has been incremented properly, and that it always points to a buffer with a NULL page. Since this function doesn't check, it will happily recycle a page over the top of the next_to_alloc buffer, losing track of the old page. Note that this leak only occurs for multi-buffer frames. The ice_put_rx_mbuf() function always handles at least one buffer, so a single-buffer frame will always get handled correctly. It is not clear precisely why the hardware hands us descriptors with a size of 0 sometimes, but it happens somewhat regularly with "jumbo frames" used by 9K MTU. To fix ice_put_rx_mbuf(), we need to make sure to call ice_put_rx_buf() on all buffers between first_desc and next_to_clean. Borrow the logic of a similar function in i40e used for this same purpose. Use the same logic also in ice_get_pgcnts(). Instead of iterating over just the number of fragments, use a loop which iterates until the current index reaches to the next_to_clean element just past the current frame. Unlike i40e, the ice_put_rx_mbuf() function does call ice_put_rx_buf() on the last buffer of the frame indicating the end of packet. For non-linear (multi-buffer) frames, we need to take care when adjusting the pagecnt_bias. An XDP program might release fragments from the tail of the frame, in which case that fragment page is already released. Only update the pagecnt_bias for the first descriptor and fragments still remaining post-XDP program. Take care to only access the shared info for fragmented buffers, as this avoids a significant cache miss. The xdp_xmit value only needs to be updated if an XDP program is run, and only once per packet. Drop the xdp_xmit pointer argument from ice_put_rx_mbuf(). Instead, set xdp_xmit in the ice_clean_rx_irq() function directly. This avoids needing to pass the argument and avoids an extra bit-wise OR for each buffer in the frame. Move the increment of the ntc local variable to ensure its updated before all calls to ice_get_pgcnts() or ice_put_rx_mbuf(), as the loop logic requires the index of the element just after the current frame. Now that we use an index pointer in the ring to identify the packet, we no longer need to track or cache the number of fragments in the rx_ring. Cc: Christoph Petrausch <christoph.petrausch@deepl.com> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Closes: https://lore.kernel.org/netdev/CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com/ Fixes: 743bbd93cf29 ("ice: put Rx buffers after being done with current frame") Tested-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Tested-by: Priya Singh <priyax.singh@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-16	drm/xe: Remove duplicate header files	Yang Li
	Fix some duplicate includes in xe: ./drivers/gpu/drm/xe/xe_tlb_inval.c: xe_tlb_inval.h is included more than once. ./drivers/gpu/drm/xe/xe_pt.c: xe_tlb_inval_job.h is included more than once. While at it, also sort the include lines alphabetically. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24705 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24706 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> [Reword commit message] Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916021039.1632766-1-yang.lee@linux.alibaba.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-09-16	drm/xe/guc: Return an error code if the GuC load fails	John Harrison
	Due to multiple explosion issues in the early days of the Xe driver, the GuC load was hacked to never return a failure. That prevented kernel panics and such initially, but now all it achieves is creating more confusing errors when the driver tries to submit commands to a GuC it already knows is not there. So fix that up. As a stop-gap and to help with debug of load failures due to invalid GuC init params, a wedge call had been added to the inner GuC load function. The reason being that it leaves the GuC log accessible via debugfs. However, for an end user, simply aborting the module load is much cleaner than wedging and trying to continue. The wedge blocks user submissions but it seems that various bits of the driver itself still try to submit to a dead GuC and lots of subsequent errors occur. And with regards to developers debugging why their particular code change is being rejected by the GuC, it is trivial to either add the wedge back in and hack the return code to zero again or to just do a GuC log dump to dmesg. v2: Add support for error injection testing and drop the now redundant wedge call. CC: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://lore.kernel.org/r/20250909224132.536320-1-John.C.Harrison@Intel.com
2025-09-16	nvme-pci: Add TUXEDO IBS Gen8 to Samsung sleep quirk	Georg Gottleuber
	On the TUXEDO InfinityBook S Gen8, a Samsung 990 Evo NVMe leads to a high power consumption in s2idle sleep (3.5 watts). This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with a lower power consumption, typically around 1 watts. Signed-off-by: Georg Gottleuber <ggo@tuxedocomputers.com> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: stable@vger.kernel.org Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-09-16	ACPI: property: Adjust failure handling in acpi_nondev_subnode_extract()	Rafael J. Wysocki
	Make acpi_nondev_subnode_extract() follow the usual code flow pattern in which failure is handled at the point where it is detected. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
2025-09-16	ACPI: property: Do not pass NULL handles to acpi_attach_data()	Rafael J. Wysocki
	In certain circumstances, the ACPI handle of a data-only node may be NULL, in which case it does not make sense to attempt to attach that node to an ACPI namespace object, so update the code to avoid attempts to do so. This prevents confusing and unuseful error messages from being printed. Also document the fact that the ACPI handle of a data-only node may be NULL and when that happens in a code comment. In addition, make acpi_add_nondev_subnodes() print a diagnostic message for each data-only node with an unknown ACPI namespace scope. Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles") Cc: 6.0+ <stable@vger.kernel.org> # 6.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>