summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-29Merge tag 'drm-xe-next-2025-04-28-1' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Core Changes: - Add drm_coredump_printer_is_full() (Matt Brost) Driver Changes: - Do not queue unneeded terminations from debugfs (Daniele) - Fix out-of-bound while enabling engine activity stats (Michal) - Use GT oriented message to report engine activity error (Michal) - Some fault-injection additions (Satyanarayana) - Fix an error pointer dereference (Harshit) - Fix capture of steering registers (John) - Use the steering flag when printing registers (John) - Cache DSS info when creating capture register list (John) - Backup VRAM in PM notifier instead of in the suspend / freeze callbacks (Matt Auld) - Fix CFI violation when accessing sysfs files (Jeevaka) - Fix kernel version docs for temperature and fan speed (Lucas) - Add devcoredump chunking (Matt Brost) - Update xe_ttm_access_memory to use GPU for non-visible access (Matt Brost) - Abort printing coredump in VM printer output if full (Matt Brost) - Resolve a possible circular locking dependency (Harish) - Don't support EU stall on SRIOV VF (Harish) - Drop force_alloc from xe_bo_evict in selftests (Matt Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aA-mvTb6s909V8hu@fedora
2025-04-28Merge drm/drm-next into drm-xe-nextThomas Hellström
Additional backmerge to avoid excessive diffstats when sending PR. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-04-27drm/xe: Drop force_alloc from xe_bo_evict in selftestsMatthew Brost
The force_alloc flag was removed from TTM / Xe but updating the selftests to new function interfaces was missed. Remove argument from xe_bo_evict in selftests. v2: - Fix dma-buf, migrate selftests (CI) Fixes: 55df7c0c62c1 ("drm/ttm/xe: drop unused force_alloc flag") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://lore.kernel.org/r/20250428022318.877860-1-matthew.brost@intel.com
2025-04-25drm/xe/eustall: Do not support EU stall on SRIOV VFHarish Chegondi
EU stall sampling is not supported on SRIOV VF. Do not initialize or open EU stall stream on SRIOV VF. Fixes: 9a0b11d4cf3b ("drm/xe/eustall: Add support to init, enable and disable EU stall sampling") Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/10db5d1c7e17aadca7078ff74575b7ffc0d5d6b8.1745215022.git.harish.chegondi@intel.com
2025-04-25drm/xe/eustall: Resolve a possible circular locking dependencyHarish Chegondi
Use a separate lock in the polling function eu_stall_data_buf_poll() instead of eu_stall->stream_lock. This would prevent a possible circular locking dependency leading to a deadlock as described below. This would also require additional locking with the new lock in the read function. <4> [787.192986] ====================================================== <4> [787.192988] WARNING: possible circular locking dependency detected <4> [787.192991] 6.14.0-rc7-xe+ #1 Tainted: G U <4> [787.192993] ------------------------------------------------------ <4> [787.192994] xe_eu_stall/20093 is trying to acquire lock: <4> [787.192996] ffff88819847e2c0 ((work_completion) (&(&stream->buf_poll_work)->work)), at: __flush_work+0x1f8/0x5e0 <4> [787.193005] but task is already holding lock: <4> [787.193007] ffff88814ce83ba8 (&gt->eu_stall->stream_lock){3:3}, at: xe_eu_stall_stream_ioctl+0x41/0x6a0 [xe] <4> [787.193090] which lock already depends on the new lock. <4> [787.193093] the existing dependency chain (in reverse order) is: <4> [787.193095] -> #1 (&gt->eu_stall->stream_lock){+.+.}-{3:3}: <4> [787.193099] __mutex_lock+0xb4/0xe40 <4> [787.193104] mutex_lock_nested+0x1b/0x30 <4> [787.193106] eu_stall_data_buf_poll_work_fn+0x44/0x1d0 [xe] <4> [787.193155] process_one_work+0x21c/0x740 <4> [787.193159] worker_thread+0x1db/0x3c0 <4> [787.193161] kthread+0x10d/0x270 <4> [787.193164] ret_from_fork+0x44/0x70 <4> [787.193168] ret_from_fork_asm+0x1a/0x30 <4> [787.193172] -> #0 ((work_completion)(&(&stream->buf_poll_work)->work)){+.+.}-{0:0}: <4> [787.193176] __lock_acquire+0x1637/0x2810 <4> [787.193180] lock_acquire+0xc9/0x300 <4> [787.193183] __flush_work+0x219/0x5e0 <4> [787.193186] cancel_delayed_work_sync+0x87/0x90 <4> [787.193189] xe_eu_stall_disable_locked+0x9a/0x260 [xe] <4> [787.193237] xe_eu_stall_stream_ioctl+0x5b/0x6a0 [xe] <4> [787.193285] __x64_sys_ioctl+0xa4/0xe0 <4> [787.193289] x64_sys_call+0x131e/0x2650 <4> [787.193292] do_syscall_64+0x91/0x180 <4> [787.193295] entry_SYSCALL_64_after_hwframe+0x76/0x7e <4> [787.193299] other info that might help us debug this: <4> [787.193302] Possible unsafe locking scenario: <4> [787.193304] CPU0 CPU1 <4> [787.193305] ---- ---- <4> [787.193306] lock(&gt->eu_stall->stream_lock); <4> [787.193308] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193311] lock(&gt->eu_stall->stream_lock); <4> [787.193313] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193315] *** DEADLOCK *** Fixes: 760edec939685 ("drm/xe/eustall: Add support to read() and poll() EU stall data") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4598 Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/c896932fca84f79db2df5942911997ed77b2b9b6.1744934656.git.harish.chegondi@intel.com
2025-04-26Merge tag 'drm-xe-next-2025-04-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Core Changes: Fix drm_gpusvm kernel-doc (Lucas) Driver Changes: - Release guc ids before cancelling work (Tejas) - Remove a duplicated pc_start_call (Rodrigo) - Fix an incorrect assert in previous userptr fixes (Thomas) - Remove gen11 assertions and prefixes (Lucas) - Drop sentinels from arg to xe_rtp_process_to_src (Lucas) - Temporarily disable D3Cold on BMG (Rodrigo) - Fix MOCS debugfs LNCF readout (Tvrtko) - Some ring flush cleanups (Tvrtko) - Use unsigned int for alignment in fb pinning code (Tvrtko) - Retry and wait longer for GuC PC start (Rodrigo) - Recognize 3DSTATE_COARSE_PIXEL in LRC dumps (Matt Roper) - Remove reduntant check in xe_vm_create_ioctl() (Xin) - A bunch of SRIOV updates (Michal) - Add stats for SVM page-faults (Francois) - Fix an UAF (Harish) - Expose fan speed (Raag) - Fix exporting xe buffer objects multiple times (Tomasz) - Apply a workaround (Vinay) - Simplify pinned bo iteration (Thomas) - Remove an incorrect "static" keywork (Lucas) - Add support for separate firmware files on each GT (Lucas) - Survivability handling fixes (Lucas) - Allow to inject error in early probe (Lucas) - Fix unmet direct dependencies warning (Yue Haibing) - More error injection during probe (Francois) - Coding style fix (Maarten) - Additional stats support (Riana) - Add fault injection for xe_oa_alloc_regs (Nakshrtra) - Add a BMG PCI ID (Matt Roper) - Some SVM fixes and preliminary SVM multi-device work (Thomas) - Switch the migrate code from drm managed to dev managed (Aradhya) - Fix an out-of-bounds shift when invalidating TLB (Thomas) - Ensure fixed_slice_mode gets set after ccs_mode change (Niranjana) - Use local fence in error path of xe_migrate_clear (Matthew Brost) - More Workarounds (Julia) - Define sysfs_ops on all directories (Tejas) - Set power state to D3Cold during s2idle/s3 (Badal) - Devcoredump output fix (John) - Avoid plain 64-bit division (Arnd Bergmann) - Reword a debug message (John) - Don't print a hwconfig error message when forcing execlists (Stuart) - Restore an error code to avoid a smatch warning (Rodrigo) - Invalidate L3 read-only cachelines for geometry streams too (Kenneth) - Make PPHWSP size explicit in xe_gt_lrc_size() (Gustavo) - Add GT frequency events (Vinay) - Fix xe_pt_stage_bind_walk kerneldoc (Thomas) - Add a workaround (Aradhya) - Rework pinned save/restore (Matthew Auld, Matthew Brost) - Allow non-contig VRAM kernel BO (Matthew Auld) - Support non-contig VRAM provisioning for SRIOV (Matthew Auld) - Allow scratch-pages for unmapped parts of page-faulting VMs. (Oak) - Ensure XE_BO_FLAG_CPU_ADDR_MIRROR had a unique value (Matt Roper) - Fix taking an invalid lock on wedge (Lucas) - Configs and documentation for survivability mode (Riana) - Remove an unused macro (Shuicheng) - Work around a page-fault full error (Matt Brost) - Enable a SRIOV workaround (John) - Bump the recommended GuC version (John) - Allow to drop VRAM resizing (Lucas) - Don't expose privileged debugfs files if VF (Michal) - Don't show GGTT/LMEM debugfs files under media GT (Michal) - Adjust ring-buffer emission for maximum possible size (Tvrtko) - Fix notifier vs folio lock deadlock (Matthew Auld) - Stop relying on placement for dma-buf unmap Matthew Auld) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aADWaEFKVmxSnDLo@fedora
2025-04-24drm/xe: Abort printing coredump in VM printer output if fullMatthew Brost
Abort printing coredump in VM printer output if full. Helps speedup large coredumps which need to walked multiple times in xe_devcoredump_read. v2: - s/drm_printer_is_full/drm_coredump_printer_is_full (Jani) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-5-matthew.brost@intel.com
2025-04-24drm/print: Add drm_coredump_printer_is_fullMatthew Brost
Add drm_coredump_printer_is_full which indicates if a drm printer's output is full. Useful to short circuit coredump printing once printer's output is full. v2: - s/drm_printer_is_full/drm_coredump_printer_is_full (Jani) v3: - Bail if not a coredump printer (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-4-matthew.brost@intel.com
2025-04-24drm/xe: Update xe_ttm_access_memory to use GPU for non-visible accessMatthew Brost
Add migrate layer functions to access VRAM and update xe_ttm_access_memory to use for non-visible access and large (more than 16k) BO access. 8G devcoreump on BMG observed 3 minute CPU copy time vs. 3s GPU copy time. v4: - Fix non-page aligned accesses - Add support for small / unaligned access - Update commit message indicating migrate used for large accesses (Auld) - Fix warning in xe_res_cursor for non-zero offset v5: - Fix 32 bit build (CI) v6: - Rebase and use SVM migration copy functions v7: - Fix build error (CI) v8: - Remove ifdef around VRAM copy functions (CI) - Use break statement in dma unmmaping (Jonathan) - Use if/else rather than goto (Jonathan) - Use single return point (Jonathan) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-3-matthew.brost@intel.com
2025-04-24drm/xe: Add devcoredump chunkingMatthew Brost
Chunk devcoredump into 1.5G pieces to avoid hitting the kvmalloc limit of 2G. Simple algorithm reads 1.5G at time in xe_devcoredump_read callback as needed. Some memory allocations are changed to GFP_ATOMIC as they done in xe_devcoredump_read which holds lock in the path of reclaim. The allocations are small, so in practice should never fail. v2: - Update commit message wrt gfp atomic (John H) v6: - Drop GFP_ATOMIC change for hwconfig (John H) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-2-matthew.brost@intel.com
2025-04-24drm/xe/hwmon: Fix kernel version documentation for fan speedLucas De Marchi
The version in the sysfs attribute should correspond to the version in which this is enabled and visible for end users. It usually doesn't correspond to the version in which the patch was developed, but rather a release that will contain it. Update them to 6.16. Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed") Reported-by: Ulisses Furquim <ulisses.furquim@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4841 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20250421-hwmon-doc-fix-v1-2-9f68db702249@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-24drm/xe/hwmon: Fix kernel version documentation for temperatureLucas De Marchi
The version in the sysfs attribute should correspond to the version in which this is enabled and visible for end users. It usually doesn't correspond to the version in which the patch was developed, but rather a release that will contain it. Update them to 6.15. Fixes: dac328dea701 ("drm/xe/hwmon: expose package and vram temperature") Reported-by: Ulisses Furquim <ulisses.furquim@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4840 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20250421-hwmon-doc-fix-v1-1-9f68db702249@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-24Merge drm/drm-next into drm-xe-nextThomas Hellström
Backmerge to bring in linux 6.15-rc. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-04-24drm/ttm/xe: drop unused force_alloc flagDave Airlie
This flag used to be used in the old memory tracking code, that code got migrated into the vmwgfx driver[1], and then got removed from the tree[2], but this piece got left behind. [1] f07069da6b4c ("drm/ttm: move memory accounting into vmwgfx v4") [2] 8aadeb8ad874 ("drm/vmwgfx: Remove the dedicated memory accounting") Cleanup the dead code. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-04-23drm/xe: Fix CFI violation when accessing sysfs filesJeevaka Prabu Badrappan
When an attribute group is created with sysfs_create_group() or sysfs_create_files() the ->sysfs_ops() callback is set to kobj_sysfs_ops, which sets the ->show() callback to kobj_attr_show(). kobj_attr_show() uses container_of() to get the ->show() callback from the attribute it was passed, meaning the ->show() callback needs to be the same type as the ->show() callback in 'struct kobj_attribute'. However, cur_freq_show() has the type of the ->show() callback in 'struct device_attribute', which causes a CFI violation when opening the 'id' sysfs node under gtidle/freq/throttle. This happens to work because the layout of 'struct kobj_attribute' and 'struct device_attribute' are the same, so the container_of() cast happens to allow the ->show() callback to still work. Changed the type of cur_freq_show() and few more functions to match the ->show() callback in 'struct kobj_attributes' to resolve the CFI violation. CFI failure seen while accessing sysfs files under /sys/class/drm/card0/device/tile0/gt*/gtidle/* /sys/class/drm/card0/device/tile0/gt*/freq0/* /sys/class/drm/card0/device/tile0/gt*/freq0/throttle/* [ 2599.618075] RIP: 0010:__cfi_cur_freq_show+0xd/0x10 [xe] [ 2599.624452] Code: 44 c1 44 89 fa e8 03 95 39 f2 48 98 5b 41 5e 41 5f 5d c3 c9 [ 2599.646638] RSP: 0018:ffffbe438ead7d10 EFLAGS: 00010286 [ 2599.652823] RAX: ffff9f7d8b3845d8 RBX: ffff9f7dee8c95d8 RCX: 0000000000000000 [ 2599.661246] RDX: ffff9f7e6f439000 RSI: ffffffffc13ada30 RDI: ffff9f7d975d4b00 [ 2599.669669] RBP: ffffbe438ead7d18 R08: 0000000000001000 R09: ffff9f7e6f439000 [ 2599.678092] R10: 00000000e07304a6 R11: ffffffffc1241ca0 R12: ffffffffb4836ea0 [ 2599.688435] R13: ffff9f7e45fb1180 R14: ffff9f7d975d4b00 R15: ffff9f7e6f439000 [ 2599.696860] FS: 000076b02b66cfc0(0000) GS:ffff9f80ef400000(0000) knlGS:00000 [ 2599.706412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2599.713196] CR2: 00005f80d94641a9 CR3: 00000001e44ec006 CR4: 0000000100f72ef0 [ 2599.721618] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2599.730041] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 [ 2599.738464] PKRU: 55555554 [ 2599.741655] Call Trace: [ 2599.744541] <TASK> [ 2599.747017] ? __die_body+0x69/0xb0 [ 2599.751151] ? die+0xa9/0xd0 [ 2599.754548] ? do_trap+0x89/0x160 [ 2599.758476] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.768315] ? handle_invalid_op+0x69/0x90 [ 2599.773167] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.783010] ? exc_invalid_op+0x36/0x60 [ 2599.787552] ? fred_hwexc+0x123/0x1a0 [ 2599.791873] ? fred_entry_from_kernel+0x7b/0xd0 [ 2599.797219] ? asm_fred_entrypoint_kernel+0x45/0x70 [ 2599.802976] ? act_freq_show+0x70/0x70 [xe b37985c94829727668bd7c5b33c1d9998] [ 2599.812301] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.822137] ? __kmalloc_node_noprof+0x1f3/0x420 [ 2599.827594] ? __kvmalloc_node_noprof+0xcb/0x180 [ 2599.833045] ? kobj_attr_show+0x22/0x40 [ 2599.837571] sysfs_kf_seq_show+0xa8/0x110 [ 2599.842302] kernfs_seq_show+0x38/0x50 Signed-off-by: Jeevaka Prabu Badrappan <jeevaka.badrappan@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250422171852.85558-1-jeevaka.badrappan@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-04-23drm/xe: handle pinned memory in PM notifierMatthew Auld
Userspace is still alive and kicking at this point so actually moving pinned stuff here is tricky. However, we can instead pre-allocate the backup storage upfront from the notifier, such that we scoop up as much as we can, and then leave the final .suspend() to do the actual copy (or allocate anything that we missed). That way the bulk of our allocations will hopefully be done outside the more restrictive .suspend(). We do need to be extra careful though, since the pinned handling can now race with PM notifier, like something becoming unpinned after we prepare it from the notifier. v2 (Thomas): - Fix kernel doc and drop the pin as soon as we are done with the restore, instead of deferring to later. Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-8-matthew.auld@intel.com
2025-04-23drm/xe: share bo dma-resv with backup objectMatthew Auld
We end up needing to grab both locks together anyway and keep them held until we complete the copy or add the fence. Plus the backup_obj is short lived and tied to the parent object, so seems reasonable to share the same dma-resv. This will simplify the locking here, and in follow up patches. v2: - Hold reference to the parent bo to be sure the shared dma-resv can't go out of scope too soon. (Thomas) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-7-matthew.auld@intel.com
2025-04-23drm/xe: evict user memory in PM notifierMatthew Auld
In the case of VRAM we might need to allocate large amounts of GFP_KERNEL memory on suspend, however doing that directly in the driver .suspend()/.prepare() callback is not advisable (no swap for example). To improve on this we can instead hook up to the PM notifier framework which is invoked at an earlier stage. We effectively call the evict routine twice, where the notifier will have hopefully have cleared out most if not everything by the time we call it a second time when entering the .suspend() callback. For s4 we also get the added benefit of allocating the system pages before the hibernation image size is calculated, which looks more sensible. Note that the .suspend() hook is still responsible for dealing with all the pinned memory. Improving that is left to another patch. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1181 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4566 Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-6-matthew.auld@intel.com
2025-04-22drm/xe/guc: Cache DSS info when creating capture register listJohn Harrison
Calculating the DSS id (index of a steered register) currently requires reading state from the hwconfig table and that currently requires dynamically allocating memory. The GuC based register capture (for dev core dumps) includes this index as part of the register name in the dump. However, it was calculating said index at the time of the dump for every dump. That is wasteful. It also breaks anyone trying to do the dump at a time when memory allocations are not allowed. So rather than calculating on every print, just calculate at start of day when creating the register list in the first place. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250417213303.3021243-1-John.C.Harrison@Intel.com
2025-04-22drm/xe/guc: Use the steering flag when printing registersJohn Harrison
The printing code was doing a test on which list a register was in to decide whether it is steered or not. That might be valid at this moment but there may be other reasons for extended lists in the future. Plus, there is a flag specifically for identifying steered registers. So, just use that instead - it is simpler and safer. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250417195215.3002210-3-John.C.Harrison@Intel.com
2025-04-22drm/xe/guc: Fix capture of steering registersJohn Harrison
The list of registers to capture on a GPU hang includes some that require steering. Unfortunately, the flag to say this was being wiped to due a missing OR on the assignment of the next flag field. Fix that. Fixes: b170d696c1e2 ("drm/xe/guc: Add XE_LP steered register lists") Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Link: https://lore.kernel.org/r/20250417195215.3002210-2-John.C.Harrison@Intel.com
2025-04-22drm/xe/svm: fix dereferencing error pointer in drm_gpusvm_range_alloc()Harshit Mogalapalli
xe_svm_range_alloc() returns ERR_PTR(-ENOMEM) on failure and there is a dereference of "range" after that: --> range->gpusvm = gpusvm; In xe_svm_range_alloc(), when memory allocation fails return NULL instead to handle this situation. Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/adaef4dd-5866-48ca-bc22-4a1ddef20381@stanley.mountain/ Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250323124907.3946370-1-harshit.m.mogalapalli@oracle.com
2025-04-17drm/xe: Introduce fault injection for guc CTB send/recvSatyanarayana K V P
Fault can be injected with below steps. FAILTYPE=fail_function FAILFUNC=xe_guc_ct_send_recv echo > /sys/kernel/debug/$FAILTYPE/inject echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject printf %#x -19 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 10 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250403120641.7258-3-satyanarayana.k.v.p@intel.com
2025-04-17drm/xe: Introduce fault injection for guc mmio send/recv.Satyanarayana K V P
Fault can be injected with below steps. FAILTYPE=fail_function FAILFUNC=xe_guc_mmio_send_recv echo > /sys/kernel/debug/$FAILTYPE/inject echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject printf %#x -5 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 10 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250403120641.7258-2-satyanarayana.k.v.p@intel.com
2025-04-17drm/xe: Use GT oriented message to report engine activity errorMichal Wajdeczko
We are enabling/disabling engine activity on per-GT basis, so any errors should be also reported per GT, like: [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to enable engine activity function stats (-ENOSPC) [ ] xe 0000:00:02.0: [drm] GT1: PF: Failed to enable engine activity function stats (-ENOSPC) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250414202347.1909-2-michal.wajdeczko@intel.com
2025-04-17drm/xe/guc: Fix out-of-bound while enabling engine activity statsMichal Wajdeczko
In the PF mode we allocate array of struct engine_activity_group that holds activity data split for the PF and all potential VFs. But while preparing data for use by VFs we ended with bad index. [ ] BUG: KASAN: slab-out-of-bounds in xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] Call Trace: [ ] <TASK> [ ] dump_stack_lvl+0x91/0xf0 [ ] print_report+0xd1/0x680 [ ] ? __virt_addr_valid+0x23a/0x440 [ ] ? kasan_addr_to_slab+0xd/0xb0 [ ] kasan_report+0xe7/0x130 [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] __asan_report_store8_noabort+0x17/0x30 [ ] xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] pf_engine_activity_stats+0x1b6/0x7f0 [xe] [ ] ? kobject_put+0x5f/0x470 [ ] xe_pci_sriov_configure+0x28c9/0x3270 [xe] [ ] ? __pfx_dev_attr_store+0x10/0x10 [ ] ? kstrtoull+0x3b/0x70 [ ] ? __pfx___lock_acquire+0x10/0x10 [ ] ? kstrtou16+0x65/0xf0 [ ] sriov_numvfs_store+0x20c/0x400 [ ] ? __pfx_sriov_numvfs_store+0x10/0x10 [ ] ? __pfx__copy_from_iter+0x10/0x10 [ ] ? __pfx_dev_attr_store+0x10/0x10 [ ] dev_attr_store+0x3b/0x80 [ ] ? sysfs_file_ops+0x135/0x190 Fixes: 2de3f38fbf89 ("drm/xe: Add support for per-function engine activity") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250414202347.1909-1-michal.wajdeczko@intel.com
2025-04-17drm/xe/pxp: do not queue unneeded terminations from debugfsDaniele Ceraolo Spurio
The PXP terminate debugfs currently unconditionally simulates a termination, no matter what the HW status is. This is unneeded if PXP is not in use and can cause errors if the HW init hasn't completed yet. To solve these issues, we can simply limit the terminations to the cases where PXP is fully initialized and in use. v2: s/pxp_status/ready/ to avoid confusion with pxp->status (John) Fixes: 385a8015b214 ("drm/xe/pxp: Add PXP debugfs support") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4749 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250416201622.1295369-1-daniele.ceraolospurio@intel.com
2025-04-16drm/xe/dma_buf: stop relying on placement in unmapMatthew Auld
The is_vram() is checking the current placement, however if we consider exported VRAM with dynamic dma-buf, it looks possible for the xe driver to async evict the memory, notifying the importer, however importer does not have to call unmap_attachment() immediately, but rather just as "soon as possible", like when the dma-resv idles. Following from this we would then pipeline the move, attaching the fence to the manager, and then update the current placement. But when the unmap_attachment() runs at some later point we might see that is_vram() is now false, and take the complete wrong path when dma-unmapping the sg, leading to explosions. To fix this check if the sgl was mapping a struct page. v2: - The attachment can be mapped multiple times it seems, so we can't really rely on encoding something in the attachment->priv. Instead see if the page_link has an encoded struct page. For vram we expect this to be NULL. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4563 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250410162716.159403-2-matthew.auld@intel.com
2025-04-16drm/xe/userptr: fix notifier vs folio deadlockMatthew Auld
User is reporting what smells like notifier vs folio deadlock, where migrate_pages_batch() on core kernel side is holding folio lock(s) and then interacting with the mappings of it, however those mappings are tied to some userptr, which means calling into the notifier callback and grabbing the notifier lock. With perfect timing it looks possible that the pages we pulled from the hmm fault can get sniped by migrate_pages_batch() at the same time that we are holding the notifier lock to mark the pages as accessed/dirty, but at this point we also want to grab the folio locks(s) to mark them as dirty, but if they are contended from notifier/migrate_pages_batch side then we deadlock since folio lock won't be dropped until we drop the notifier lock. Fortunately the mark_page_accessed/dirty is not really needed in the first place it seems and should have already been done by hmm fault, so just remove it. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4765 Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250414132539.26654-2-matthew.auld@intel.com
2025-04-15drm/xe: Adjust ringbuf emission for maximum possible sizeTvrtko Ursulin
MAX_JOB_SIZE_DW seems to be undersized. For the worst case emission from __emit_job_gen12_render_compute I hand count 57 dwords so lets bump this to an even 58. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250403190317.6064-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-15Merge tag 'drm-intel-next-2025-04-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Cross-subsystem Changes: - Update GVT MAINTAINERS (Jani) Driver Changes: - Updates for xe3lpd display (Gustavo) - Fix link training interrupted by HPD pulse (Imre) - Watermark bound checks for DSC (Ankit) - VRR Refactor and other fixes and improvements (Ankit) - More conversions towards intel_display struct (Gustavo, Jani) - Other clean-up patches towards a display separation (Jani) - Maintain asciibetical order for HAS_* macros (Ankit) - Fixes around probe/initialization (Janusz) - Fix build and doc build issue (Yue, Rodrigo) - DSI related fixes (Suraj, William, Jani) - Improve DC6 entry counter (Mohammed) - Fix xe2hpd memory type identification (Vivek) - PSR related fixes and improvements (Animesh, Jouni) - DP MST related fixes and improvements (Imre) - Fix scanline_offset for LNL+/BMG+ (Ville) - Some gvt related fixes and changes (Ville, Jani) - Some PLL code adjustment (Ville) - Display wa addition (Vinod) - DRAM type logging (Lucas) - Pimp the initial FB readout (Ville) - Some sagv/bw cleanup (Ville) - Remove i915_display_capabilities debugfs entry (Jani) - Move PCH type to display caps debugfs entry (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/Z_kTqPX5Mjruq1pL@intel.com
2025-04-14drm/xe: Set LRC addresses before guc loadLucas De Marchi
The metadata saved in the ADS is read by GuC when it's initialized. Saving the addresses to the LRCs when they are populated is too late as GuC will keep using the old ones. This was causing GuC to use the RCS LRC for any engine class. It's not a big problem on a Linux-only scenario since the they are used by GuC only on media engines when the watchdog is triggered. However, in a virtualization scenario with Windows as the VF, it causes the wrong LRCs to be loaded as the watchdog is used for all engines. Fix it by letting guc_golden_lrc_init() initialize the metadata, like other *_init() functions, and later guc_golden_lrc_populate() to copy the LRCs to the right places. The former is called before the second GuC load, while the latter is called after LRCs have been recorded. Cc: Chee Yin Wong <chee.yin.wong@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> # v6.11+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Chee Yin Wong <chee.yin.wong@intel.com> Link: https://lore.kernel.org/r/20250409-fix-guc-ads-v1-1-494135f7a5d0@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-14drm/xe/pf: Don't show GGTT/LMEM debugfs files under media GTMichal Wajdeczko
Most of the PF's debugfs files (and their implementations) are based on the GT hierarchy even if files are related to GGTT or LMEM data, that are related to the tile. While we could reach the tile data from any GT, to avoid potential misuse, some functions allow to be used on the primary GT only, and may use asserts to enforce that. In our case, the following assert could be seen when reading the /sys/kernel/debug/dri/0000:00:02.0/gt1/pf/ggtt_available [ ] xe 0000:00:02.0: [drm] Assertion `!xe_gt_is_media_type(gt)` failed! [ ] WARNING: CPU: 4 PID: 10609 at drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:379 pf_get_spare_ggtt+0x256/0x4e0 [xe] [ ] RIP: 0010:pf_get_spare_ggtt+0x256/0x4e0 [xe] [ ] Call Trace: [ ] <TASK> [ ] xe_gt_sriov_pf_config_print_available_ggtt+0xb7/0x480 [xe] [ ] ? __memcg_slab_post_alloc_hook+0x12f/0x3f0 [ ] xe_gt_debugfs_simple_show+0x7b/0xb0 [xe] [ ] ? __pfx___drm_printfn_seq_file+0x10/0x10 [ ] ? __pfx___drm_puts_seq_file+0x10/0x10 [ ] seq_read_iter+0x139/0x4e0 [ ] seq_read+0x11d/0x160 [ ] full_proxy_read+0x6b/0xb0 [ ] vfs_read+0xfa/0x390 Fix that by moving GGTT/LMEM debugfs attributes to separate lists and register them only when applicable (on primary GT, on DGFX). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250411193030.1865-1-michal.wajdeczko@intel.com
2025-04-14Merge tag 'drm-misc-next-2025-04-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.16-rc1: UAPI Changes: - Add ASAHI uapi header! - Add apple fourcc modifiers. - Add capset virtio definitions to UAPI. - Extend EXPORT_SYNC_FILE for timeline syncobjs. Cross-subsystem Changes: - Adjust DMA-BUF sg handling to not cache map on attach. - Update drm/ci, hlcdc, virtio, maintainers. - Update fbdev todo. - Allow setting dma-device for dma-buf import. - Export efi_mem_desc_lookup to make efidrm build as a module. Core Changes: - Update drm scheduler docs. - Use the correct resv object in TTM delayed destroy. - Fix compiler warning with panic qr code, and other small fixes. - drm/ci updates. - Add debugfs file for listing all bridges. - Small fixes to drm/client, ttm tests. - Add documentation to display/hdmi. - Add kunit tests for bridges. - Dont fail managed device probing if connector polling fails. - Create Kconfig.debug for drm core. - Add tests for the drm scheduler. - Add and use new access helpers for DPCPD. - Add generic and optimized conversions for format-helper. - Begin refcounting panel for improving lifetime handling. - Unify simpledrm and ofdrm sysfb, and add extra features. - Split hdmi audio in bridge to make DP audio work. Driver Changes: - Convert drivers to use devm_platform_ioremap_resource(). - Assorted small fixes to imx/legacy-bridg, gma500, pl111, nouveau, vc4, vmwgfx, ast, mxsfb, xlnx, accel/qaic, v3d, bridge/imx8qxp-ldb, ofdrm, bridge/fsl-ldb, udl, bridge/ti-sn65dsi86, bridge/anx7625, cirrus-qemu, bridge/cdns-dsi, panel/sharp, panel/himax, bridge/sil902x, renesas, imagination, various panels. - Allow attaching more display to vkms. - Add Powertip PH128800T004-ZZA01 panel. - Add rotation quirk for ZOTAC panel. - Convert bridge/tc358775 to atomic. - Remove deprecated panel calls from synaptics, novatek, samsung panels. - Refactor shmem helper page pinning and accel drivers using it. - Add dmabuf support to accel/amdxdna. - Use 4k page table format for panfrost/mediatek. - Add common powerup/down dp link helper and use it. - Assorted compiler warning fixes. - Support dma-buf import for renesas Signed-off-by: Dave Airlie <airlied@redhat.com> # Conflicts: # include/drm/drm_kunit_helpers.h From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/e147ff95-697b-4067-9e2e-7cbd424e162a@linux.intel.com
2025-04-13Linux 6.15-rc2Linus Torvalds
2025-04-13Merge tag 'erofs-for-6.15-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - Properly handle errors when file-backed I/O fails - Fix compilation issues on ARM platform (arm-linux-gnueabi) - Fix parsing of encoded extents - Minor cleanup * tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: remove duplicate code erofs: fix encoded extents handling erofs: add __packed annotation to union(__le16..) erofs: set error to bio if file-backed IO fails
2025-04-13Merge tag 'ext4_for_linus-6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A few more miscellaneous ext4 bug fixes and cleanups including some syzbot failures and fixing a stale file handing refeencing an inode previously used as a regular file, but which has been deleted and reused as an ea_inode would result in ext4 erroneously considering this a case of fs corruption" * tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix off-by-one error in do_split ext4: make block validity check resistent to sb bh corruption ext4: avoid -Wflex-array-member-not-at-end warning Documentation: ext4: Add fields to ext4_super_block documentation ext4: don't treat fhandle lookup of ea_inode as FS corruption
2025-04-13Merge tag 'fixes-2025-04-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix build of memblock test. Add missing stubs for mutex and free_reserved_area() to memblock tests" * tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: Fix mutex related build error
2025-04-12ext4: fix off-by-one error in do_splitArtem Sadovnikov
Syzkaller detected a use-after-free issue in ext4_insert_dentry that was caused by out-of-bounds access due to incorrect splitting in do_split. BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 Write of size 251 at addr ffff888074572f14 by task syz-executor335/5847 CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106 ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154 make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455 ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796 ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431 vfs_symlink+0x137/0x2e0 fs/namei.c:4615 do_symlinkat+0x222/0x3a0 fs/namei.c:4641 __do_sys_symlink fs/namei.c:4662 [inline] __se_sys_symlink fs/namei.c:4660 [inline] __x64_sys_symlink+0x7a/0x90 fs/namei.c:4660 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> The following loop is located right above 'if' statement. for (i = count-1; i >= 0; i--) { /* is more than half of this entry in 2nd half of the block? */ if (size + map[i].size/2 > blocksize/2) break; size += map[i].size; move++; } 'i' in this case could go down to -1, in which case sum of active entries wouldn't exceed half the block size, but previous behaviour would also do split in half if sum would exceed at the very last block, which in case of having too many long name files in a single block could lead to out-of-bounds access and following use-after-free. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Cc: stable@vger.kernel.org Fixes: 5872331b3d91 ("ext4: fix potential negative array index in do_split()") Signed-off-by: Artem Sadovnikov <a.sadovnikov@ispras.ru> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250404082804.2567-3-a.sadovnikov@ispras.ru Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: make block validity check resistent to sb bh corruptionOjaswin Mujoo
Block validity checks need to be skipped in case they are called for journal blocks since they are part of system's protected zone. Currently, this is done by checking inode->ino against sbi->s_es->s_journal_inum, which is a direct read from the ext4 sb buffer head. If someone modifies this underneath us then the s_journal_inum field might get corrupted. To prevent against this, change the check to directly compare the inode with journal->j_inode. **Slight change in behavior**: During journal init path, check_block_validity etc might be called for journal inode when sbi->s_journal is not set yet. In this case we now proceed with ext4_inode_block_valid() instead of returning early. Since systems zones have not been set yet, it is okay to proceed so we can perform basic checks on the blocks. Suggested-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/0c06bc9ebfcd6ccfed84a36e79147bf45ff5adc1.1743142920.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. So, with these changes, fix the following warning: fs/ext4/mballoc.c:3041:40: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/Z-SF97N3AxcIMlSi@kspp Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12Documentation: ext4: Add fields to ext4_super_block documentationTom Vierjahn
Documentation and implementation of the ext4 super block have slightly diverged: Padding has been removed in order to make room for new fields that are still missing in the documentation. Add the new fields s_encryption_level, s_first_error_errorcode, s_last_error_errorcode to the documentation of the ext4 super block. Fixes: f542fbe8d5e8 ("ext4 crypto: reserve codepoints used by the ext4 encryption feature") Fixes: 878520ac45f9 ("ext4: save the error code which triggered an ext4_error() in the superblock") Signed-off-by: Tom Vierjahn <tom.vierjahn@acm.org> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20250324221004.5268-1-tom.vierjahn@acm.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12Merge tag 'trace-v6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Hide get_vm_area() from MMUless builds The function get_vm_area() is not defined when CONFIG_MMU is not defined. Hide that function within #ifdef CONFIG_MMU. - Fix output of synthetic events when they have dynamic strings The print fmt of the synthetic event's format file use to have "%.*s" for dynamic size strings even though the user space exported arguments had only __get_str() macro that provided just a nul terminated string. This was fixed so that user space could parse this properly. But the reason that it had "%.*s" was because internally it provided the maximum size of the string as one of the arguments. The fix that replaced "%.*s" with "%s" caused the trace output (when the kernel reads the event) to write "(efault)" as it would now read the length of the string as "%s". As the string provided is always nul terminated, there's no reason for the internal code to use "%.*s" anyway. Just remove the length argument to match the "%s" that is now in the format. - Fix the ftrace subops hash logic of the manager ops hash The function_graph uses the ftrace subops code. The subops code is a way to have a single ftrace_ops registered with ftrace to determine what functions will call the ftrace_ops callback. More than one user of function graph can register a ftrace_ops with it. The function graph infrastructure will then add this ftrace_ops as a subops with the main ftrace_ops it registers with ftrace. This is because the functions will always call the function graph callback which in turn calls the subops ftrace_ops callbacks. The main ftrace_ops must add a callback to all the functions that the subops want a callback from. When a subops is registered, it will update the main ftrace_ops hash to include the functions it wants. This is the logic that was broken. The ftrace_ops hash has a "filter_hash" and a "notrace_hash" where all the functions in the filter_hash but not in the notrace_hash are attached by ftrace. The original logic would have the main ftrace_ops filter_hash be a union of all the subops filter_hashes and the main notrace_hash would be a intersect of all the subops filter hashes. But this was incorrect because the notrace hash depends on the filter_hash it is associated to and not the union of all filter_hashes. Instead, when a subops is added, just include all the functions of the subops hash that are in its filter_hash but not in its notrace_hash. The main subops hash should not use its notrace hash, unless all of its subops hashes have an empty filter_hash (which means to attach to all functions), and then, and only then, the main ftrace_ops notrace hash can be the intersect of all the subops hashes. This not only fixes the bug, but also simplifies the code. - Add a selftest to better test the subops filtering Add a selftest that would catch the bug fixed by the above change. - Fix extra newline printed in function tracing with retval The function parameter code changed the output logic slightly and called print_graph_retval() and also printed a newline. The print_graph_retval() also prints a newline which caused blank lines to be printed in the function graph tracer when retval was added. This caused one of the selftests to fail if retvals were enabled. Instead remove the new line output from print_graph_retval() and have the callers always print the new line so that it doesn't have to do special logic if it calls print_graph_retval() or not. - Fix out-of-bound memory access in the runtime verifier When rv_is_container_monitor() is called on the last entry on the link list it references the next entry, which is the list head and causes an out-of-bound memory access. * tag 'trace-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Fix out-of-bound memory access in rv_is_container_monitor() ftrace: Do not have print_graph_retval() add a newline tracing/selftest: Add test to better test subops filtering of function graph ftrace: Fix accounting of subop hashes ftrace: Properly merge notrace hashes tracing: Do not add length to print format in synthetic events tracing: Hide get_vm_area() from MMUless builds
2025-04-12Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Followup fixes for resilient spinlock (Kumar Kartikeya Dwivedi): - Make res_spin_lock test less verbose, since it was spamming BPF CI on failure, and make the check for AA deadlock stronger - Fix rebasing mistake and use architecture provided res_smp_cond_load_acquire - Convert BPF maps (queue_stack and ringbuf) to resilient spinlock to address long standing syzbot reports - Make sure that classic BPF load instruction from SKF_[NET|LL]_OFF offsets works when skb is fragmeneted (Willem de Bruijn) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Convert ringbuf map to rqspinlock bpf: Convert queue_stack map to rqspinlock bpf: Use architecture provided res_smp_cond_load_acquire selftests/bpf: Make res_spin_lock AA test condition stronger selftests/net: test sk_filter support for SKF_NET_OFF on frags bpf: support SKF_NET_OFF and SKF_LL_OFF on skb frags selftests/bpf: Make res_spin_lock test less verbose
2025-04-12rv: Fix out-of-bound memory access in rv_is_container_monitor()Nam Cao
When rv_is_container_monitor() is called on the last monitor in rv_monitors_list, KASAN yells: BUG: KASAN: global-out-of-bounds in rv_is_container_monitor+0x101/0x110 Read of size 8 at addr ffffffff97c7c798 by task setup/221 The buggy address belongs to the variable: rv_monitors_list+0x18/0x40 This is due to list_next_entry() is called on the last entry in the list. It wraps around to the first list_head, and the first list_head is not embedded in struct rv_monitor_def. Fix it by checking if the monitor is last in the list. Cc: stable@vger.kernel.org Cc: Gabriele Monaco <gmonaco@redhat.com> Fixes: cb85c660fcd4 ("rv: Add option for nested monitors and include sched") Link: https://lore.kernel.org/e85b5eeb7228bfc23b8d7d4ab5411472c54ae91b.1744355018.git.namcao@linutronix.de Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-12ftrace: Do not have print_graph_retval() add a newlineSteven Rostedt
The retval and retaddr options for function_graph tracer will add a comment at the end of a function for both leaf and non leaf functions that looks like: __wake_up_common(); /* ret=0x1 */ } /* pick_next_task_fair ret=0x0 */ The function print_graph_retval() adds a newline after the "*/". But if that's not called, the caller function needs to make sure there's a newline added. This is confusing and when the function parameters code was added, it added a newline even when calling print_graph_retval() as the fact that the print_graph_retval() function prints a newline isn't obvious. This caused an extra newline to be printed and that made it fail the selftests when the retval option was set, as the selftests were not expecting blank lines being injected into the trace. Instead of having print_graph_retval() print a newline, just have the caller always print the newline regardless if it calls print_graph_retval() or not. This not only fixes this bug, but it also simplifies the code. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250411133015.015ca393@gandalf.local.home Reported-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/all/ccc40f2b-4b9e-4abd-8daf-d22fce2a86f0@sirena.org.uk/ Fixes: ff5c9c576e754 ("ftrace: Add support for function argument to graph tracer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-12Merge tag 'pwm/for-6.15-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fixes from Uwe Kleine-König: "A set of fixes for pwm core and various drivers The first three patches handle clk_get_rate() returning 0 (which might happen for example if the CCF is disabled). The first of these was found because this triggered a warning with clang, the two others by looking for similar issues in other drivers. The remaining three fixes address issues in the new waveform pwm API. Now that I worked on this a bit more, the finer details and corner cases are better understood and the code is fixed accordingly" * tag 'pwm/for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: axi-pwmgen: Let .round_waveform_tohw() signal when request was rounded up pwm: stm32: Search an appropriate duty_cycle if period cannot be modified pwm: Let pwm_set_waveform() succeed even if lowlevel driver rounded up pwm: fsl-ftm: Handle clk_get_rate() returning 0 pwm: rcar: Improve register calculation pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()
2025-04-11Merge tag 'v6.15-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Fix multichannel decryption UAF - Fix regression mounting to onedrive shares - Fix missing mount option check for posix vs. noposix - Fix version field in WSL symlinks - Three minor cleanup to reparse point handling - SMB1 fix for WSL special files - SMB1 Kerberos fix - Add SMB3 defines for two new FS attributes * tag 'v6.15-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: Add defines for two new FileSystemAttributes cifs: Fix querying of WSL CHR and BLK reparse points over SMB1 cifs: Split parse_reparse_point callback to functions: get buffer and parse buffer cifs: Improve handling of name surrogate reparse points in reparse.c cifs: Remove explicit handling of IO_REPARSE_TAG_MOUNT_POINT in inode.c cifs: Fix encoding of SMB1 Session Setup Kerberos Request in non-UNICODE mode smb: client: fix UAF in decryption with multichannel cifs: Fix support for WSL-style symlinks smb311 client: fix missing tcon check when mounting with linux/posix extensions cifs: Ensure that all non-client-specific reparse points are processed by the server
2025-04-11Merge tag 'pci-v6.15-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Run quirk_huawei_pcie_sva() before arm_smmu_probe_device(), which depends on the quirk, to avoid IOMMU initialization failures (Zhangfei Gao) * tag 'pci-v6.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Run quirk_huawei_pcie_sva() before arm_smmu_probe_device()
2025-04-11tracing/selftest: Add test to better test subops filtering of function graphSteven Rostedt
A bug was discovered that showed the accounting of the subops of the ftrace_ops filtering was incorrect. Add a new test to better test the filtering. This test creates two instances, where it will add various filters to both the set_ftrace_filter and the set_ftrace_notrace files and enable function_graph. Then it looks into the enabled_functions file to make sure that the filters are behaving correctly. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/20250409152720.380778379@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>