summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-25scx: Fix raciness in scx_ops_bypass()David Vernet
scx_ops_bypass() can currently race on the ops enable / disable path as follows: 1. scx_ops_bypass(true) called on enable path, bypass depth is set to 1 2. An op on the init path exits, which schedules scx_ops_disable_workfn() 3. scx_ops_bypass(false) is called on the disable path, and bypass depth is decremented to 0 4. kthread is scheduled to execute scx_ops_disable_workfn() 5. scx_ops_bypass(true) called, bypass depth set to 1 6. scx_ops_bypass() races when iterating over CPUs While it's not safe to take any blocking locks on the bypass path, it is safe to take a raw spinlock which cannot be preempted. This patch therefore updates scx_ops_bypass() to use a raw spinlock to synchronize, and changes scx_ops_bypass_depth to be a regular int. Without this change, we observe the following warnings when running the 'exit' sched_ext selftest (sometimes requires a couple of runs): .[root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit ... [ 14.935078] WARNING: CPU: 2 PID: 360 at kernel/sched/ext.c:4332 scx_ops_bypass+0x1ca/0x280 [ 14.935126] Modules linked in: [ 14.935150] CPU: 2 UID: 0 PID: 360 Comm: sched_ext_ops_h Not tainted 6.11.0-virtme #24 [ 14.935192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 14.935242] Sched_ext: exit (enabling+all) [ 14.935244] RIP: 0010:scx_ops_bypass+0x1ca/0x280 [ 14.935300] Code: ff ff ff e8 48 96 10 00 fb e9 08 ff ff ff c6 05 7b 34 e8 01 01 90 48 c7 c7 89 86 88 87 e8 be 1d f8 ff 90 0f 0b 90 90 eb 95 90 <0f> 0b 90 41 8b 84 24 24 0a 00 00 eb 97 90 0f 0b 90 41 8b 84 24 24 [ 14.935394] RSP: 0018:ffffb706c0957ce0 EFLAGS: 00010002 [ 14.935424] RAX: 0000000000000009 RBX: 0000000000000001 RCX: 00000000e3fb8b2a [ 14.935465] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff88a4c080 [ 14.935512] RBP: 0000000000009b56 R08: 0000000000000004 R09: 00000003f12e520a [ 14.935555] R10: ffffffff863a9795 R11: 0000000000000000 R12: ffff8fc5fec31300 [ 14.935598] R13: ffff8fc5fec31318 R14: 0000000000000286 R15: 0000000000000018 [ 14.935642] FS: 0000000000000000(0000) GS:ffff8fc5fe680000(0000) knlGS:0000000000000000 [ 14.935684] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.935721] CR2: 0000557d92890b88 CR3: 000000002464a000 CR4: 0000000000750ef0 [ 14.935765] PKRU: 55555554 [ 14.935782] Call Trace: [ 14.935802] <TASK> [ 14.935823] ? __warn+0xce/0x220 [ 14.935850] ? scx_ops_bypass+0x1ca/0x280 [ 14.935881] ? report_bug+0xc1/0x160 [ 14.935909] ? handle_bug+0x61/0x90 [ 14.935934] ? exc_invalid_op+0x1a/0x50 [ 14.935959] ? asm_exc_invalid_op+0x1a/0x20 [ 14.935984] ? raw_spin_rq_lock_nested+0x15/0x30 [ 14.936019] ? scx_ops_bypass+0x1ca/0x280 [ 14.936046] ? srso_alias_return_thunk+0x5/0xfbef5 [ 14.936081] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.936111] scx_ops_disable_workfn+0x146/0xac0 [ 14.936142] ? finish_task_switch+0xa9/0x2c0 [ 14.936172] ? srso_alias_return_thunk+0x5/0xfbef5 [ 14.936211] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.936244] kthread_worker_fn+0x101/0x2c0 [ 14.936268] ? __pfx_kthread_worker_fn+0x10/0x10 [ 14.936299] kthread+0xec/0x110 [ 14.936327] ? __pfx_kthread+0x10/0x10 [ 14.936351] ret_from_fork+0x37/0x50 [ 14.936374] ? __pfx_kthread+0x10/0x10 [ 14.936400] ret_from_fork_asm+0x1a/0x30 [ 14.936427] </TASK> [ 14.936443] irq event stamp: 21002 [ 14.936467] hardirqs last enabled at (21001): [<ffffffff863aa35f>] resched_cpu+0x9f/0xd0 [ 14.936521] hardirqs last disabled at (21002): [<ffffffff863dd0ba>] scx_ops_bypass+0x11a/0x280 [ 14.936571] softirqs last enabled at (20642): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.936622] softirqs last disabled at (20637): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.936672] ---[ end trace 0000000000000000 ]--- [ 14.953282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 14.953352] ------------[ cut here ]------------ [ 14.953383] WARNING: CPU: 2 PID: 360 at kernel/sched/ext.c:4335 scx_ops_bypass+0x1d8/0x280 [ 14.953428] Modules linked in: [ 14.953453] CPU: 2 UID: 0 PID: 360 Comm: sched_ext_ops_h Tainted: G W 6.11.0-virtme #24 [ 14.953505] Tainted: [W]=WARN [ 14.953527] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 14.953574] RIP: 0010:scx_ops_bypass+0x1d8/0x280 [ 14.953603] Code: c6 05 7b 34 e8 01 01 90 48 c7 c7 89 86 88 87 e8 be 1d f8 ff 90 0f 0b 90 90 eb 95 90 0f 0b 90 41 8b 84 24 24 0a 00 00 eb 97 90 <0f> 0b 90 41 8b 84 24 24 0a 00 00 eb 92 f3 0f 1e fa 49 8d 84 24 f0 [ 14.953693] RSP: 0018:ffffb706c0957ce0 EFLAGS: 00010046 [ 14.953722] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 [ 14.953763] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8fc5fec31318 [ 14.953804] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 14.953845] R10: ffffffff863a9795 R11: 0000000000000000 R12: ffff8fc5fec31300 [ 14.953888] R13: ffff8fc5fec31318 R14: 0000000000000286 R15: 0000000000000018 [ 14.953934] FS: 0000000000000000(0000) GS:ffff8fc5fe680000(0000) knlGS:0000000000000000 [ 14.953974] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.954009] CR2: 0000557d92890b88 CR3: 000000002464a000 CR4: 0000000000750ef0 [ 14.954052] PKRU: 55555554 [ 14.954068] Call Trace: [ 14.954085] <TASK> [ 14.954102] ? __warn+0xce/0x220 [ 14.954126] ? scx_ops_bypass+0x1d8/0x280 [ 14.954150] ? report_bug+0xc1/0x160 [ 14.954178] ? handle_bug+0x61/0x90 [ 14.954203] ? exc_invalid_op+0x1a/0x50 [ 14.954226] ? asm_exc_invalid_op+0x1a/0x20 [ 14.954250] ? raw_spin_rq_lock_nested+0x15/0x30 [ 14.954285] ? scx_ops_bypass+0x1d8/0x280 [ 14.954311] ? __mutex_unlock_slowpath+0x3a/0x260 [ 14.954343] scx_ops_disable_workfn+0xa3e/0xac0 [ 14.954381] ? __pfx_scx_ops_disable_workfn+0x10/0x10 [ 14.954413] kthread_worker_fn+0x101/0x2c0 [ 14.954442] ? __pfx_kthread_worker_fn+0x10/0x10 [ 14.954479] kthread+0xec/0x110 [ 14.954507] ? __pfx_kthread+0x10/0x10 [ 14.954530] ret_from_fork+0x37/0x50 [ 14.954553] ? __pfx_kthread+0x10/0x10 [ 14.954576] ret_from_fork_asm+0x1a/0x30 [ 14.954603] </TASK> [ 14.954621] irq event stamp: 21002 [ 14.954644] hardirqs last enabled at (21001): [<ffffffff863aa35f>] resched_cpu+0x9f/0xd0 [ 14.954686] hardirqs last disabled at (21002): [<ffffffff863dd0ba>] scx_ops_bypass+0x11a/0x280 [ 14.954735] softirqs last enabled at (20642): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.954782] softirqs last disabled at (20637): [<ffffffff863683d7>] __irq_exit_rcu+0x67/0xd0 [ 14.954829] ---[ end trace 0000000000000000 ]--- [ 15.022283] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 15.092282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 15.149282] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== And with it, the test passes without issue after 1000s of runs: .[root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit DESCRIPTION: Verify we can cleanly exit a scheduler in multiple places OUTPUT: [ 7.412856] sched_ext: BPF scheduler "exit" enabled [ 7.427924] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.466677] sched_ext: BPF scheduler "exit" enabled [ 7.475923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.512803] sched_ext: BPF scheduler "exit" enabled [ 7.532924] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.586809] sched_ext: BPF scheduler "exit" enabled [ 7.595926] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.661923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 7.723923] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== ============================= RESULTS: PASSED: 1 SKIPPED: 0 FAILED: 0 Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class") Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-10-25cxl/test: Improve init-order fidelity relative to real-world systemsDan Williams
The investigation of an initialization failure [1] highlighted that cxl_test does not reflect the init-order of real world systems. The expected order is root/bus first then async probing of the memory devices. Fix up cxl_test to reflect that order. While it did not reproduce the initial bug report (since that is dependent on built-in vs modular builds), it did reveal a separate latent bug in the subsystem's decoder shutdown flow. Fix for that sent separately. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/172964784521.81806.15791069994065969243.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Prevent out-of-order decoder allocationDan Williams
With the recent change to allow out-of-order decoder de-commit it highlights a need to strengthen the in-order decoder commit guarantees. As it stands match_free_decoder() ensures that if 2 regions are racing decoder allocations the one that wins the race will get the lower id decoder, but that still leaves the race to *commit* the decoder. Rather than have this complicated case of "reserved in-order, but may still commit out-of-order", just arrange for the reservation order to match the commit-order. In other words, prevent subsequent allocations until the last reservation is committed. This precludes overlapping region creation events and requires the previous regionN to either move forward to the decoder commit stage or drop its reservation before regionN+1 can move forward. That is, provided that regionN and regionN+1 decode through the same switch port. As a side effect this allows match_free_decoder() to drop its dependency on needing write access to the device_find_child() @data parameter [1]. Reported-by: Zijun Hu <quic_zijuhu@quicinc.com> Closes: http://lore.kernel.org/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964783668.81806.14962699553881333486.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Fix use-after-free, permit out-of-order decoder shutdownDan Williams
In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: <TASK> cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: stable@vger.kernel.org Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/acpi: Ensure ports ready at cxl_acpi_probe() returnDan Williams
In order to ensure root CXL ports are enabled upon cxl_acpi_probe() when the 'cxl_port' driver is built as a module, arrange for the module to be pre-loaded or built-in. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964781969.81806.17276352414854540808.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()Dan Williams
It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Fix CXL port initialization order when the subsystem is built-inDan Williams
When the CXL subsystem is built-in the module init order is determined by Makefile order. That order violates expectations. The expectation is that cxl_acpi and cxl_mem can race to attach. If cxl_acpi wins the race, cxl_mem will find the enabled CXL root ports it needs. If cxl_acpi loses the race it will retrigger cxl_mem to attach via cxl_bus_rescan(). That flow only works if cxl_acpi can assume ports are enabled immediately upon cxl_acpi_probe() return. That in turn can only happen in the CONFIG_CXL_ACPI=y case if the cxl_port driver is registered before cxl_acpi_probe() runs. Fix up the order to prevent initialization failures. Ensure that cxl_port is built-in when cxl_acpi is also built-in, arrange for Makefile order to resolve the subsys_initcall() order of cxl_port and cxl_acpi, and arrange for Makefile order to resolve the device_initcall() (module_init()) order of the remaining objects. As for what contributed to this not being found earlier, the CXL regression environment, cxl_test, builds all CXL functionality as a module to allow to symbol mocking and other dynamic reload tests. As a result there is no regression coverage for the built-in case. Reported-by: Gregory Price <gourry@gourry.net> Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net Tested-by: Gregory Price <gourry@gourry.net> Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Cc: stable@vger.kernel.org Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/172988474904.476062.7961350937442459266.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25drm/i915/xe3lpd: Load DMCGustavo Sousa
Load the DMC for Xe3LPD. Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022155115.50989-1-gustavo.sousa@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
2024-10-25scsi: ufs: core: Fix another deadlock during RTC updatePeter Wang
If ufshcd_rtc_work calls ufshcd_rpm_put_sync() and the pm's usage_count is 0, we will enter the runtime suspend callback. However, the runtime suspend callback will wait to flush ufshcd_rtc_work, causing a deadlock. Replace ufshcd_rpm_put_sync() with ufshcd_rpm_put() to avoid the deadlock. Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support") Cc: stable@vger.kernel.org #6.11.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241024015453.21684-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-25scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy lengthJohn Garry
If the sg_copy_buffer() call returns less than sdebug_sector_size, then we drop out of the copy loop. However, we still report that we copied the full expected amount, which is not proper. Fix by keeping a running total and return that value. Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Reported-by: Colin Ian King <colin.i.king@gmail.com> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241018101655.4207-1-john.g.garry@oracle.com Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-25Merge tag 'v6.12-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Fix init module error caseb - Fix memory allocation error path (for passwords) in mount * tag 'v6.12-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix warning when destroy 'cifs_io_request_pool' smb: client: Handle kstrdup failures for passwords
2024-10-25Merge tag 'fuse-fixes-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix cached size after passthrough writes This fix needed a trivial change in the backing-file API, which resulted in some non-fuse files being touched. - Revert a commit meant as a cleanup but which triggered a WARNING - Remove a stray debug line left-over * tag 'fuse-fixes-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: remove stray debug line Revert "fuse: move initialization of fuse_file to fuse_writepages() instead of in callback" fuse: update inode size after extending passthrough write fs: pass offset and result to backing_file end_write() callback
2024-10-25Merge tag 'nfsd-6.12-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a couple of use-after-free bugs * tag 'nfsd-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net nfsd: fix race between laundromat and free_stateid
2024-10-25Merge tag 'acpi-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI PRM (Platform Runtime Mechanism) issue and add two new DMI quirks, one for an ACPI IRQ override and one for lid switch detection: - Make acpi_parse_prmt() look for EFI_MEMORY_RUNTIME memory regions only to comply with the UEFI specification and make PRM use efi_guid_t instead of guid_t to avoid a compiler warning triggered by that change (Koba Ko, Dan Carpenter) - Add an ACPI IRQ override quirk for LG 16T90SP (Christian Heusel) - Add a lid switch detection quirk for Samsung Galaxy Book2 (Shubham Panwar)" * tag 'acpi-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PRM: Clean up guid type in struct prm_handler_info ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[] ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context
2024-10-25Merge tag 'pm-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Update cpufreq documentation to match the code after recent changes (Christian Loehle), fix a units conversion issue in the CPPC cpufreq driver (liwei), and fix an error check in the dtpm_devfreq power capping driver (Yuan Can)" * tag 'pm-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: CPPC: fix perf_to_khz/khz_to_perf conversion exception powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request() cpufreq: docs: Reflect latency changes in docs
2024-10-25Merge tag 'pci-v6.12-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Hold the rescan lock while adding devices to avoid race with concurrent pwrctl rescan that can lead to a crash (Bartosz Golaszewski) - Avoid binding pwrctl driver to QCom WCN wifi if the DT lacks the necessary PMU regulator descriptions (Bartosz Golaszewski) * tag 'pci-v6.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/pwrctl: Abandon QCom WCN probe on pre-pwrseq device-trees PCI: Hold rescan lock while adding devices during host probe
2024-10-25Merge tag 'fbdev-for-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: - Fix some build warnings and failures with CONFIG_FB_IOMEM_FOPS and CONFIG_FB_DEVICE - Remove the da8xx fbdev driver - Constify struct sbus_mmap_map and fix indentation warning * tag 'fbdev-for-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: wm8505fb: select CONFIG_FB_IOMEM_FOPS fbdev: da8xx: remove the driver fbdev: Constify struct sbus_mmap_map fbdev: nvidiafb: fix inconsistent indentation warning fbdev: sstfb: Make CONFIG_FB_DEVICE optional
2024-10-25Merge tag 'gpio-fixes-for-v6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: "Update MAINTAINERS with a keyword pattern for legacy GPIO API The goal is to alert us to anyone trying to use the deprecated, legacy API (this happens almost every release)" * tag 'gpio-fixes-for-v6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: MAINTAINERS: add a keyword entry for the GPIO subsystem
2024-10-25drm/i915/display: Cover all possible pipes in TP_printk()Gustavo Sousa
Tracepoints that display frame and scanline counters for all pipes were added with commit 1489bba82433 ("drm/i915: Add cxsr toggle tracepoint") and commit 0b2599a43ca9 ("drm/i915: Add pipe enable/disable tracepoints"). At that time, we only had pipes A, B and C. Now that we can also have pipe D, the TP_printk() calls are missing it. As a quick and dirty fix for that, let's define two common macros to be used for the format and values respectively, and also ensure we raise a build bug if more pipes are added to enum pipe. In the future, we should probably have a way of printing information for available pipes only. Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016135300.21428-6-gustavo.sousa@intel.com
2024-10-25drm/i915/display: Do not use ids from enum pipe in TP_printk()Gustavo Sousa
Because much of kernel tracepoints is implemented at the C preprocessor level, C identifiers used in TP_printk() are saved verbatim in the event format, even when they represent compile-time constant values. As an example, we can look at the format for the intel_pipe_enable event: # cat /sys/kernel/debug/tracing/events/i915/intel_pipe_enable/format | grep '^print fmt' print fmt: "dev %s, pipe %c enable, pipe A: frame=%u, scanline=%u, pipe B: frame=%u, scanline=%u, pipe C: frame=%u, scanline=%u", __get_str(dev), REC->pipe_name, REC->frame[PIPE_A], REC->scanline[PIPE_A], REC->frame[PIPE_B], REC->scanline[PIPE_B], REC->frame[PIPE_C], REC->scanline[PIPE_C] We see that PIPE_A, PIPE_B and PIPE_C are pasted directly in the format. Because tools that interact with kernel tracepoints don't know about those ids, they'll endup failing to parse the format or produce corrupted output. For example, we can see below that trace-cmd repeats PIPE_A's frame/scanline counts for all pipes (probably because it evaluates unknown ids as zero): $ trace-cmd report -F intel_pipe_enable | tail -n5 testdisplay-8616 [000] 22048.276758: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=861, scanline=480, pipe B: frame=861, scanline=480, pipe C: frame=861, scanline=480 testdisplay-8616 [001] 22048.490287: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=867, scanline=480, pipe B: frame=867, scanline=480, pipe C: frame=867, scanline=480 testdisplay-8616 [003] 22048.700181: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=872, scanline=400, pipe B: frame=872, scanline=400, pipe C: frame=872, scanline=400 testdisplay-8616 [002] 22049.054220: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=881, scanline=2170, pipe B: frame=881, scanline=2170, pipe C: frame=881, scanline=2170 testdisplay-8616 [002] 22049.166851: intel_pipe_enable: dev 0000:00:02.0, pipe B enable, pipe A: frame=887, scanline=1632, pipe B: frame=887, scanline=1632, pipe C: frame=887, scanline=1632 , while in fact we have different values for each pipe, which can be confirmed with the raw view of the events: $ trace-cmd report -R -F intel_pipe_enable | tail -n5 testdisplay-8616 [000] 22048.276758: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[5d, 03, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[e0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe_name=A testdisplay-8616 [001] 22048.490287: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[63, 03, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[e0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe_name=A testdisplay-8616 [003] 22048.700181: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[68, 03, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[90, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe_name=A testdisplay-8616 [002] 22049.054220: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[71, 03, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[7a, 08, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe_name=A testdisplay-8616 [002] 22049.166851: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[77, 03, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[60, 06, 00, 00, 39, 04, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe_name=B To fix that, we need a fix that looks more like a hack: use macros that result to integer constants instead of enum pipe values. This fixes the issue, but could break if, for whatever unlikely reason, the underlying values in the enum are changed. In the future, we should find a better way to handle this, but for now, the hack took care of the job: $ trace-cmd report -F intel_pipe_enable | tail -n5 testdisplay-9224 [003] 24324.455375: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=1103, scanline=480, pipe B: frame=0, scanline=0, pipe C: frame=0, scanline=0 testdisplay-9224 [002] 24324.669845: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=1109, scanline=480, pipe B: frame=0, scanline=0, pipe C: frame=0, scanline=0 testdisplay-9224 [003] 24324.900105: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=1115, scanline=31, pipe B: frame=0, scanline=0, pipe C: frame=0, scanline=0 testdisplay-9224 [002] 24325.256408: intel_pipe_enable: dev 0000:00:02.0, pipe A enable, pipe A: frame=1124, scanline=2171, pipe B: frame=0, scanline=0, pipe C: frame=0, scanline=0 testdisplay-9224 [003] 24325.380789: intel_pipe_enable: dev 0000:00:02.0, pipe B enable, pipe A: frame=1131, scanline=979, pipe B: frame=1, scanline=1082, pipe C: frame=0, scanline=0 v2: - Statically assert that PIPE_A == _TRACE_PIPE_A. (MattR) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016135300.21428-5-gustavo.sousa@intel.com
2024-10-25drm/i915/display: Store pipe name in trace eventsGustavo Sousa
The first part[1] of the LWN series on using TRACE_EVENT() mentions about TP_printk(): "Do not create new tracepoint-specific helpers, because that will confuse user-space tools that know about the TRACE_EVENT() helper macros but will not know how to handle ones created for individual tracepoints." It seems this is what we ended up doing when using pipe_name() in TP_printk(). For example, the format for the intel_pipe_update_start event is as follows: # cat /sys/kernel/debug/tracing/events/i915/intel_pipe_update_start/format name: intel_pipe_update_start ID: 1136 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:__data_loc char[] dev; offset:8; size:4; signed:0; field:enum pipe pipe; offset:12; size:4; signed:1; field:u32 frame; offset:16; size:4; signed:0; field:u32 scanline; offset:20; size:4; signed:0; field:u32 min; offset:24; size:4; signed:0; field:u32 max; offset:28; size:4; signed:0; print fmt: "dev %s, pipe %c, frame=%u, scanline=%u, min=%u, max=%u", __get_str(dev), ((REC->pipe) + 'A'), REC->frame, REC->scanline, REC->min, REC->max The call to pipe_name(__entry->pipe) is expanted to ((REC->pipe) + 'A') and that's how the format is saved. Even though the output from /sys/kernel/debug/tracing/trace will look correct (because it is generated in the kernel), we will see corrupted lines when using a tool like trace-cmd to view the data. While the output looks correct when looking at /sys/kernel/debug/tracing/trace, we see corrupted lines when viewing the trace data with "trace-cmd report": $ trace-cmd report \ > | sed -n 's/.*dev 0000:00:02\.0, \(pipe .\).*/\1/p' \ > | cat -v | uniq -c 34 pipe ^A , where ^A is a non-printable character. As a fix, let's store the pipe name directly in the event. The fix was done by applying the following sed script: s/__field\s*(\s*enum\s\+pipe\s*,\s*pipe\s*)/__field(char, pipe_name)/ s/__entry\s*->\s*pipe\s*=\s*\([^;]\+\);/__entry->pipe_name = pipe_name(\1);/ s/pipe_name\s*(\s*__entry\s*->\s*pipe\s*)/__entry->pipe_name/ After these changes, using the same example, we have the following: $ trace-cmd report \ > | sed -n 's/.*dev 0000:00:02\.0, \(pipe .\).*/\1/p' \ > | cat -v | sort | uniq -c 396 pipe A 34 pipe B [1] https://lwn.net/Articles/379903/ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016135300.21428-4-gustavo.sousa@intel.com
2024-10-25drm/i915/display: Zero-initialize frame/scanline counts in tracepointsGustavo Sousa
In an upcoming change, we will also add support for logging frame/scanline counts for pipe D in relevant tracepoints. In [1], Matt mentioned the possibility of having garbage in those counts for pipe D on a platform containing only 3 pipes. Indeed, it has been verified that the counts for the extra pipe would not be zero-initialized by the tracing system. Since it is also possible that the same would happen for a fused-off pipe, let's go ahead and add the logic to zero-initialize the arrays now. [1] https://lore.kernel.org/all/20240918224927.GU5091@mdroper-desk1.amr.corp.intel.com/ Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016135300.21428-3-gustavo.sousa@intel.com
2024-10-25drm/i915/display: Fix out-of-bounds access in pipe-related tracepointsGustavo Sousa
Some display trace events use array members to store frame and scanline counts for each pipe. However, those arrays are declared with 3 as the hardcoded size, which cause out-of-bounds access when the trace event is enabled on a platform that contains pipe D. For example, when looking at the last 10 intel_pipe_enable events after running IGT's testdisplay, we see the following on a MTL machine that has pipe D available: $ trace-cmd report -R -F intel_pipe_enable \ > | tail \ > | sed 's,\(frame=.*\) \(scanline=.*\),\n\t \1\n\t\2,' testdisplay-6715 [002] 17591.063491: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[83, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [003] 17591.264742: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[89, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [003] 17591.464541: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[8f, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [001] 17591.695827: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[95, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [000] 17591.915841: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[9a, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [000] 17592.127114: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[a0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [002] 17592.358351: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[a8, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [002] 17592.580467: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[ae, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [000] 17592.950946: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[b8, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-6715 [004] 17593.079597: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[bf, 01, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[00, 00, 00, 00, 3a, 04, 00, 00, 00, 00, 00, 00] pipe=1 Which shows zeros for pipe A's scanline counts. That happens because pipe D's frame counts are overwriting them. Let's fix that by making the arrays bring able to store info for all possible pipes. With the fix, we get the following: $ trace-cmd report -R -F intel_pipe_enable \ > | tail \ > | sed 's,\(frame=.*\) \(scanline=.*\),\n\t \1\n\t\2,' testdisplay-7040 [003] 18067.489565: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[8c, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[8e, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18067.699312: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[92, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[58, 02, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18067.908868: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[98, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[58, 02, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18068.122802: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[9d, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[58, 02, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [003] 18068.331019: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[a2, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[e0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18068.529067: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[a8, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[e0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [003] 18068.742033: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[ae, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[e0, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18068.956229: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[b3, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[1f, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [002] 18069.295322: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[bb, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[7b, 08, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=0 testdisplay-7040 [010] 18069.423527: intel_pipe_enable: dev=0000:00:02.0 frame=ARRAY[c2, 01, 00, 00, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] scanline=ARRAY[d0, 05, 00, 00, 3a, 04, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] pipe=1 Which makes more sense now. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016135300.21428-2-gustavo.sousa@intel.com
2024-10-25Merge tag 'ata-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - Fix the handling of ATA commands that timeout (command that did not receive a completion interrupt within the configured timeout time). Commands that timeout, while also having either the FAILFAST flag set, or the command being a passthrough command, should never be retried. Restore this behavior (as it was before v6.12-rc1). * tag 'ata-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata: Set DID_TIME_OUT for commands that actually timed out
2024-10-25Merge tag 'sound-6.12-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The majority of changes here are about ASoC. There are two core changes in ASoC (the bump of minimal topology ABI version and the fix for references of components in DAPM code), and others are mostly various device-specific fixes for SoundWire, AMD, Intel, SOF, Qualcomm and FSL, in addition to a few usual HD-audio quirks and fixes" * tag 'sound-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits) ALSA: hda/realtek: Update default depop procedure ASoC: qcom: sc7280: Fix missing Soundwire runtime stream alloc ASoC: fsl_micfil: Add sample rate constraint ASoC: rt722-sdca: increase clk_stop_timeout to fix clock stop issue ALSA: hda/tas2781: select CRC32 instead of CRC32_SARWATE ALSA: hda/realtek: Add subwoofer quirk for Acer Predator G9-593 ALSA: firewire-lib: Avoid division by zero in apply_constraint_to_size() ASoC: fsl_micfil: Add a flag to distinguish with different volume control types ASoC: codecs: lpass-rx-macro: fix RXn(rx,n) macro for DSM_CTL and SEC7 regs ASoC: Change my e-mail to gmail ASoC: Intel: soc-acpi: lnl: Add match entry for TM2 laptops ASoC: amd: yc: Fix non-functional mic on ASUS E1404FA ASoC: SOF: Intel: hda: Always clean up link DMA during stop soundwire: intel_ace2x: Send PDI stream number during prepare ASoC: SOF: Intel: hda: Handle prepare without close for non-HDA DAI's ASoC: SOF: ipc4-topology: Do not set ALH node_id for aggregated DAIs MAINTAINERS: Update maintainer list for MICROCHIP ASOC, SSC and MCP16502 drivers ASoC: qcom: Select missing common Soundwire module code on SDM845 ASoC: fsl_esai: change dev_warn to dev_dbg in irq handler ASoC: rsnd: Fix probe failure on HiHope boards due to endpoint parsing ...
2024-10-25Merge tag 'drm-fixes-2024-10-25' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly drm fixes, mostly amdgpu and xe, with minor bridge and an i915 Kconfig fix. Nothing too scary and it seems to be pretty quiet. amdgpu: - ACPI method handling fixes - SMU 14.x fixes - Display idle optimization fix - DP link layer compliance fix - SDMA 7.x fix - PSR-SU fix - SWSMU fix i915: - Fix DRM_I915_GVT_KVMGT dependencies in Kconfig xe: - Increase invalidation timeout to avoid errors in some hosts - Flush worker on timeout - Better handling for force wake failure - Improve argument check on user fence creation - Don't restart parallel queues multiple times on GT reset bridge: - aux: Fix assignment of OF node - tc358767: Add missing of_node_put() in error path" * tag 'drm-fixes-2024-10-25' of https://gitlab.freedesktop.org/drm/kernel: drm/xe: Don't restart parallel queues multiple times on GT reset drm/xe/ufence: Prefetch ufence addr to catch bogus address drm/xe: Handle unreliable MMIO reads during forcewake drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout drm/xe: Enlarge the invalidation timeout from 150 to 500 drm/amdgpu: handle default profile on on devices without fullscreen 3D drm/amd/display: Disable PSR-SU on Parade 08-01 TCON too drm/amdgpu: fix random data corruption for sdma 7 drm/amd/display: temp w/a for DP Link Layer compliance drm/amd/display: temp w/a for dGPU to enter idle optimizations drm/amd/pm: update deep sleep status on smu v14.0.2/3 drm/amd/pm: update overdrive function on smu v14.0.2/3 drm/amd/pm: update the driver-fw interface file for smu v14.0.2/3 drm/amd: Guard against bad data for ATIF ACPI method drm/bridge: tc358767: fix missing of_node_put() in for_each_endpoint_of_node() drm/bridge: Fix assignment of the of_node of the parent to aux bridge i915: fix DRM_I915_GVT_KVMGT dependencies
2024-10-25bcachefs: Fix UAF in bch2_reconstruct_alloc()Kent Overstreet
write_super() -> sb_counters_from_cpu() may reallocate the superblock Reported-by: syzbot+9fc4dac4775d07bcfe34@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-25bcachefs: fix null-ptr-deref in have_stripes()Jeongjun Park
c->btree_roots_known[i].b can be NULL. In this case, a NULL pointer dereference occurs, so you need to add code to check the variable. Reported-by: syzbot+b468b9fef56949c3b528@syzkaller.appspotmail.com Fixes: 7773df19c35f ("bcachefs: metadata version bucket_stripe_sectors") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-25accel/qaic: Add crashdump to SaharaJeffrey Hugo
The Sahara protocol has a crashdump functionality. In the hello exchange, the device can advertise it has a memory dump available for the host to collect. Instead of the device making requests of the host, the host requests data from the device which can be later analyzed. Implement this functionality and utilize the devcoredump framework for handing the dump over to userspace. Similar to how firmware loading in Sahara involves multiple files, crashdump can consist of multiple files for different parts of the dump. Structure these into a single buffer that userspace can parse and extract the original files from. Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com> Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021200355.544126-1-quic_jhugo@quicinc.com
2024-10-25scx: Fix exit selftest to use custom DSQDavid Vernet
In commit 63fb3ec80516 ("sched_ext: Allow only user DSQs for scx_bpf_consume(), scx_bpf_dsq_nr_queued() and bpf_iter_scx_dsq_new()"), we updated the consume path to only accept user DSQs, thus making it invalid to consume SCX_DSQ_GLOBAL. This selftest was doing that, so let's create a custom DSQ and use that instead. The test now passes: [root@virtme-ng sched_ext]# ./runner -t exit ===== START ===== TEST: exit DESCRIPTION: Verify we can cleanly exit a scheduler in multiple places OUTPUT: [ 12.387229] sched_ext: BPF scheduler "exit" enabled [ 12.406064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.453325] sched_ext: BPF scheduler "exit" enabled [ 12.474064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.515241] sched_ext: BPF scheduler "exit" enabled [ 12.532064] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.592063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.654063] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) [ 12.715062] sched_ext: BPF scheduler "exit" disabled (unregistered from BPF) ok 1 exit # ===== END ===== Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-10-25x86: fix whitespace in runtime-const assembler outputLinus Torvalds
The x86 user pointer validation changes made me look at compiler output a lot, and the wrong indentation for the ".popsection" in the generated assembler triggered me. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-25x86: fix user address masking non-canonical speculation issueLinus Torvalds
It turns out that AMD has a "Meltdown Lite(tm)" issue with non-canonical accesses in kernel space. And so using just the high bit to decide whether an access is in user space or kernel space ends up with the good old "leak speculative data" if you have the right gadget using the result: CVE-2020-12965 “Transient Execution of Non-Canonical Accesses“ Now, the kernel surrounds the access with a STAC/CLAC pair, and those instructions end up serializing execution on older Zen architectures, which closes the speculation window. But that was true only up until Zen 5, which renames the AC bit [1]. That improves performance of STAC/CLAC a lot, but also means that the speculation window is now open. Note that this affects not just the new address masking, but also the regular valid_user_address() check used by access_ok(), and the asm version of the sign bit check in the get_user() helpers. It does not affect put_user() or clear_user() variants, since there's no speculative result to be used in a gadget for those operations. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Link: https://lore.kernel.org/all/80d94591-1297-4afb-b510-c665efd37f10@citrix.com/ Link: https://lore.kernel.org/all/20241023094448.GAZxjFkEOOF_DM83TQ@fat_crate.local/ [1] Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-1010.html Link: https://arxiv.org/pdf/2108.10771 Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> # LAM case Fixes: 2865baf54077 ("x86: support user address masking instead of non-speculative conditional") Fixes: 6014bc27561f ("x86-64: make access_ok() independent of LAM") Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-25cxl/events: Fix Trace DRAM Event RecordShiju Jose
CXL spec rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record. Fix decode memory event type field of DRAM Event Record. For e.g. if value is 0x1 it will be reported as an Invalid Address (General Media Event Record - Memory Event Type) instead of Scrub Media ECC Error (DRAM Event Record - Memory Event Type) and so on. Fixes: 2d6c1e6d60ba ("cxl/mem: Trace DRAM Event Record") Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20241014143003.1170-1-shiju.jose@huawei.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25drm/sched: warn about drm_sched_job_init()'s partial initPhilipp Stanner
drm_sched_job_init()'s name suggests that after the function succeeded, parameter "job" will be fully initialized. This is not the case; some members are only later set, notably drm_sched_job.sched by drm_sched_job_arm(). Document that drm_sched_job_init() does not set all struct members. Document the lifetime of drm_sched_job.sched. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023141530.113370-2-pstanner@redhat.com
2024-10-25wifi: iwlwifi: mvm: fix 6 GHz scan constructionJohannes Berg
If more than 255 colocated APs exist for the set of all APs found during 2.4/5 GHz scanning, then the 6 GHz scan construction will loop forever since the loop variable has type u8, which can never reach the number found when that's bigger than 255, and is stored in a u32 variable. Also move it into the loops to have a smaller scope. Using a u32 there is fine, we limit the number of APs in the scan list and each has a limit on the number of RNR entries due to the frame size. With a limit of 1000 scan results, a frame size upper bound of 4096 (really it's more like ~2300) and a TBTT entry size of at least 11, we get an upper bound for the number of ~372k, well in the bounds of a u32. Cc: stable@vger.kernel.org Fixes: eae94cf82d74 ("iwlwifi: mvm: add support for 6GHz") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219375 Link: https://patch.msgid.link/20241023091744.f4baed5c08a1.I8b417148bbc8c5d11c101e1b8f5bf372e17bf2a7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: cfg80211: clear wdev->cqm_config pointer on freeJohannes Berg
When we free wdev->cqm_config when unregistering, we also need to clear out the pointer since the same wdev/netdev may get re-registered in another network namespace, then destroyed later, running this code again, which results in a double-free. Reported-by: syzbot+36218cddfd84b5cc263e@syzkaller.appspotmail.com Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race") Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20241022161742.7c34b2037726.I121b9cdb7eb180802eafc90b493522950d57ee18@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25mac80211: fix user-power when emulating chanctxBen Greear
ieee80211_calc_hw_conf_chan was ignoring the configured user_txpower. If it is set, use it to potentially decrease txpower as requested. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://patch.msgid.link/20241010203954.1219686-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25Revert "wifi: iwlwifi: remove retry loops in start"Emmanuel Grumbach
Revert commit dfdfe4be183b ("wifi: iwlwifi: remove retry loops in start"), it turns out that there's an issue with the PNVM load notification from firmware not getting processed, that this patch has been somewhat successfully papering over. Since this is being reported, revert the loop removal for now. We will later at least clean this up to only attempt to retry if there was a timeout, but currently we don't even bubble up the failure reason to the correct layer, only returning NULL. Fixes: dfdfe4be183b ("wifi: iwlwifi: remove retry loops in start") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20241022092212.4aa82a558a00.Ibdeff9c8f0d608bc97fc42024392ae763b6937b7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: don't add default link in fw restart flowEmmanuel Grumbach
When we add the vif (and its default link) in fw restart we may override the link that already exists. We take care of this but if link 0 is a valid MLO link, then we will re-create a default link on mvmvif->link[0] and we'll loose the real link we had there. In non-MLO, we need to re-create the default link upon the interface creation, this is fine. In MLO, we'll just wait for change_vif_links() to re-build the links. Fixes: bf976c814c86 ("wifi: iwlwifi: mvm: implement link change ops") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.385bfea1b2e9.I4a127312285ccb529cc95cc4edf6fbe1e0a136ad@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: Fix response handling in iwl_mvm_send_recovery_cmd()Daniel Gabay
1. The size of the response packet is not validated. 2. The response buffer is not freed. Resolve these issues by switching to iwl_mvm_send_cmd_status(), which handles both size validation and frees the buffer. Fixes: f130bb75d881 ("iwlwifi: add FW recovery flow") Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.76c73185951e.Id3b6ca82ced2081f5ee4f33c997491d0ebda83f7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: SAR table alignmentAnjaneyulu
SAR table format in ACPI and local data base are different, So modified code to read data properly. Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.f077aced4dee.I4dc618f12d01f7ad19f9f8881f6e09eea77e9a14@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: Use the sync timepoint API in suspendDaniel Gabay
When starting the suspend flow, HOST_D3_START triggers an _async_ firmware dump collection for debugging purposes. The async worker may race with suspend flow and fail to get NIC access, resulting in the following warning: "Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)" Fix this by switching to the sync version to ensure the dump completes before proceeding with the suspend flow, avoiding potential race issues. Signed-off-by: Daniel Gabay <daniel.gabay@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.9aae318cd593.I4b322009f39489c0b1d8893495c887870f73ed9c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmdMiri Korenblit
iwl_mvm_send_ap_tx_power_constraint_cmd is a no-op if the link is not active (we need to know the band etc.) However, for the station case it will be called just before we set the link to active (by calling iwl_mvm_link_changed with the LINK_CONTEXT_MODIFY_ACTIVE bit set in the 'changed' flags and active = true), so it will end up doing nothing. Fix this by calling iwl_mvm_send_ap_tx_power_constraint_cmd before iwl_mvm_link_changed. Fixes: 6b82f4e119d1 ("wifi: iwlwifi: mvm: handle TPE advertised by AP") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.5c235fccd3f1.I2d40dea21e5547eba458565edcb4c354d094d82a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25wifi: iwlwifi: mvm: don't leak a link on AP removalEmmanuel Grumbach
Release the link mapping resource in AP removal. This impacted devices that do not support the MLD API (9260 and down). On those devices, we couldn't start the AP again after the AP has been already started and stopped. Fixes: a8b5d4809b50 ("wifi: iwlwifi: mvm: Configure the link mapping for non-MLD FW") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241010140328.c54c42779882.Ied79e0d6244dc5a372e8b6ffa8ee9c6e1379ec1d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-25Merge branch 'pm-powercap'Rafael J. Wysocki
Merge a dtpm_devfreq power capping driver fix for 6.12-rc5: - Fix a dev_pm_qos_add_request() return value check in __dtpm_devfreq_setup() to prevent it from failing if a positive number is returned (Yuan Can). * pm-powercap: powercap: dtpm_devfreq: Fix error check against dev_pm_qos_add_request()
2024-10-25Merge branches 'acpi-resource' and 'acpi-button'Rafael J. Wysocki
Merge new DMI quirks for 6.12-rc5: - Add an ACPI IRQ override quirk for LG 16T90SP (Christian Heusel). - Add a lid switch detection quirk for Samsung Galaxy Book2 (Shubham Panwar). * acpi-resource: ACPI: resource: Add LG 16T90SP to irq1_level_low_skip_override[] * acpi-button: ACPI: button: Add DMI quirk for Samsung Galaxy Book2 to fix initial lid detection issue
2024-10-25fuse: remove stray debug lineMiklos Szeredi
It wasn't there when the patch was posted for review, but somehow made it into the pull. Link: https://lore.kernel.org/all/20240913104703.1673180-1-mszeredi@redhat.com/ Fixes: efad7153bf93 ("fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-10-25drm/panfrost: Remove unused id_mask from struct panfrost_modelSteven Price
The id_mask field of struct panfrost_model has never been used. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025140008.385081-1-steven.price@arm.com
2024-10-25Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann
Backmerging to get the latest fixes from upstream. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-10-25Merge commit 'bf40167d54d5' into fixesPalmer Dabbelt
This fix is part of a series on for-next, but it fixes broken builds so I'm picking it up as a fix. * commit 'bf40167d54d5': riscv: vdso: Prevent the compiler from inserting calls to memset()