summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-01-02xe/oa: Fix query mode of operation for OAR/OACUmesh Nerlige Ramappa
This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference. 1) Allow a GGTT mapped batch to be submitted to user exec queue For a OA use case, one of the HW registers needs to be modified by submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so that the register is modified in the user's hardware context. In order to do this a batch that is mapped in GGTT, needs to be submitted to the user exec queue. Since all user submissions use q->vm and hence PPGTT, add some plumbing to enable submission of batches mapped in GGTT. v2: ggtt is zero-initialized, so no need to set it false (Matt Brost) 2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set. Setting this bit cause the context image size to change and if not done correct, can cause undesired hangs. Current code uses a separate exec_queue to modify this bit and is error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to the target hardware context to modify the relevant bit. In v2 version, an attempt to submit everything to the user-queue was made, but it failed the unprivileged-single-ctx-counters test. It appears that the OACTXCONTROL must be modified from a remote context. In v3 version, all context specific register configurations were moved to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a cleaner way, since we can now submit all configuration to user exec_queue and the fence handling is simplified. v2: (Matt) - set job->ggtt to true if create job is successful - unlock vm on job error (Ashutosh) - don't wait on job submission - use kernel exec queue where possible v3: (Ashutosh) - Fix checkpatch issues - Remove extra spaces/new-lines - Add Fixes: and Cc: tags - Reset context control bit when OA stream is closed - Submit all config via MI_LOAD_REGISTER_IMMEDIATE (Umesh) - Update commit message for v3 experiment - Squash patches for easier port to stable v4: (Ashutosh) - No need to pass q to xe_oa_submit_bb - Do not support exec queues with width > 1 - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE v5: (Ashutosh) - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> # commit 1 Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-12-31Merge tag 'nvme-6.13-2024-12-31' of git://git.infradead.org/nvme into block-6.13Jens Axboe
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.13 - Fix device specific quirk for PRP list alignment (Robert) - Fix target name overflow (Leo) - Fix target write granularity (Luis) - Fix target sleeping in atomic context (Nilay) - Remove unnecessary tcp queue teardown (Chunguang)" * tag 'nvme-6.13-2024-12-31' of git://git.infradead.org/nvme: nvme-tcp: remove nvme_tcp_destroy_io_queues() nvmet-loop: avoid using mutex in IO hotpath nvmet: propagate npwg topology nvmet: Don't overflow subsysnqn nvme-pci: 512 byte aligned dma pool segment quirk
2024-12-31RDMA/bnxt_re: Fix error recovery sequenceKalesh AP
Fixed to return ENXIO from __send_message_basic_sanity() to indicate that device is in error state. In the case of ERR_DEVICE_DETACHED state, the driver should not post the commands to the firmware as it will time out eventually. Removed bnxt_re_modify_qp() call from bnxt_re_dev_stop() as it is a no-op. Fixes: cc5b9b48d447 ("RDMA/bnxt_re: Recover the device when FW error is detected") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Link: https://patch.msgid.link/20241231025008.2267162-1-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-31RDMA/rtrs: Ensure 'ib_sge list' is accessibleLi Zhijian
Move the declaration of the 'ib_sge list' variable outside the 'always_invalidate' block to ensure it remains accessible for use throughout the function. Previously, 'ib_sge list' was declared within the 'always_invalidate' block, limiting its accessibility, then caused a 'BUG: kernel NULL pointer dereference'[1]. ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2d0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? memcpy_orig+0xd5/0x140 rxe_mr_copy+0x1c3/0x200 [rdma_rxe] ? rxe_pool_get_index+0x4b/0x80 [rdma_rxe] copy_data+0xa5/0x230 [rdma_rxe] rxe_requester+0xd9b/0xf70 [rdma_rxe] ? finish_task_switch.isra.0+0x99/0x2e0 rxe_sender+0x13/0x40 [rdma_rxe] do_task+0x68/0x1e0 [rdma_rxe] process_one_work+0x177/0x330 worker_thread+0x252/0x390 ? __pfx_worker_thread+0x10/0x10 This change ensures the variable is available for subsequent operations that require it. [1] https://lore.kernel.org/linux-rdma/6a1f3e8f-deb0-49f9-bc69-a9b03ecfcda7@fujitsu.com/ Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://patch.msgid.link/20241231013416.1290920-1-lizhijian@fujitsu.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-30net: wwan: t7xx: Fix FSM command timeout issueJinjian Song
When driver processes the internal state change command, it use an asynchronous thread to process the command operation. If the main thread detects that the task has timed out, the asynchronous thread will panic when executing the completion notification because the main thread completion object has been released. BUG: unable to handle page fault for address: fffffffffffffff8 PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:complete_all+0x3e/0xa0 [...] Call Trace: <TASK> ? __die_body+0x68/0xb0 ? page_fault_oops+0x379/0x3e0 ? exc_page_fault+0x69/0xa0 ? asm_exc_page_fault+0x22/0x30 ? complete_all+0x3e/0xa0 fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0xd8/0x110 ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)] ? __pfx_kthread+0x10/0x10 ret_from_fork+0x38/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> [...] CR2: fffffffffffffff8 ---[ end trace 0000000000000000 ]--- Use the reference counter to ensure safe release as Sergey suggests: https://lore.kernel.org/all/da90f64c-260a-4329-87bf-1f9ff20a5951@gmail.com/ Fixes: 13e920d93e37 ("net: wwan: t7xx: Add core components") Signed-off-by: Jinjian Song <jinjian.song@fibocom.com> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Link: https://patch.msgid.link/20241224041552.8711-1-jinjian.song@fibocom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30sky2: Add device ID 11ab:4373 for Marvell 88E8075Pascal Hambourg
A Marvell 88E8075 ethernet controller has this device ID instead of 11ab:4370 and works fine with the sky2 driver. Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30net: mv643xx_eth: fix an OF node reference leakJoe Hattori
Current implementation of mv643xx_eth_shared_of_add_port() calls of_parse_phandle(), but does not release the refcount on error. Call of_node_put() in the error path and in mv643xx_eth_shared_of_remove(). This bug was found by an experimental verification tool that I am developing. Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeupJoshua Washington
Commit ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI") moved XSK TX processing to be part of the RX NAPI. However, that commit did not include triggering the RX NAPI in gve_xsk_wakeup. This is necessary because the TX NAPI only processes TX completions, meaning that a TX wakeup would not actually trigger XSK descriptor processing. Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as the NAPI should be scheduled whether the wakeup is for RX or TX. Fixes: ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://patch.msgid.link/20241221032807.302244-1-pkaligineedi@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30eth: bcmsysport: fix call balance of priv->clk handling routinesVitalii Mordan
Check the return value of clk_prepare_enable to ensure that priv->clk has been successfully enabled. If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume, or bcm_sysport_open, it must not be disabled in any subsequent execution paths. Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources") Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30cdrom: Fix typo, 'devicen' to 'device'Steven Davis
Fix typo in cd_dbg line to add trailing newline character. Signed-off-by: Steven Davis <goldside000@outlook.com> Link: https://lore.kernel.org/lkml/20241229165744.21725-1-goldside000@outlook.com Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/lkml/Z3GV2W_MUOw5BrtR@equinox Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/r/20241230193431.441120-2-phil@philpotter.co.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-30Merge tag 'platform-drivers-x86-v6.13-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: "hp-wmi: - mark 8A15 board for timed OMEN thermal profile mlx-platform: - call pci_dev_put() to balance the refcount thinkpad-acpi: - Add support for hotkey 0x1401" * tag 'platform-drivers-x86-v6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: thinkpad-acpi: Add support for hotkey 0x1401 platform/x86: hp-wmi: mark 8A15 board for timed OMEN thermal profile platform/x86: mlx-platform: call pci_dev_put() to balance the refcount
2024-12-30watchdog: stm32_iwdg: fix error message during driver probeClément Le Goffic
The commit 3ab1663af6c1 ("watchdog: stm32_iwdg: Add pretimeout support") introduces the support for the pre-timeout interrupt. The support for this interrupt is optional but the driver uses the platform_get_irq() which produces an error message during the driver probe if we don't have any `interrupts` property in the DT. Use the platform_get_irq_optional() API to get rid of the error message as this property is optional. Fixes: 3ab1663af6c1 ("watchdog: stm32_iwdg: Add pretimeout support") Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com> Reviewed-by: Marek Vasut <marex@denx.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20241218092227.771133-1-clement.legoffic@foss.st.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2024-12-29Merge tag 'x86-urgent-2024-12-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Fix a hang in the "kernel IBT no ENDBR" self-test that may trigger on FRED systems, caused by incomplete FRED state cleanup in the #CP fault handler - Improve TDX (Coco VM) guest unrecoverable error handling to not potentially leak decrypted memory * tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt: tdx-guest: Just leak decrypted memory on unrecoverable errors x86/fred: Clear WFE in missing-ENDBRANCH #CPs
2024-12-29Merge tag 'irq-urgent-2024-12-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix bogus MSI IRQ setup warning on RISC-V" * tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Handle lack of irqdomain gracefully
2024-12-29Merge tag 'i2c-for-6.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - IMX: fix stop condition in single master mode and add compatible string for errata adherence - Microchip: Add support for proper repeated sends and fix unnecessary NAKs on empty messages, which caused false bus detection * tag 'i2c-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: microchip-core: fix "ghost" detections i2c: microchip-core: actually use repeated sends i2c: imx: add imx7d compatible string for applying erratum ERR007805 i2c: imx: fix missing stop condition in single-master mode
2024-12-29platform/x86: thinkpad-acpi: Add support for hotkey 0x1401Vishnu Sankar
F8 mode key on Lenovo 2025 platforms use a different key code. Adding support for the new keycode 0x1401. Tested on X1 Carbon Gen 13 and X1 2-in-1 Gen 10. Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com> Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Link: https://lore.kernel.org/r/20241227231840.21334-1-vishnuocv@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-29platform/x86: hp-wmi: mark 8A15 board for timed OMEN thermal profileMingcong Bai
The HP OMEN 8 (2022), corresponding to a board ID of 8A15, supports OMEN thermal profile and requires the timed profile quirk. Upon adding this ID to both the omen_thermal_profile_boards and omen_timed_thermal_profile_boards, significant bump in performance can be observed. For instance, SilverBench (https://silver.urih.com/) results improved from ~56,000 to ~69,000, as a result of higher power draws (and thus core frequencies) whilst under load: Package Power: - Before the patch: ~65W (dropping to about 55W under sustained load). - After the patch: ~115W (dropping to about 105W under sustained load). Core Power: - Before: ~60W (ditto above). - After: ~108W (ditto above). Add 8A15 to omen_thermal_profile_boards and omen_timed_thermal_profile_boards to improve performance. Signed-off-by: Xi Xiao <1577912515@qq.com> Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Link: https://lore.kernel.org/r/20241226062207.3352629-1-jeffbai@aosc.io Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-12-29virt: tdx-guest: Just leak decrypted memory on unrecoverable errorsLi RongQing
In CoCo VMs it is possible for the untrusted host to cause set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues. Leak the decrypted memory when set_memory_decrypted() fails, and don't need to print an error since set_memory_decrypted() will call WARN_ONCE(). Fixes: f4738f56d1dc ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240619111801.25630-1-lirongqing%40baidu.com
2024-12-28Merge tag 'block-6.13-20241228' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fix from Jens Axboe: "Just a single fix for ublk setup error handling" * tag 'block-6.13-20241228' of git://git.kernel.dk/linux: ublk: detach gendisk from ublk device if add_disk() fails
2024-12-27nvme-tcp: remove nvme_tcp_destroy_io_queues()Chunguang.xu
Now when destroying the IO queue we call nvme_tcp_stop_io_queues() twice, nvme_tcp_destroy_io_queues() has an unnecessary call. Here we try to remove nvme_tcp_destroy_io_queues() and merge it into nvme_tcp_teardown_io_queues(), simplify the code and align with nvme-rdma, make it easy to maintaince. Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-27nvmet-loop: avoid using mutex in IO hotpathNilay Shroff
Using mutex lock in IO hot path causes the kernel BUG sleeping while atomic. Shinichiro[1], first encountered this issue while running blktest nvme/052 shown below: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 996, name: (udev-worker) preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 2 locks held by (udev-worker)/996: #0: ffff8881004570c8 (mapping.invalidate_lock){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x155/0x5c0 #1: ffffffff8607eaa0 (rcu_read_lock){....}-{1:2}, at: blk_mq_flush_plug_list+0xa75/0x1950 CPU: 2 UID: 0 PID: 996 Comm: (udev-worker) Not tainted 6.12.0-rc3+ #339 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6a/0x90 __might_resched.cold+0x1f7/0x23d ? __pfx___might_resched+0x10/0x10 ? vsnprintf+0xdeb/0x18f0 __mutex_lock+0xf4/0x1220 ? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] ? __pfx_vsnprintf+0x10/0x10 ? __pfx___mutex_lock+0x10/0x10 ? snprintf+0xa5/0xe0 ? xas_load+0x1ce/0x3f0 ? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] ? __pfx_nvmet_subsys_nsid_exists+0x10/0x10 [nvmet] nvmet_req_find_ns+0x24e/0x300 [nvmet] nvmet_req_init+0x694/0xd40 [nvmet] ? blk_mq_start_request+0x11c/0x750 ? nvme_setup_cmd+0x369/0x990 [nvme_core] nvme_loop_queue_rq+0x2a7/0x7a0 [nvme_loop] ? __pfx___lock_acquire+0x10/0x10 ? __pfx_nvme_loop_queue_rq+0x10/0x10 [nvme_loop] __blk_mq_issue_directly+0xe2/0x1d0 ? __pfx___blk_mq_issue_directly+0x10/0x10 ? blk_mq_request_issue_directly+0xc2/0x140 blk_mq_plug_issue_direct+0x13f/0x630 ? lock_acquire+0x2d/0xc0 ? blk_mq_flush_plug_list+0xa75/0x1950 blk_mq_flush_plug_list+0xa9d/0x1950 ? __pfx_blk_mq_flush_plug_list+0x10/0x10 ? __pfx_mpage_readahead+0x10/0x10 __blk_flush_plug+0x278/0x4d0 ? __pfx___blk_flush_plug+0x10/0x10 ? lock_release+0x460/0x7a0 blk_finish_plug+0x4e/0x90 read_pages+0x51b/0xbc0 ? __pfx_read_pages+0x10/0x10 ? lock_release+0x460/0x7a0 page_cache_ra_unbounded+0x326/0x5c0 force_page_cache_ra+0x1ea/0x2f0 filemap_get_pages+0x59e/0x17b0 ? __pfx_filemap_get_pages+0x10/0x10 ? lock_is_held_type+0xd5/0x130 ? __pfx___might_resched+0x10/0x10 ? find_held_lock+0x2d/0x110 filemap_read+0x317/0xb70 ? up_write+0x1ba/0x510 ? __pfx_filemap_read+0x10/0x10 ? inode_security+0x54/0xf0 ? selinux_file_permission+0x36d/0x420 blkdev_read_iter+0x143/0x3b0 vfs_read+0x6ac/0xa20 ? __pfx_vfs_read+0x10/0x10 ? __pfx_vm_mmap_pgoff+0x10/0x10 ? __pfx___seccomp_filter+0x10/0x10 ksys_read+0xf7/0x1d0 ? __pfx_ksys_read+0x10/0x10 do_syscall_64+0x93/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f565bd1ce11 Code: 00 48 8b 15 09 90 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 d0 ad 01 00 f3 0f 1e fa 80 3d 35 12 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec RSP: 002b:00007ffd6e7a20c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f565bd1ce11 RDX: 0000000000001000 RSI: 00007f565babb000 RDI: 0000000000000014 RBP: 00007ffd6e7a2130 R08: 00000000ffffffff R09: 0000000000000000 R10: 0000556000bfa610 R11: 0000000000000246 R12: 000000003ffff000 R13: 0000556000bfa5b0 R14: 0000000000000e00 R15: 0000556000c07328 </TASK> Apparently, the above issue is caused due to using mutex lock while we're in IO hot path. It's a regression caused with commit 505363957fad ("nvmet: fix nvme status code when namespace is disabled"). The mutex ->su_mutex is used to find whether a disabled nsid exists in the config group or not. This is to differentiate between a nsid that is disabled vs non-existent. To mitigate the above issue, we've worked upon a fix[2] where we now insert nsid in subsys Xarray as soon as it's created under config group and later when that nsid is enabled, we add an Xarray mark on it and set ns->enabled to true. The Xarray mark is useful while we need to loop through all enabled namepsaces under a subsystem using xa_for_each_marked() API. If later a nsid is disabled then we clear Xarray mark from it and also set ns->enabled to false. It's only when nsid is deleted from the config group we delete it from the Xarray. So with this change, now we could easily differentiate a nsid is disabled (i.e. Xarray entry for ns exists but ns->enabled is set to false) vs non- existent (i.e.Xarray entry for ns doesn't exist). Link: https://lore.kernel.org/linux-nvme/20241022070252.GA11389@lst.de/ [2] Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-nvme/tqcy3sveity7p56v7ywp7ssyviwcb3w4623cnxj3knoobfcanq@yxgt2mjkbkam/ [1] Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled") Fix-suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-27nvmet: propagate npwg topologyLuis Chamberlain
Ensure we propagate npwg to the target as well instead of assuming its the same logical blocks per physical block. This ensures devices with large IUs information properly propagated on the target. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-27nvmet: Don't overflow subsysnqnLeo Stone
nvmet_root_discovery_nqn_store treats the subsysnqn string like a fixed size buffer, even though it is dynamically allocated to the size of the string. Create a new string with kstrndup instead of using the old buffer. Reported-by: syzbot+ff4aab278fa7e27e0f9e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ff4aab278fa7e27e0f9e Fixes: 95409e277d83 ("nvmet: implement unique discovery NQN") Signed-off-by: Leo Stone <leocstone@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-12-27Merge tag 'for-v6.13-rc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: - fix potential array out of bounds access in gpio-charger - cros_charge-control: - fix concurrent sysfs access - allow start_threshold == end_threshold - workaround limited v2 charge threshold API - bq24296: fix vbus regulator handling * tag 'for-v6.13-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: bq24190: Fix BQ24296 Vbus regulator support power: supply: cros_charge-control: hide start threshold on v2 cmd power: supply: cros_charge-control: allow start_threshold == end_threshold power: supply: cros_charge-control: add mutex for driver data power: supply: gpio-charger: Fix set charge current limits
2024-12-26Merge tag 'dmaengine-fix-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Bunch of minor driver fixes for drivers in this cycle: - Kernel doc warning documentation fixes - apple driver fix for register access - amd driver dropping private dma_ops - freescale cleanup path fix - refcount fix for mv_xor driver - null pointer deref fix for at_xdmac driver - GENMASK to GENMASK_ULL fix for loongson2 apb driver - Tegra driver fix for correcting dma status" * tag 'dmaengine-fix-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: tegra: Return correct DMA status when paused dmaengine: mv_xor: fix child node refcount handling in early exit dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd() dmaengine: amd: qdma: Remove using the private get and set dma_ops APIs dmaengine: apple-admac: Avoid accessing registers in probe linux/dmaengine.h: fix a few kernel-doc warnings dmaengine: loongson2-apb: Change GENMASK to GENMASK_ULL dmaengine: dw: Select only supported masters for ACPI devices dmaengine: at_xdmac: avoid null_prt_deref in at_xdmac_prep_dma_memset
2024-12-26Merge tag 'phy-fixes-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: "A few core API fixes for devm calls and bunch of driver fixes as usual: - devm_phy_xxx fixes for few APIs in the phy core - qmp driver register name config - init sequence fix for usb driver - rockchip driver setting drvdata correctly in samsung hdptx and reset fix for naneng combophy - regulator dependency fix for mediatek hdmi driver - overflow assertion fix for stm32 driver" * tag 'phy-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: mediatek: phy-mtk-hdmi: add regulator dependency phy: freescale: fsl-samsung-hdmi: Fix 64-by-32 division cocci warnings phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() phy: core: Fix an OF node refcount leakage in _of_phy_get() phy: core: Fix that API devm_phy_destroy() fails to destroy the phy phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider phy: core: Fix that API devm_phy_put() fails to release the phy phy: rockchip: samsung-hdptx: Set drvdata before enabling runtime PM phy: stm32: work around constant-value overflow assertion phy: qcom-qmp: Fix register name in RX Lane config of SC8280XP phy: rockchip: naneng-combphy: fix phy reset phy: usb: Toggle the PHY power during init
2024-12-26Merge tag 'chrome-platform-for-6.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform fix from Tzung-Bi Shih: - Fix wrong product names for early Framework Laptops * tag 'chrome-platform-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_lpc: fix product identity for early Framework Laptops
2024-12-26ublk: detach gendisk from ublk device if add_disk() failsMing Lei
Inside ublk_abort_requests(), gendisk is grabbed for aborting all inflight requests. And ublk_abort_requests() is called when exiting the uring context or handling timeout. If add_disk() fails, the gendisk may have been freed when calling ublk_abort_requests(), so use-after-free can be caused when getting disk's reference in ublk_abort_requests(). Fixes the bug by detaching gendisk from ublk device if add_disk() fails. Fixes: bd23f6c2c2d0 ("ublk: quiesce request queue when aborting queue") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241225110640.351531-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-26i2c: microchip-core: fix "ghost" detectionsConor Dooley
Running i2c-detect currently produces an output akin to: 0 1 2 3 4 5 6 7 8 9 a b c d e f 00: 08 -- 0a -- 0c -- 0e -- 10: 10 -- 12 -- 14 -- 16 -- UU 19 -- 1b -- 1d -- 1f 20: -- 21 -- 23 -- 25 -- 27 -- 29 -- 2b -- 2d -- 2f 30: -- -- -- -- -- -- -- -- 38 -- 3a -- 3c -- 3e -- 40: 40 -- 42 -- 44 -- 46 -- 48 -- 4a -- 4c -- 4e -- 50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 60: 60 -- 62 -- 64 -- 66 -- 68 -- 6a -- 6c -- 6e -- 70: 70 -- 72 -- 74 -- 76 -- This happens because for an i2c_msg with a len of 0 the driver will mark the transmission of the message as a success once the START has been sent, without waiting for the devices on the bus to respond with an ACK/NAK. Since i2cdetect seems to run in a tight loop over all addresses the NAK is treated as part of the next test for the next address. Delete the fast path that marks a message as complete when idev->msg_len is zero after sending a START/RESTART since this isn't a valid scenario. CC: stable@vger.kernel.org Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20241218-outbid-encounter-b2e78b1cc707@spud Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-12-26i2c: microchip-core: actually use repeated sendsConor Dooley
At present, where repeated sends are intended to be used, the i2c-microchip-core driver sends a stop followed by a start. Lots of i2c devices must not malfunction in the face of this behaviour, because the driver has operated like this for years! Try to keep track of whether or not a repeated send is required, and suppress sending a stop in these cases. CC: stable@vger.kernel.org Fixes: 64a6f1c4987e ("i2c: add support for microchip fpga i2c controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20241218-football-composure-e56df2461461@spud Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-12-25i2c: imx: add imx7d compatible string for applying erratum ERR007805Carlos Song
Compatible string "fsl,imx7d-i2c" is not exited at i2c-imx driver compatible string table, at the result, "fsl,imx21-i2c" will be matched, but it will cause erratum ERR007805 not be applied in fact. So Add "fsl,imx7d-i2c" compatible string in i2c-imx driver to apply the erratum ERR007805(https://www.nxp.com/docs/en/errata/IMX7DS_3N09P.pdf). " ERR007805 I2C: When the I2C clock speed is configured for 400 kHz, the SCL low period violates the I2C spec of 1.3 uS min Description: When the I2C module is programmed to operate at the maximum clock speed of 400 kHz (as defined by the I2C spec), the SCL clock low period violates the I2C spec of 1.3 uS min. The user must reduce the clock speed to obtain the SCL low time to meet the 1.3us I2C minimum required. This behavior means the SoC is not compliant to the I2C spec at 400kHz. Workaround: To meet the clock low period requirement in fast speed mode, SCL must be configured to 384KHz or less. " "fsl,imx7d-i2c" already is documented in binding doc. This erratum fix has been included in imx6_i2c_hwdata and it is the same in all I.MX6/7/8, so just reuse it. Fixes: 39c025721d70 ("i2c: imx: Implement errata ERR007805 or e7805 bus frequency limit") Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: Carlos Song <carlos.song@nxp.com> Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Fixes: 39c025721d70 ("i2c: imx: Implement errata ERR007805 or e7805 bus frequency limit") Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20241218044238.143414-1-carlos.song@nxp.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-12-25i2c: imx: fix missing stop condition in single-master modeStefan Eichenberger
A regression was introduced with the implementation of single-master mode, preventing proper stop conditions from being generated. Devices that require a valid stop condition, such as EEPROMs, fail to function correctly as a result. The issue only affects devices with the single-master property enabled. This commit resolves the issue by re-enabling I2C bus busy bit (IBB) polling for single-master mode when generating a stop condition. The fix further ensures that the i2c_imx->stopped flag is cleared at the start of each transfer, allowing the stop condition to be correctly generated in i2c_imx_stop(). According to the reference manual (IMX8MMRM, Rev. 2, 09/2019, page 5270), polling the IBB bit to determine if the bus is free is only necessary in multi-master mode. Consequently, the IBB bit is not polled for the start condition in single-master mode. Fixes: 6692694aca86 ("i2c: imx: do not poll for bus busy in single master mode") Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://lore.kernel.org/r/20241216151829.74056-1-eichest@gmail.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-12-25platform/chrome: cros_ec_lpc: fix product identity for early Framework LaptopsDustin L. Howett
The product names for the Framework Laptop (12th and 13th Generation Intel Core) are incorrect as of 62be134abf42. Fixes: 62be134abf42 ("platform/chrome: cros_ec_lpc: switch primary DMI data for Framework Laptop") Cc: stable@vger.kernel.org # 6.12.x Signed-off-by: Dustin L. Howett <dustin@howett.net> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241224-platform-chrome-cros_ec_lpc-fix-product-identity-for-early-framework-laptops-v1-1-0d31d6e1d22c@howett.net Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
2024-12-24Merge tag 'mtd/fixes-for-6.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd fixes from Miquel Raynal: "Four minor fixes for NAND controller drivers (cleanup path, double actions, and W=1 warning) as well as a cast to avoid overflows in an mtd device driver" * tag 'mtd/fixes-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: omap2: Fix build warnings with W=1 mtd: rawnand: arasan: Fix missing de-registration of NAND mtd: rawnand: arasan: Fix double assertion of chip-select mtd: diskonchip: Cast an operand to prevent potential overflow mtd: rawnand: fix double free in atmel_pmecc_create_user()
2024-12-24phy: mediatek: phy-mtk-hdmi: add regulator dependencyArnd Bergmann
The driver no longer builds when regulator support is unavailable: arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi.o: in function `mtk_hdmi_phy_register_regulators': phy-mtk-hdmi.c:(.text.unlikely+0x3e): undefined reference to `devm_regulator_register' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_is_enabled': phy-mtk-hdmi-mt8195.c:(.text+0x326): undefined reference to `rdev_get_drvdata' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_disable': phy-mtk-hdmi-mt8195.c:(.text+0x346): undefined reference to `rdev_get_drvdata' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_enable': Fixes: 49393b2da1cd ("phy: mediatek: phy-mtk-hdmi: Register PHY provided regulator") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20241213083056.2596499-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: freescale: fsl-samsung-hdmi: Fix 64-by-32 division cocci warningsAdam Ford
The Kernel test robot returns the following warning: do_div() does a 64-by-32 division, please consider using div64_ul instead. To prevent the 64-by-32 divsion, consolidate both the multiplication and the do_div into one line which explicitly uses u64 sizes. Fixes: 1951dbb41d1d ("phy: freescale: fsl-samsung-hdmi: Support dynamic integer") Signed-off-by: Adam Ford <aford173@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412091243.fSObwwPi-lkp@intel.com/ Link: https://lore.kernel.org/r/20241215220555.99113-1-aford173@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup()Zijun Hu
For macro for_each_child_of_node(parent, child), refcount of @child has been increased before entering its loop body, so normally needs to call of_node_put(@child) before returning from the loop body to avoid refcount leakage. of_phy_provider_lookup() has such usage but does not call of_node_put() before returning, so cause leakage of the OF node refcount. Fix by simply calling of_node_put() before returning from the loop body. The APIs affected by this issue are shown below since they indirectly invoke problematic of_phy_provider_lookup(). phy_get() of_phy_get() devm_phy_get() devm_of_phy_get() devm_of_phy_get_by_index() Fixes: 2a4c37016ca9 ("phy: core: Fix of_phy_provider_lookup to return PHY provider for sub node") Cc: stable@vger.kernel.org Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-5-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: core: Fix an OF node refcount leakage in _of_phy_get()Zijun Hu
_of_phy_get() will directly return when suffers of_device_is_compatible() error, but it forgets to decrease refcount of OF node @args.np before error return, the refcount was increased by previous of_parse_phandle_with_args() so causes the OF node's refcount leakage. Fix by decreasing the refcount via of_node_put() before the error return. Fixes: b7563e2796f8 ("phy: work around 'phys' references to usb-nop-xceiv devices") Cc: stable@vger.kernel.org Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-4-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: core: Fix that API devm_phy_destroy() fails to destroy the phyZijun Hu
For devm_phy_destroy(), its comment says it needs to invoke phy_destroy() to destroy the phy, but it will not actually invoke the function since devres_destroy() does not call devm_phy_consume(), and the missing phy_destroy() call will cause that the phy fails to be destroyed. Fortunately, the faulty API has not been used by current kernel tree. Fix by using devres_release() instead of devres_destroy() within the API. Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-3-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: core: Fix that API devm_of_phy_provider_unregister() fails to ↵Zijun Hu
unregister the phy provider For devm_of_phy_provider_unregister(), its comment says it needs to invoke of_phy_provider_unregister() to unregister the phy provider, but it will not actually invoke the function since devres_destroy() does not call devm_phy_provider_release(), and the missing of_phy_provider_unregister() call will cause: - The phy provider fails to be unregistered. - Leak both memory and the OF node refcount. Fortunately, the faulty API has not been used by current kernel tree. Fix by using devres_release() instead of devres_destroy() within the API. Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/stable/20241213-phy_core_fix-v6-2-40ae28f5015a%40quicinc.com Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-2-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24phy: core: Fix that API devm_phy_put() fails to release the phyZijun Hu
For devm_phy_put(), its comment says it needs to invoke phy_put() to release the phy, but it will not actually invoke the function since devres_destroy() does not call devm_phy_release(), and the missing phy_put() call will cause: - The phy fails to be released. - devm_phy_put() can not fully undo what API devm_phy_get() does. - Leak refcount of both the module and device for below typical usage: devm_phy_get(); // or its variant ... err = do_something(); if (err) goto err_out; ... err_out: devm_phy_put(); // leak refcount here The file(s) affected by this issue are shown below since they have such typical usage. drivers/pci/controller/cadence/pcie-cadence.c drivers/net/ethernet/ti/am65-cpsw-nuss.c Fix by using devres_release() instead of devres_destroy() within the API. Fixes: ff764963479a ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Krzysztof Wilczyński <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-1-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24dmaengine: tegra: Return correct DMA status when pausedAkhil R
Currently, the driver does not return the correct DMA status when a DMA pause is issued by the client drivers. This causes GPCDMA users to assume that DMA is still running, while in reality, the DMA is paused. Return DMA_PAUSED for tx_status() if the channel is paused in the middle of a transfer. Fixes: ee17028009d4 ("dmaengine: tegra: Add tegra gpcdma driver") Cc: stable@vger.kernel.org Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Link: https://lore.kernel.org/r/20241212124412.5650-1-kkartik@nvidia.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24dmaengine: mv_xor: fix child node refcount handling in early exitJavier Carrasco
The for_each_child_of_node() loop requires explicit calls to of_node_put() to decrement the child's refcount upon early exits (break, goto, return). Add the missing calls in the two early exits before the goto instructions. Cc: stable@vger.kernel.org Fixes: f7d12ef53ddf ("dma: mv_xor: add Device Tree binding") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241011-dma_mv_xor_of_node_put-v1-1-3c2de819f463@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-24drm/i915/dg1: Fix power gate sequence.Rodrigo Vivi
sub-pipe PG is not present on DG1. Setting these bits can disable other power gates and cause GPU hangs on video playbacks. VLK: 16314, 4304 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381 Fixes: 85a12d7eb8fe ("drm/i915/tgl: Fix Media power gate sequence.") Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241219210019.70532-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit de7061947b4ed4be857d452c60d5fb795831d79e) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-12-24drm/i915/cx0_phy: Fix C10 pll programming sequenceSuraj Kandpal
According to spec VDR_CUSTOM_WIDTH register gets programmed after pll specific VDR registers and TX Lane programming registers are done. Moreover we only program into C10_VDR_CONTROL1 to update config and setup master lane once all VDR registers are written into. Bspec: 67636 Fixes: 51390cc0e00a ("drm/i915/mtl: Add Support for C10 PHY message bus and pll programming") Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241216181554.2861381-1-suraj.kandpal@intel.com (cherry picked from commit f9d418552ba1e3a0e92487ff82eb515dab7516c0) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-12-24RDMA/rxe: Remove the direct link to net_deviceZhu Yanjun
The similar patch in siw is in the link: https://git.kernel.org/rdma/rdma/c/16b87037b48889 This problem also occurred in RXE. The following analyze this problem. In the following Call Traces: " BUG: KASAN: slab-use-after-free in dev_get_flags+0x188/0x1d0 net/core/dev.c:8782 Read of size 4 at addr ffff8880554640b0 by task kworker/1:4/5295 CPU: 1 UID: 0 PID: 5295 Comm: kworker/1:4 Not tainted 6.12.0-rc3-syzkaller-00399-g9197b73fd7bb #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: infiniband ib_cache_event_task Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 dev_get_flags+0x188/0x1d0 net/core/dev.c:8782 rxe_query_port+0x12d/0x260 drivers/infiniband/sw/rxe/rxe_verbs.c:60 __ib_query_port drivers/infiniband/core/device.c:2111 [inline] ib_query_port+0x168/0x7d0 drivers/infiniband/core/device.c:2143 ib_cache_update+0x1a9/0xb80 drivers/infiniband/core/cache.c:1494 ib_cache_event_task+0xf3/0x1e0 drivers/infiniband/core/cache.c:1568 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa65/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f2/0x390 kernel/kthread.c:389 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> " 1). In the link [1], " infiniband syz2: set down " This means that on 839.350575, the event ib_cache_event_task was sent andi queued in ib_wq. 2). In the link [1], " team0 (unregistering): Port device team_slave_0 removed " It indicates that before 843.251853, the net device should be freed. 3). In the link [1], " BUG: KASAN: slab-use-after-free in dev_get_flags+0x188/0x1d0 " This means that on 850.559070, this slab-use-after-free problem occurred. In all, on 839.350575, the event ib_cache_event_task was sent and queued in ib_wq, before 843.251853, the net device veth was freed. on 850.559070, this event was executed, and the mentioned freed net device was called. Thus, the above call trace occurred. [1] https://syzkaller.appspot.com/x/log.txt?x=12e7025f980000 Reported-by: syzbot+4b87489410b4efd181bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b87489410b4efd181bf Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20241220222325.2487767-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-24dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd()Joe Hattori
Current implementation of fsl_edma3_attach_pd() does not provide a cleanup path, resulting in a memory leak. For example, dev_pm_domain_detach() is not called after dev_pm_domain_attach_by_id(), and the device link created with the DL_FLAG_STATELESS is not released explicitly. Therefore, provide a cleanup function fsl_edma3_detach_pd() and call it upon failure. Also add a devm_add_action_or_reset() call with this function after a successful fsl_edma3_attach_pd(). Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20241221075712.3297200-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-12-23net/mlx5e: Keep netdev when leave switchdev for devlink set legacy onlyJianbo Liu
In the cited commit, when changing from switchdev to legacy mode, uplink representor's netdev is kept, and its profile is replaced with nic profile, so netdev is detached from old profile, then attach to new profile. During profile change, the hardware resources allocated by the old profile will be cleaned up. However, the cleanup is relying on the related kernel modules. And they may need to flush themselves first, which is triggered by netdev events, for example, NETDEV_UNREGISTER. However, netdev is kept, or netdev_register is called after the cleanup, which may cause troubles because the resources are still referred by kernel modules. The same process applies to all the caes when uplink is leaving switchdev mode, including devlink eswitch mode set legacy, driver unload and devlink reload. For the first one, it can be blocked and returns failure to users, whenever possible. But it's hard for the others. Besides, the attachment to nic profile is unnecessary as the netdev will be unregistered anyway for such cases. So in this patch, the original behavior is kept only for devlink eswitch set mode legacy. For the others, moves netdev unregistration before the profile change. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5e: Skip restore TC rules for vport rep without loaded flagJianbo Liu
During driver unload, unregister_netdev is called after unloading vport rep. So, the mlx5e_rep_priv is already freed while trying to get rpriv->netdev, or walk rpriv->tc_ht, which results in use-after-free. So add the checking to make sure access the data of vport rep which is still loaded. Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5e: macsec: Maintain TX SA from encoding_saDragos Tatulea
In MACsec, it is possible to create multiple active TX SAs on a SC, but only one such SA can be used at a time for transmission. This SA is selected through the encoding_sa link parameter. When there are 2 or more active TX SAs configured (encoding_sa=0): ip macsec add macsec0 tx sa 0 pn 1 on key 00 <KEY1> ip macsec add macsec0 tx sa 1 pn 1 on key 00 <KEY2> ... the traffic should be still sent via TX SA 0 as the encoding_sa was not changed. However, the driver ignores the encoding_sa and overrides it to SA 1 by installing the flow steering id of the newly created TX SA into the SCI -> flow steering id hash map. The future packet tx descriptors will point to the incorrect flow steering rule (SA 1). This patch fixes the issue by avoiding the creation of the flow steering rule for an active TX SA that is not the encoding_sa. The driver side tx_sa object and the FW side macsec object are still created. When the encoding_sa link parameter is changed to another active TX SA, only the new flow steering rule will be created in the mlx5e_macsec_upd_txsa() handler. Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>