Age  Commit message  Author
2025-05-14  irqchip: Drop MSI_CHIP_FLAG_SET_ACK from unsuspecting MSI drivers  Marc Zyngier
Commit 1c000dcaad2be ("irqchip/irq-msi-lib: Optionally set default irq_eoi()/irq_ack()") added blanket MSI_CHIP_FLAG_SET_ACK flags, irrespective of whether the underlying irqchip required it or not. Drop it from a number of drivers that do not require it. Fixes: 1c000dcaad2be ("irqchip/irq-msi-lib: Optionally set default irq_eoi()/irq_ack()") Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-6-maz@kernel.org
2025-05-14  LoongArch: uprobes: Remove redundant code about resume_era  Tiezhu Yang
arch_uprobe_skip_sstep() returns true if instruction was emulated, that is to say, there is no need to single step for the emulated instructions. regs->csr_era will point to the destination address directly after the exception, so the resume_era related code is redundant, just remove them. Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14  LoongArch: uprobes: Remove user_{en,dis}able_single_step()  Tiezhu Yang
When running the "perf probe" and "perf stat" test cases against some cryptographic algorithms, the output shows "Trace/breakpoint trap". This is because uprobes on LoongArch use a software single-step breakpoint, so there is no need for hardware single-step. Remove the related calls to user_{en,dis}able_single_step() for uprobes on LoongArch. How to reproduce: make sure CONFIG_UPROBE_EVENTS is set and openssl supports the sm2 algorithm, then execute the following commands:
  cd tools/perf && make
  ./perf probe -x /usr/lib64/libcrypto.so BN_mod_mul_montgomery
  ./perf stat -e probe_libcrypto:BN_mod_mul_montgomery openssl speed sm2
Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14  LoongArch: Save and restore CSR.CNTC for hibernation  Huacai Chen
Save and restore CSR.CNTC for hibernation which is similar to suspend. For host this is unnecessary because sched clock is ensured continuous, but for kvm guest sched clock isn't enough because rdtime.d should also be continuous. Host::rdtime.d = Host::CSR.CNTC + counter Guest::rdtime.d = Host::CSR.CNTC + Host::CSR.GCNTC + Guest::CSR.CNTC + counter so, Guest::rdtime.d = Host::rdtime.d + Host::CSR.GCNTC + Guest::CSR.CNTC To ensure Guest::rdtime.d continuous, Host::rdtime.d should be at first continuous, while Host::CSR.GCNTC / Guest::CSR.CNTC is maintained by KVM. Cc: stable@vger.kernel.org Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14  LoongArch: Move __arch_cpu_idle() to .cpuidle.text section  Huacai Chen
Now arch_cpu_idle() is annotated with __cpuidle which means it is in the .cpuidle.text section, but __arch_cpu_idle() isn't. Thus, fix the missing .cpuidle.text section assignment for __arch_cpu_idle() in order to correct backtracing with nmi_backtrace(). The principle is similar to the commit 97c8580e85cf81c ("MIPS: Annotate cpu_wait implementations with __cpuidle") Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14  LoongArch: Fix MAX_REG_OFFSET calculation  Huacai Chen
Fix MAX_REG_OFFSET calculation, make it point to the last register in 'struct pt_regs' and not to the marker itself, which could allow regs_get_register() to return an invalid offset. Cc: stable@vger.kernel.org Fixes: 803b0fc5c3f2baa6e5 ("LoongArch: Add process management") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
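The off-by-one described in this commit can be illustrated with a minimal user-space sketch. The struct `demo_regs`, its `end_marker` member, and the `DEMO_*` macros are hypothetical stand-ins and not the actual LoongArch definitions; the only point is that the offset of an end marker lies one slot past the last real register.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for 'struct pt_regs'. */
struct demo_regs {
	unsigned long regs[4];
	unsigned long era;          /* last real register */
	unsigned long end_marker[]; /* end marker, holds no register */
};

/* Old style: points at the marker itself, one slot too far. */
#define DEMO_MAX_REG_OFFSET_OLD \
	offsetof(struct demo_regs, end_marker)

/* Fixed style: points at the last real register. */
#define DEMO_MAX_REG_OFFSET_FIXED \
	(offsetof(struct demo_regs, end_marker) - sizeof(unsigned long))

/* Bounds check of the kind a register accessor might perform. */
static int demo_offset_is_valid(size_t offset, size_t max_offset)
{
	return offset <= max_offset;
}
```

With the old macro, the marker's own offset passes the bounds check; with the fixed macro it is rejected.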
2025-05-14  LoongArch: Prevent cond_resched() occurring within kernel-fpu  Tianyang Zhang
When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/ CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely acts as a barrier(). However, in these cases cond_resched() can still trigger a context switch and modify the CSR.EUEN, resulting in do_fpu() exception being activated within the kernel-fpu critical sections, as demonstrated in the following path: dcn32_calculate_wm_and_dlg() DC_FP_START() dcn32_calculate_wm_and_dlg_fpu() dcn32_find_dummy_latency_index_for_fw_based_mclk_switch() dcn32_internal_validate_bw() dcn32_enable_phantom_stream() dc_create_stream_for_sink() kzalloc(GFP_KERNEL) __kmem_cache_alloc_node() __cond_resched() DC_FP_END() This patch is similar to commit d02198550423a0b (x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs). It uses local_bh_disable() instead of preempt_disable() for non-RT kernels so it can avoid the cond_resched() issue, and also extend the kernel-fpu application scenarios to the softirq context. Cc: stable@vger.kernel.org Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-14  dmaengine: fsl-edma: Fix return code for unhandled interrupts  Stefan Wahren
For fsl,imx93-edma4, two DMA channels share the same interrupt. So if fsl_edma3_tx_handler is called for the "wrong" channel, the return code must be IRQ_NONE. This signals that the interrupt wasn't handled. Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Joy Zou <joy.zou@nxp.com> Link: https://lore.kernel.org/r/20250424114829.9055-1-wahrenst@gmx.net Signed-off-by: Vinod Koul <vkoul@kernel.org>
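A minimal sketch of the shared-interrupt convention this commit relies on, assuming a simplified channel with a single pending-status word. The `demo_*` names are hypothetical and this is not the fsl-edma driver code: a handler invoked for a channel with nothing pending reports "not mine" so that the other handler on the shared line gets its turn.

```c
/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum demo_irqreturn {
	DEMO_IRQ_NONE = 0,    /* interrupt was not from this channel */
	DEMO_IRQ_HANDLED = 1, /* interrupt was handled by this channel */
};

/* Hypothetical channel state: a pending flag and a handled counter. */
struct demo_chan {
	unsigned int int_status;
	unsigned int handled;
};

static enum demo_irqreturn demo_tx_handler(struct demo_chan *chan)
{
	/* Called for the "wrong" channel: nothing pending here. */
	if (!chan->int_status)
		return DEMO_IRQ_NONE;

	/* Acknowledge and account for the interrupt. */
	chan->int_status = 0;
	chan->handled++;
	return DEMO_IRQ_HANDLED;
}
```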
2025-05-14  dmaengine: mediatek: Fix a possible deadlock error in mtk_cqdma_tx_status()  Qiu-ji Chen
Fix a potential deadlock bug. Observe that in the mtk-cqdma.c file, functions like mtk_cqdma_issue_pending() and mtk_cqdma_free_active_desc() properly acquire the pc lock before the vc lock when handling pc and vc fields. However, mtk_cqdma_tx_status() violates this order by first acquiring the vc lock before invoking mtk_cqdma_find_active_desc(), which subsequently takes the pc lock. This reversed locking sequence (vc → pc) contradicts the established pc → vc order and creates deadlock risks. Fix the issue by moving the vc lock acquisition code from mtk_cqdma_find_active_desc() to mtk_cqdma_tx_status(). Ensure the pc lock is acquired before the vc lock in the calling function to maintain correct locking hierarchy. Note that since mtk_cqdma_find_active_desc() is a static function with only one caller (mtk_cqdma_tx_status()), this modification safely eliminates the deadlock possibility without affecting other components. This possible bug is found by an experimental static analysis tool developed by our team. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including deadlocks, data races and atomicity violations. Fixes: b1f01e48df5a ("dmaengine: mediatek: Add MediaTek Command-Queue DMA controller for MT6765 SoC") Cc: stable@vger.kernel.org Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250508073634.3719-1-chenqiuji666@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
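The lock-ordering rule described in this commit (pc lock first, then vc lock, with the helper no longer taking a lock itself) can be sketched in user space with POSIX mutexes. The `demo_*` types and fields are hypothetical and much simpler than the mtk-cqdma structures; the kernel code uses spinlocks rather than pthread mutexes.

```c
#include <pthread.h>

/* Hypothetical physical and virtual channel stand-ins. */
struct demo_pchan {
	pthread_mutex_t lock;
	int active_cookie;
};

struct demo_vchan {
	pthread_mutex_t lock;
	struct demo_pchan *pc;
	int issued_cookie;
};

/* Caller must already hold pc->lock and vc->lock, in that order. */
static int demo_find_active_desc(struct demo_vchan *vc, int cookie)
{
	return vc->pc->active_cookie == cookie ||
	       vc->issued_cookie == cookie;
}

static int demo_tx_status(struct demo_vchan *vc, int cookie)
{
	int found;

	/* Same pc -> vc order as every other path that takes both locks. */
	pthread_mutex_lock(&vc->pc->lock);
	pthread_mutex_lock(&vc->lock);
	found = demo_find_active_desc(vc, cookie);
	pthread_mutex_unlock(&vc->lock);
	pthread_mutex_unlock(&vc->pc->lock);

	return found;
}
```

Keeping all lock acquisition in the single caller makes the order visible in one place, which is the design choice the commit message describes.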
2025-05-14  dmaengine: idxd: Fix ->poll() return value  Dave Jiang
The fix to block access from different address space did not return a correct value for ->poll() change. kernel test bot reported that a return value of type __poll_t is expected rather than int. Fix to return POLLNVAL to indicate invalid request. Fixes: 8dfa57aabff6 ("dmaengine: idxd: Fix allowing write() from different address spaces") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505081851.rwD7jVxg-lkp@intel.com/ Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250508170548.2747425-1-dave.jiang@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: Refactor remove call with idxd_cleanup() helper  Shuai Xue
The idxd_cleanup() helper cleans up perfmon, interrupts, internals and so on. Refactor remove call with the idxd_cleanup() helper to avoid code duplication. Note, this also fixes the missing put_device() for idxd groups, engines and wqs. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250404120217.48772-10-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call  Shuai Xue
The remove call stack is missing idxd cleanup to free bitmap, ida and the idxd_device. Call idxd_free() helper routines to make sure we exit gracefully. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250404120217.48772-9-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe  Shuai Xue
Memory allocated for idxd is not freed if an error occurs during idxd_pci_probe(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/r/20250404120217.48772-8-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: fix memory leak in error handling path of idxd_alloc  Shuai Xue
Memory allocated for idxd is not freed if an error occurs during idxd_alloc(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: a8563a33a5e2 ("dmanegine: idxd: reformat opcap output to match bitmap_parse() input") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/r/20250404120217.48772-7-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
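The "free in the reverse order of allocation" pattern used throughout this idxd series can be sketched as follows. This is a generic user-space illustration with hypothetical `demo_*` names and a `fail_step` parameter added only to simulate allocation failures; it is not the idxd_alloc() code.

```c
#include <stdlib.h>

/* Hypothetical device with two dependent allocations. */
struct demo_dev {
	int *bitmap;
	int *ids;
};

/*
 * fail_step simulates a failure: 1 fails the bitmap allocation,
 * 2 fails the ids allocation, 0 lets everything succeed.
 */
static struct demo_dev *demo_alloc(int fail_step)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;

	dev->bitmap = (fail_step == 1) ? NULL : calloc(8, sizeof(int));
	if (!dev->bitmap)
		goto err_free_dev;

	dev->ids = (fail_step == 2) ? NULL : calloc(8, sizeof(int));
	if (!dev->ids)
		goto err_free_bitmap;

	return dev;

	/* Unwind in the reverse order of allocation. */
err_free_bitmap:
	free(dev->bitmap);
err_free_dev:
	free(dev);
	return NULL;
}

static void demo_free(struct demo_dev *dev)
{
	free(dev->ids);
	free(dev->bitmap);
	free(dev);
}
```

Each label undoes exactly the steps that succeeded before the failure, so no path returns with memory still held.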
2025-05-14  dmaengine: idxd: Add missing cleanups in cleanup internals  Shuai Xue
The idxd_cleanup_internals() function only decreases the reference count of groups, engines, and wqs but is missing the step to release memory resources. To fix this, use the cleanup helper to properly release the memory resources. Fixes: ddf742d4f3f1 ("dmaengine: idxd: Add missing cleanup for early error out in probe call") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250404120217.48772-6-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals  Shuai Xue
The idxd_setup_internals() is missing some cleanup when things fail in the middle. Add the appropriate cleanup routines (cleanup groups, cleanup engines, cleanup wqs) to make sure it exits gracefully. Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime") Cc: stable@vger.kernel.org Suggested-by: Fenghua Yu <fenghuay@nvidia.com> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250404120217.48772-5-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups  Shuai Xue
Memory allocated for groups is not freed if an error occurs during idxd_setup_groups(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/r/20250404120217.48772-4-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines  Shuai Xue
Memory allocated for engines is not freed if an error occurs during idxd_setup_engines(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: 75b911309060 ("dmaengine: idxd: fix engine conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/r/20250404120217.48772-3-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs  Shuai Xue
Memory allocated for wqs is not freed if an error occurs during idxd_setup_wqs(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: 7c5dd23e57c1 ("dmaengine: idxd: fix wq conf_dev 'struct device' lifetime") Fixes: 700af3a0a26c ("dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev") Fixes: de5819b99489 ("dmaengine: idxd: track enabled workqueues in bitmap") Fixes: b0325aefd398 ("dmaengine: idxd: add WQ operation cap restriction support") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Link: https://lore.kernel.org/r/20250404120217.48772-2-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  xfs: Fix comment on xfs_trans_ail_update_bulk()  Carlos Maiolino
This function doesn't take the AIL lock, but should be called with the AIL lock held. Also (hopefully) simplify the comment. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14  xfs: Fix a comment on xfs_ail_delete  Carlos Maiolino
It doesn't return anything. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14  xfs: Fail remount with noattr2 on a v5 with v4 enabled  Nirjhar Roy (IBM)
Bug: When we compile the kernel with CONFIG_XFS_SUPPORT_V4=y, remount with "-o remount,noattr2" on a v5 XFS does not fail explicitly.

Reproduction:
  mkfs.xfs -f /dev/loop0
  mount /dev/loop0 /mnt/scratch
  mount -o remount,noattr2 /dev/loop0 /mnt/scratch

However, with CONFIG_XFS_SUPPORT_V4=n, the remount correctly fails explicitly. This is because of the way the following 2 functions are defined:

  static inline bool
  xfs_has_attr2 (struct xfs_mount *mp)
  {
          return !IS_ENABLED(CONFIG_XFS_SUPPORT_V4) ||
                  (mp->m_features & XFS_FEAT_ATTR2);
  }

  static inline bool
  xfs_has_noattr2 (const struct xfs_mount *mp)
  {
          return mp->m_features & XFS_FEAT_NOATTR2;
  }

xfs_has_attr2() returns true when CONFIG_XFS_SUPPORT_V4=n and hence, the following if condition in xfs_fs_validate_params() succeeds and returns -EINVAL:

  /*
   * We have not read the superblock at this point, so only the attr2
   * mount option can set the attr2 feature by this stage.
   */
  if (xfs_has_attr2(mp) && xfs_has_noattr2(mp)) {
          xfs_warn(mp, "attr2 and noattr2 cannot both be specified.");
          return -EINVAL;
  }

With CONFIG_XFS_SUPPORT_V4=y, xfs_has_attr2() always returns false and hence no error is returned.

Fix: Check if the existing mount has crc enabled (i.e., of type v5 and has attr2 enabled) and the remount has noattr2; if yes, return -EINVAL.

I have tested xfs/{189,539} in fstests with v4 and v5 XFS with both CONFIG_XFS_SUPPORT_V4=y/n and they both behave as expected. This patch also fixes remount from noattr2 -> attr2 (on a v4 xfs).

Related discussion in [1]
[1] https://lore.kernel.org/all/Z65o6nWxT00MaUrW@dread.disaster.area/

Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
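The config-dependent behaviour described in this commit can be reproduced with a small user-space model. The `demo_*` helpers and `DEMO_FEAT_*` bits are hypothetical; the build-time option is modelled as a runtime `v4_support` flag so both configurations can be compared side by side.

```c
#include <stdbool.h>

/* Hypothetical feature bits standing in for XFS_FEAT_ATTR2/NOATTR2. */
#define DEMO_FEAT_ATTR2   (1u << 0)
#define DEMO_FEAT_NOATTR2 (1u << 1)

/* Mirrors the shape of xfs_has_attr2(), with the config as a parameter. */
static bool demo_has_attr2(bool v4_support, unsigned int features)
{
	return !v4_support || (features & DEMO_FEAT_ATTR2) != 0;
}

static bool demo_has_noattr2(unsigned int features)
{
	return (features & DEMO_FEAT_NOATTR2) != 0;
}

/* True when the option validation would reject the remount. */
static bool demo_validate_rejects(bool v4_support, unsigned int features)
{
	return demo_has_attr2(v4_support, features) &&
	       demo_has_noattr2(features);
}
```

With only the noattr2 bit set, the model rejects the remount when v4 support is disabled and accepts it when v4 support is enabled, which is the asymmetry the fix addresses.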
2025-05-14  xfs: fix zoned GC data corruption due to wrong bv_offset  Christoph Hellwig
xfs_zone_gc_write_chunk writes out the data buffer read in earlier using the same bio, and currently looks at bv_offset for the offset into the scratch folio for that. But commit 26064d3e2b4d ("block: fix adding folio to bio") changed how bv_page and bv_offset are calculated for adding larger folios, breaking this fragile logic. Switch to extracting the full physical address from the old bio_vec, and calculate the offset into the folio from that instead. This fixes data corruption during garbage collection with heavy rocksdb workloads. Thanks to Hans for tracking down the culprit commit during long bisection sessions. Fixes: 26064d3e2b4d ("block: fix adding folio to bio") Fixes: 080d01c41d44 ("xfs: implement zoned garbage collection") Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <Hans.Holmberg@wdc.com> Tested-by: Hans Holmberg <Hans.Holmberg@wdc.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14  xfs: free up mp->m_free[0].count in error case  Wengang Wang
In xfs_init_percpu_counters(), memory for mp->m_free[0].count wasn't freed in error case. Free it up in this patch. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Fixes: 712bae96631852 ("xfs: generalize the freespace and reserved blocks handling") Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14  dma-buf: insert memory barrier before updating num_fences  Hyejeong Choi
smp_store_mb() inserts a memory barrier after the store operation. This differs from what the comment originally intended, so a NULL pointer dereference can happen if the memory update is reordered. Signed-off-by: Hyejeong Choi <hjeong.choi@samsung.com> Fixes: a590d0fdbaa5 ("dma-buf: Update reservation shared_count after adding the new fence") CC: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250513020638.GA2329653@au1-maretx-p37.eng.sarc.samsung.com Signed-off-by: Christian König <christian.koenig@amd.com>
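The ordering requirement (publish the slot contents before publishing the count) can be sketched with C11 atomics. This is a user-space analogue using release/acquire ordering, assumed here as a substitute for the kernel's barrier primitives; the `demo_*` names are hypothetical and this is not the dma-buf code.

```c
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_MAX_FENCES 4

/* Hypothetical fence list: a slot table plus a published count. */
struct demo_fence_list {
	const char *table[DEMO_MAX_FENCES];
	atomic_uint num_fences;
};

/* Single writer: fill the slot first, then publish the new count. */
static int demo_add_fence(struct demo_fence_list *list, const char *fence)
{
	unsigned int n = atomic_load_explicit(&list->num_fences,
					      memory_order_relaxed);

	if (n >= DEMO_MAX_FENCES)
		return -1;

	list->table[n] = fence;
	/* Release: the slot write above is ordered before the count. */
	atomic_store_explicit(&list->num_fences, n + 1,
			      memory_order_release);
	return 0;
}

/* Reader: a count of n guarantees slots 0..n-1 are populated. */
static const char *demo_last_fence(struct demo_fence_list *list)
{
	unsigned int n = atomic_load_explicit(&list->num_fences,
					      memory_order_acquire);

	return n ? list->table[n - 1] : NULL;
}
```

A barrier placed after the count store would not give the reader this guarantee, which is the mismatch the commit message points out.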
2025-05-14  nvme: all namespaces in a subsystem must adhere to a common atomic write size  Alan Adamson
The first namespace configured in a subsystem sets the subsystem's atomic write size based on its AWUPF or NAWUPF. Subsequent namespaces must have an atomic write size (per their AWUPF or NAWUPF) less than or equal to the subsystem's atomic write size, or their probing will be rejected. Signed-off-by: Alan Adamson <alan.adamson@oracle.com> [hch: fold in review comments from John Garry] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com>
2025-05-14  io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo()  Jens Axboe
Not everything requires locking in there, which is why the 'has_lock' variable exists. But enough does that it's a bit unwieldy to manage. Wrap the whole thing in a ->uring_lock trylock, and just return with no output if we fail to grab it. The existing trylock() will already have greatly diminished utility/output for the failure case. This fixes an issue with reading the SQE fields, if the ring is being actively resized at the same time. Reported-by: Jann Horn <jannh@google.com> Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Signed-off-by: Jens Axboe <axboe@kernel.dk>
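The "wrap everything in a trylock and emit nothing on failure" approach can be sketched in user space with a POSIX mutex. The `demo_ring` type, its field, and the output format are hypothetical; the kernel code uses its own mutex API and seq_file output rather than snprintf().

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical ring context protected by a single lock. */
struct demo_ring {
	pthread_mutex_t lock;
	unsigned int sq_entries;
};

/*
 * Returns the number of characters formatted, or 0 when the lock is
 * contended and no output is produced at all.
 */
static int demo_show_fdinfo(struct demo_ring *ring, char *buf, size_t len)
{
	int n;

	/* Never block a reader of fdinfo; give up if the lock is busy. */
	if (pthread_mutex_trylock(&ring->lock) != 0)
		return 0;

	n = snprintf(buf, len, "SqEntries:\t%u\n", ring->sq_entries);
	pthread_mutex_unlock(&ring->lock);
	return n;
}
```

Taking the lock once around the whole function removes the need to track, field by field, which reads are safe without it.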
2025-05-14  brd: avoid extra xarray lookups on first write  Christoph Hellwig
The xarray can return the previous entry at a location. Use this fact to simplify the brd code when there is no existing page at a location. This also slightly improves the handling of racy discards as we now always have a page under RCU protection by the time we are ready to copy the data. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250507060700.3929430-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-14  block: Remove obsolete configs BLK_MQ_{PCI,VIRTIO}  Lukas Bulwahn
Commit 9bc1e897a821 ("blk-mq: remove unused queue mapping helpers") makes the two config options, BLK_MQ_PCI and BLK_MQ_VIRTIO, have no remaining effect. Remove the two obsolete config options. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250514065513.463941-1-lukas.bulwahn@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-14  phy: Fix error handling in tegra_xusb_port_init  Ma Ke
If device_add() fails, do not use device_unregister() for error handling. device_unregister() consists of two functions: device_del() and put_device(). device_unregister() should only be called after device_add() succeeded because device_del() undoes what device_add() does if successful. Change device_unregister() to a put_device() call before returning from the function. As the comment on device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'. Found by code review. Cc: stable@vger.kernel.org Fixes: 53d2a715c240 ("phy: Add Tegra XUSB pad controller support") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20250303072739.3874987-1-make24@iscas.ac.cn Signed-off-by: Vinod Koul <vkoul@kernel.org>
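The rule quoted from the device_add() comment can be modelled with a toy reference-counted object. Everything here is a hypothetical mock (`demo_device`, the `fail` parameter, and the `demo_*` functions only mimic the shape of the driver-core calls); it illustrates why a failed add is paired with a put alone and never with a del.

```c
/* Hypothetical mock of a reference-counted device. */
struct demo_device {
	int refcount;   /* references held on the object */
	int registered; /* set only by a successful add */
};

static void demo_device_initialize(struct demo_device *dev)
{
	dev->refcount = 1;
	dev->registered = 0;
}

/* 'fail' simulates an error inside the add step. */
static int demo_device_add(struct demo_device *dev, int fail)
{
	if (fail)
		return -1;
	dev->registered = 1;
	return 0;
}

static void demo_put_device(struct demo_device *dev)
{
	dev->refcount--;
}

static int demo_port_init(struct demo_device *dev, int fail)
{
	demo_device_initialize(dev);

	if (demo_device_add(dev, fail)) {
		/* Add failed: nothing to undo, only drop the reference. */
		demo_put_device(dev);
		return -1;
	}

	return 0;
}
```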
2025-05-14  phy: renesas: rcar-gen3-usb2: Set timing registers only once  Claudiu Beznea
phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common to all PHYs. There is no need to set them every time a PHY is initialized. Set timing register only when the 1st PHY is initialized. Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20250507125032.565017-6-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  phy: renesas: rcar-gen3-usb2: Assert PLL reset on PHY power off  Claudiu Beznea
Assert PLL reset on PHY power off. This saves power. Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20250507125032.565017-5-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  phy: renesas: rcar-gen3-usb2: Lock around hardware registers and driver data  Claudiu Beznea
The phy-rcar-gen3-usb2 driver exposes four individual PHYs that are requested and configured by PHY users. The struct phy_ops APIs access the same set of registers to configure all PHYs. Additionally, PHY settings can be modified through sysfs or an IRQ handler. While some struct phy_ops APIs are protected by a driver-wide mutex, others rely on individual PHY-specific mutexes. This approach can lead to various issues, including:
1/ the IRQ handler may interrupt PHY settings in progress, racing with hardware configuration protected by a mutex lock
2/ due to msleep(20) in rcar_gen3_init_otg(), while a configuration thread suspends to wait for the delay, another thread may try to configure another PHY (with phy_init() + phy_power_on()); re-running the phy_init() goes to the exact same configuration code, re-running the same hardware configuration on the same set of registers (and bits) which might impact the result of the msleep for the 1st configuring thread
3/ sysfs can configure the hardware (through role_store()) and it can still race with the phy_init()/phy_power_on() APIs calling into the driver's struct phy_ops
To address these issues, add a spinlock to protect hardware register access and driver private data structures (e.g., calls to rcar_gen3_is_any_rphy_initialized()). Checking driver-specific data remains necessary as all PHY instances share common settings. With this change, the existing mutex protection is removed and the cleanup.h helpers are used. While at it, to keep the code simpler, do not skip regulator_enable()/regulator_disable() APIs in rcar_gen3_phy_usb2_power_on()/rcar_gen3_phy_usb2_power_off() as the regulators enable/disable operations are reference counted anyway.
Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20250507125032.565017-4-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  phy: renesas: rcar-gen3-usb2: Move IRQ request in probe  Claudiu Beznea
Commit 08b0ad375ca6 ("phy: renesas: rcar-gen3-usb2: move IRQ registration to init") moved the IRQ request operation from probe to struct phy_ops::phy_init API to avoid triggering interrupts (which lead to register accesses) while the PHY clocks (enabled through runtime PM APIs) are not active. If this happens, it results in a synchronous abort. One way to reproduce this issue is by enabling CONFIG_DEBUG_SHIRQ, which calls free_irq() on driver removal. Move the IRQ request and free operations back to probe, and take the runtime PM state into account in IRQ handler. This commit is preparatory for the subsequent fixes in this series. Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20250507125032.565017-3-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind  Claudiu Beznea
It has been observed on the Renesas RZ/G3S SoC that unbinding and binding the PHY driver leads to role autodetection failures. This issue occurs when PHY 3 is the first initialized PHY. PHY 3 does not have an interrupt associated with the USB2_INT_ENABLE register (as rcar_gen3_int_enable[3] = 0). As a result, rcar_gen3_init_otg() is called to initialize OTG without enabling PHY interrupts. To resolve this, add rcar_gen3_is_any_otg_rphy_initialized() and call it in role_store(), role_show(), and rcar_gen3_init_otg(). At the same time, rcar_gen3_init_otg() is only called when initialization for a PHY with interrupt bits is in progress. As a result, the struct rcar_gen3_phy::otg_initialized is no longer needed. Fixes: 549b6b55b005 ("phy: renesas: rcar-gen3-usb2: enable/disable independent irqs") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20250507125032.565017-2-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  phy: tegra: xusb: remove a stray unlock  Dan Carpenter
We used to take a lock in tegra186_utmi_bias_pad_power_on() but now we have moved the lock into the caller. Unfortunately, when we moved the lock this unlock was left behind and it results in a double unlock. Delete it now. Fixes: b47158fb4295 ("phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/aAjmR6To4EnvRl4G@stanley.mountain Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-05-14  sched,livepatch: Untangle cond_resched() and live-patching  Peter Zijlstra
With the goal of deprecating / removing VOLUNTARY preempt, live-patch needs to stop relying on cond_resched() to make forward progress. Instead, rely on schedule() with TASK_FREEZABLE set. Just like live-patching, the freezer needs to be able to stop tasks in a safe / known state. [bigeasy: use likely() in __klp_sched_try_switch() and update comments] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Tested-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20250509113659.wkP_HJ5z@linutronix.de
2025-05-14  objtool: Speed up SHT_GROUP reindexing  Josh Poimboeuf
After elf_update_group_sh_info() was introduced, a prototype version of "objtool klp diff" went from taking ~1s to several minutes, due to looping almost endlessly in elf_update_group_sh_info() while creating thousands of local symbols in a file with thousands of sections. Dramatically improve the performance by marking all symbols' correlated SHT_GROUP sections while reading the object. That way there's no need to search for it every time a symbol gets reindexed. Fixes: 2cb291596e2c ("objtool: Fix up st_info in COMDAT group section") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Rong Xu <xur@google.com> Link: https://lkml.kernel.org/r/2a33e583c87e3283706f346f9d59aac20653b7fd.1746662991.git.jpoimboe@kernel.org
2025-05-14  arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs  Gabor Juhos
The two alarm LEDs on the uDPU board stopped working since commit 78efa53e715e ("leds: Init leds class earlier"). The LEDs are driven by the GPIO{15,16} pins of the North Bridge GPIO controller. These pins are part of the 'spi_quad' pin group for which the 'spi' function is selected via the default pinctrl state of the 'spi' node. This is wrong however, since in order to allow controlling the LEDs, the pins should use the 'gpio' function. Before the commit mentioned above, the 'spi' function was selected first by the pinctrl core before probing the spi driver, but then it got overridden to 'gpio' implicitly via the devm_gpiod_get_index_optional() call from the 'leds-gpio' driver. After the commit, the LED subsystem gets initialized before the SPI subsystem, so the function of the pin group remains 'spi' which in turn prevents controlling of the LEDs. Despite the change of the initialization order, the root cause is that the pinctrl state definition has been wrong since its initial commit 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board"). To fix the problem, override the function in the 'spi_quad_pins' node to 'gpio' and move the pinctrl state definition from the 'spi' node into the 'leds' node. Cc: stable@vger.kernel.org # needs adjustment for < 6.1 Fixes: 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board") Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> Signed-off-by: Imre Kaloz <kaloz@openwrt.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2025-05-14xfs: remove the EXPERIMENTAL warning for pNFSChristoph Hellwig
The pNFS layout support has been around for 10 years without major issues, drop the EXPERIMENTAL warning. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14xfs: remove some EXPERIMENTAL warningsDarrick J. Wong
Online fsck was finished a year ago, in Linux 6.10. The exchange-range syscall and parent pointers were merged in the same cycle. None of these have encountered any serious errors in the year that they've been in the kernel (or the many many years they've been under development) so let's drop the shouty warnings. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14Merge branch 'atomic_writes-6.16' into xfs-6.16-mergeCarlos Maiolino
Required update due to conflict with patch: xfs: stop using set_blocksize Conflicts: fs/xfs/xfs_buf.c Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14irqchip/gic-v3-its: Use allocation size from the prepare callMarc Zyngier
Now that .msi_prepare() gets called at the right time and not with semi-random parameters, remove the ugly hack that tried to fix up the number of allocated vectors. It is now correct by construction. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-6-maz@kernel.org
2025-05-14genirq/msi: Engage the .msi_teardown() callback on domain removalMarc Zyngier
Kindly inform the MSI driver that the domain is torn down, providing the allocation context previously populated on domain creation. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-5-maz@kernel.org
2025-05-14genirq/msi: Move prepare() call to per-device allocationMarc Zyngier
The current device MSI infrastructure is subtly broken, as it will issue an .msi_prepare() callback into the MSI controller driver every time it needs to allocate an MSI. That's pretty wrong, as the contract (or unwarranted assumption, depending on who you ask) between the MSI controller and the core code is that .msi_prepare() is called exactly once per device. This leads to some subtle breakage in some MSI controller drivers, as it gives the impression that there are multiple endpoints sharing a bus identifier (RID in PCI parlance, DID for GICv3+). It implies that whatever allocation the ITS driver (for example) has done on behalf of these devices cannot be undone, as there is no way to track the shared state. This is particularly bad for wire-MSI devices, for which .msi_prepare() is called for each input line. To address this issue, move the call to .msi_prepare() to take place at the point of irq domain allocation, which is the only place that makes sense. The msi_alloc_info_t structure is made part of the msi_domain_template, so that its life-cycle is that of the domain as well. Finally, the msi_info::alloc_data field is made to point at this allocation tracking structure, ensuring that it is carried around the block. This is all pretty straightforward, except for the non-device-MSI leftovers, which still have to call .msi_prepare() at the old spot. One day... Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-4-maz@kernel.org
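The intended contract (one .msi_prepare() per device, paired with one .msi_teardown() when the domain goes away) can be illustrated with a small Python model. This is only a sketch of the lifecycle under stated assumptions, not kernel code: `CountingController`, `DeviceDomain`, and their methods are hypothetical names, and the dictionary merely stands in for msi_alloc_info_t.

```python
class CountingController:
    """Hypothetical MSI controller model that counts callback invocations."""

    def __init__(self):
        self.prepare_calls = 0
        self.teardown_calls = 0

    def msi_prepare(self, dev, nvec):
        self.prepare_calls += 1
        # Stand-in for the per-device allocation context.
        return {"dev": dev, "nvec": nvec}

    def msi_teardown(self, alloc_info):
        self.teardown_calls += 1


class DeviceDomain:
    """Hypothetical per-device domain owning the allocation context."""

    def __init__(self, ctrl, dev, nvec):
        self.ctrl = ctrl
        # prepare runs once, when the domain is created, and its result
        # lives exactly as long as the domain does.
        self.alloc_data = ctrl.msi_prepare(dev, nvec)
        self.allocated = []

    def alloc_irq(self):
        # Individual allocations reuse alloc_data instead of calling
        # msi_prepare() again.
        irq = len(self.allocated)
        self.allocated.append(irq)
        return irq

    def remove(self):
        # Domain removal hands the saved context back to the controller.
        self.ctrl.msi_teardown(self.alloc_data)
        self.allocated.clear()
```

In this model, allocating any number of interrupts from one `DeviceDomain` leaves `prepare_calls` at 1, which is the "exactly once per device" behaviour the message argues for.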
2025-05-14irqchip/gic-v3-its: Implement .msi_teardown() callbackMarc Zyngier
The ITS driver currently nukes the structure representing an endpoint device translating via an ITS on freeing the last LPI allocated for it. That's an unfortunate state of affairs, as it is pretty common for a driver to allocate a single MSI, do something clever, tear down this MSI, and reallocate a whole bunch of them. The NVME driver does exactly that, amongst others. What happens in that case is that the core code is accidentally issuing another .msi_prepare() call, even if it shouldn't. This luckily cancels the above behaviour and hides the problem. In order to fix the core code, start by implementing the new .msi_teardown() callback. Nothing calls it yet, so a side effect is that the its_dev structure will not be freed and that the DID will stay mapped. Not a big deal, and this will be solved in following patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-3-maz@kernel.org
2025-05-14genirq/msi: Add .msi_teardown() callback as the reverse of .msi_prepare()Marc Zyngier
While the MSI ops do have a .msi_prepare() callback that is responsible for setting up the relevant (usually per-device) allocation, there is no callback reversing this setup. For this purpose, add .msi_teardown() callback. In order to avoid breaking the ITS driver that suffers from related issues, do not call the callback just yet. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-2-maz@kernel.org
2025-05-14xfs: Remove deprecated xfs_bufd sysctl parametersZizhi Wo
Commit 64af7a6ea5a4 ("xfs: remove deprecated sysctls") removed the deprecated xfsbufd-related sysctl interface, but forgot to delete the corresponding parameters: "xfs_buf_timer" and "xfs_buf_age". This patch removes those parameters and makes no other changes. Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14xfs: stop using set_blocksizeDarrick J. Wong
XFS has its own buffer cache for metadata that uses submit_bio, which means that it no longer uses the block device pagecache for anything. Create a more lightweight helper that runs the blocksize checks and flushes dirty data and use that instead. No more truncating the pagecache because XFS does not use it or care about it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-05-14Merge branch 'block-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block into xfs-6.16-mergeCarlos Maiolino
Merging block tree into XFS because of some dependencies like bdev_validate_blocksize() Signed-off-by: Carlos Maiolino <cem@kernel.org>