summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-20net: enetc: fix the error handling in enetc4_pf_netdev_create()Wei Fang
Fix the handling of err_wq_init and err_reg_netdev paths in enetc4_pf_netdev_create() function. Fixes: 6c5bafba347b ("net: enetc: add MAC filtering for i.MX95 ENETC PF") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250516052734.3624191-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-20Merge branch 'libbpf-support-multi-split-btf'Andrii Nakryiko
Alan Maguire says: ==================== libbpf: support multi-split BTF In discussing handling of inlines in BTF [1], one area which we may need support for in the future is multiple split BTF, where split BTF sits atop another split BTF which sits atop base BTF. This two-patch series fixes one issue discovered when testing multi-split BTF and extends the split BTF test to cover multi-split BTF also. [1] https://lore.kernel.org/dwarves/20250416-btf_inline-v1-0-e4bd2f8adae5@meta.com/ ==================== Link: https://patch.msgid.link/20250519165935.261614-1-alan.maguire@oracle.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-05-20selftests/bpf: Test multi-split BTFAlan Maguire
Extend split BTF test to cover case where we create split BTF on top of existing split BTF and add info to it; ensure that such BTF can be created and handled by searching within it, dumping/comparing to expected. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250519165935.261614-3-alan.maguire@oracle.com
2025-05-20libbpf/btf: Fix string handling to support multi-split BTFAlan Maguire
libbpf handling of split BTF has been written largely with the assumption that multiple splits are possible, i.e. split BTF on top of split BTF on top of base BTF. One area where this does not quite work is string handling in split BTF; the start string offset should be the base BTF string section length + the base BTF string offset. This worked in the past because for a single split BTF with base the start string offset was always 0. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250519165935.261614-2-alan.maguire@oracle.com
2025-05-21crypto: powerpc/poly1305 - add depends on BROKEN for nowEric Biggers
As discussed in the thread containing https://lore.kernel.org/linux-crypto/20250510053308.GB505731@sol/, the Power10-optimized Poly1305 code is currently not safe to call in softirq context. Disable it for now. It can be re-enabled once it is fixed. Fixes: ba8f8624fde2 ("crypto: poly1305-p10 - Glue code for optmized Poly1305 implementation for ppc64le") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-21Revert "crypto: powerpc/poly1305 - Add SIMD fallback"Herbert Xu
This reverts commit c66d7ebbe2fa14e41913adb421090a7426f59786. Link: https://lore.kernel.org/all/02c22854-eebf-4ad1-b89e-8c2b65ab8236@csgroup.eu/ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-20pinctrl: qcom: switch to devm_register_sys_off_handler()Dmitry Baryshkov
Error-handling paths in msm_pinctrl_probe() don't call a function required to unroll restart handler registration, unregister_restart_handler(). Instead of adding calls to this function, switch the msm pinctrl code into using devm_register_sys_off_handler(). Fixes: cf1fc1876289 ("pinctrl: qcom: use restart_notifier mechanism for ps_hold") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-2-249999af0fc1@oss.qualcomm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-20gpiolib: don't crash on enabling GPIO HOG pinsDmitry Baryshkov
On Qualcomm platforms if the board uses GPIO hogs msm_pinmux_request() calls gpiochip_line_is_valid(). After commit 8015443e24e7 ("gpio: Hide valid_mask from direct assignments") gpiochip_line_is_valid() uses gc->gpiodev, which is NULL when GPIO hog pins are being processed. Thus after this commit using GPIO hogs causes the following crash. In order to fix this, verify that gc->gpiodev is not NULL. Note: it is not possible to reorder calls (e.g. by calling msm_gpio_init() before pinctrl registration or by splitting pinctrl_register() into _and_init() and pinctrl_enable() and calling the latter function after msm_gpio_init()) because GPIO chip registration would fail with EPROBE_DEFER if pinctrl is not enabled at the time of registration. pc : gpiochip_line_is_valid+0x4/0x28 lr : msm_pinmux_request+0x24/0x40 sp : ffff8000808eb870 x29: ffff8000808eb870 x28: 0000000000000000 x27: 0000000000000000 x26: 0000000000000000 x25: ffff726240f9d040 x24: 0000000000000000 x23: ffff7262438c0510 x22: 0000000000000080 x21: ffff726243ea7000 x20: ffffab13f2c4e698 x19: 0000000000000080 x18: 00000000ffffffff x17: ffff726242ba6000 x16: 0000000000000100 x15: 0000000000000028 x14: 0000000000000000 x13: 0000000000002948 x12: 0000000000000003 x11: 0000000000000078 x10: 0000000000002948 x9 : ffffab13f50eb5e8 x8 : 0000000003ecb21b x7 : 000000000000002d x6 : 0000000000000b68 x5 : 0000007fffffffff x4 : ffffab13f52f84a8 x3 : ffff8000808eb804 x2 : ffffab13f1de8190 x1 : 0000000000000080 x0 : 0000000000000000 Call trace: gpiochip_line_is_valid+0x4/0x28 (P) pin_request+0x208/0x2c0 pinmux_enable_setting+0xa0/0x2e0 pinctrl_commit_state+0x150/0x26c pinctrl_enable+0x6c/0x2a4 pinctrl_register+0x3c/0xb0 devm_pinctrl_register+0x58/0xa0 msm_pinctrl_probe+0x2a8/0x584 sdm845_pinctrl_probe+0x20/0x88 platform_probe+0x68/0xc0 really_probe+0xbc/0x298 __driver_probe_device+0x78/0x12c driver_probe_device+0x3c/0x160 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x84/0xe0 __device_attach+0x9c/0x188 device_initial_probe+0x14/0x20 bus_probe_device+0xac/0xb0 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x208/0x5e8 worker_thread+0x1b4/0x35c kthread+0x144/0x220 ret_from_fork+0x10/0x20 Code: b5fffba0 17fffff2 9432ec27 f9400400 (f9428800) Fixes: 8015443e24e7 ("gpio: Hide valid_mask from direct assignments") Reported-by: Doug Anderson <dianders@chromium.org> Closes: https://lore.kernel.org/r/CAD=FV=Vg8_ZOLgLoC4WhFPzhVsxXFC19NrF38W6cW_W_3nFjbw@mail.gmail.com Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-1-249999af0fc1@oss.qualcomm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-05-20io_uring/cmd: axe duplicate io_uring_cmd_import_fixed_vec() declarationCaleb Sander Mateos
io_uring_cmd_import_fixed_vec() is declared in both include/linux/io_uring/cmd.h and io_uring/uring_cmd.h. The declarations are identical (if redundant) for CONFIG_IO_URING=y. But if CONFIG_IO_URING=N, include/linux/io_uring/cmd.h declares the function as static inline while io_uring/uring_cmd.h declares it as extern. This causes linker errors if the declaration in io_uring/uring_cmd.h is used. Remove the declaration in io_uring/uring_cmd.h to avoid linker errors and prevent the declarations getting out of sync. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: ef4902752972 ("io_uring/cmd: introduce io_uring_cmd_import_fixed_vec") Link: https://lore.kernel.org/r/20250520193337.1374509-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20selftests/sched_ext: Add test for scx_bpf_select_cpu_and() via test_runAndrea Righi
Update the allowed_cpus selftest to include a check to validate the behavior of scx_bpf_select_cpu_and() when invoked via a BPF test_run call. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20sched_ext: idle: Allow scx_bpf_select_cpu_and() from unlocked contextAndrea Righi
Allow scx_bpf_select_cpu_and() to be used from an unlocked context, in addition to ops.enqueue() or ops.select_cpu(). This enables schedulers, including user-space ones, to implement a consistent idle CPU selection policy and helps reduce code duplication. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20sched_ext: idle: Validate locking correctness in scx_bpf_select_cpu_and()Andrea Righi
Validate locking correctness when accessing p->nr_cpus_allowed and p->cpus_ptr inside scx_bpf_select_cpu_and(): if the rq lock is held, access is safe; otherwise, require that p->pi_lock is held. This allows to catch potential unsafe calls to scx_bpf_select_cpu_and(). Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20sched_ext: Make scx_kf_allowed_if_unlocked() available outside ext.cAndrea Righi
Relocate the scx_kf_allowed_if_unlocked(), so it can be used from other source files (e.g., ext_idle.c). No functional change. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20sched_ext, docs: add labelShashank Balaji
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20selftests: seccomp: Fix "performace" to "performance"Sumanth Gavini
Fix misspelling reported by codespell Signed-off-by: Sumanth Gavini <sumanth.gavini@yahoo.com> Link: https://lore.kernel.org/r/20250517011725.1149510-1-sumanth.gavini@yahoo.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-05-21Merge tag 'nova-next-v6.16-2025-05-20' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/nova into drm-next Nova changes for v6.16 auxiliary: - bus abstractions - implementation for driver registration - add sample driver drm: - implement __drm_dev_alloc() - DRM core infrastructure Rust abstractions - device, driver and registration - DRM IOCTL - DRM File - GEM object - IntoGEMObject rework - generically implement AlwaysRefCounted through IntoGEMObject - refactor unsound from_gem_obj() into as_ref() - refactor into_gem_obj() into as_raw() driver-core: - merge topic/device-context-2025-04-17 from driver-core tree - implement Devres::access() - fix: doctest build under `!CONFIG_PCI` - accessor for Device::parent() - fix: conditionally expect `dead_code` for `parent()` - impl TryFrom<&Device> bus devices (PCI, platform) nova-core: - remove completed Vec extentions from task list - register auxiliary device for nova-drm - derive useful traits for Chipset - add missing GA100 chipset - take &Device<Bound> in Gpu::new() - infrastructure to generate register definitions - fix register layout of NV_PMC_BOOT_0 - move Firmware into own (Rust) module - fix: select AUXILIARY_BUS nova-drm: - initial driver skeleton (depends on drm and auxiliary bus abstractions) - fix: select AUXILIARY_BUS Rust (dependencies): - implement Opaque::zeroed() - implement Revocable::try_access_with() - implement Revocable::access() From: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/aCxAf3RqQAXLDhAj@cassiopeiae
2025-05-20Merge patch series "can: kvaser_pciefd: Fix ISR race conditions"Marc Kleine-Budde
Axel Forsman <axfo@kvaser.com> says: This patch series fixes a couple of race conditions in the kvaser_pciefd driver surfaced by enabling MSI interrupts and the new Kvaser PCIe 8xCAN. Changes since version 2: * Rebase onto linux-can/main to resolve del_timer()/timer_delete() merge conflict. * Reword 2nd commit message slightly. Changes since version 1: * Change type of srb_cmd_reg from "__le32 __iomem *" to "void __iomem *". * Maintain TX FIFO count in driver instead of querying HW. * Stop queue at end of .start_xmit() if full. Link: https://patch.msgid.link/20250520114332.8961-1-axfo@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20can: kvaser_pciefd: Continue parsing DMA buf after dropped RXAxel Forsman
Going bus-off on a channel doing RX could result in dropped packets. As netif_running() gets cleared before the channel abort procedure, the handling of any last RDATA packets would see netif_rx() return non-zero to signal a dropped packet. kvaser_pciefd_read_buffer() dealt with this "error" by breaking out of processing the remaining DMA RX buffer. Only return an error from kvaser_pciefd_read_buffer() due to packet corruption, otherwise handle it internally. Cc: stable@vger.kernel.org Signed-off-by: Axel Forsman <axfo@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20250520114332.8961-4-axfo@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20can: kvaser_pciefd: Fix echo_skb raceAxel Forsman
The functions kvaser_pciefd_start_xmit() and kvaser_pciefd_handle_ack_packet() raced to stop/wake TX queues and get/put echo skbs, as kvaser_pciefd_can->echo_lock was only ever taken when transmitting and KCAN_TX_NR_PACKETS_CURRENT gets decremented prior to handling of ACKs. E.g., this caused the following error: can_put_echo_skb: BUG! echo_skb 5 is occupied! Instead, use the synchronization helpers in netdev_queues.h. As those piggyback on BQL barriers, start updating in-flight packets and bytes counts as well. Cc: stable@vger.kernel.org Signed-off-by: Axel Forsman <axfo@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20250520114332.8961-3-axfo@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20can: kvaser_pciefd: Force IRQ edge in case of nested IRQAxel Forsman
Avoid the driver missing IRQs by temporarily masking IRQs in the ISR to enforce an edge even if a different IRQ is signalled before handled IRQs are cleared. Fixes: 48f827d4f48f ("can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR") Cc: stable@vger.kernel.org Signed-off-by: Axel Forsman <axfo@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20250520114332.8961-2-axfo@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-20ext4: Add a WARN_ON_ONCE for querying LAST_IN_LEAF insteadRitesh Harjani (IBM)
We added the documentation in ext4_map_blocks() for usage of EXT4_GET_BLOCKS_QUERY_LAST_IN_LEAF flag. But It's better to add a WARN_ON_ONCE in case if anyone tries using this flag with CREATE to avoid a random issue later. Since depth can change with CREATE and it needs to be re-calculated before using it in there. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/ee6e82a224c50b432df9ce1ce3333c50182d8473.1747677758.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Simplify flags in ext4_map_query_blocks()Ritesh Harjani (IBM)
Now that we have EXT4_EX_QUERY_FILTER mask, let's use that to simplify the filtering of flags for passing to ext4_ext_map_blocks() in ext4_map_query_blocks() function. This allows us to kill the query_flags local variable which is not needed anymore. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/4ae735e83e6f43341e53e2d289e59156a8360134.1747677758.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Rename and document EXT4_EX_FILTER to EXT4_EX_QUERY_FILTERRitesh Harjani (IBM)
Rename EXT4_EX_FILTER to EXT4_EX_QUERY_FILTER to better describe its purpose as a filter mask used specifically in ext4_map_query_blocks(). Add a comment explaining that this macro is used to filter flags needed when querying the on-disk extent tree. We will later use EXT4_EX_QUERY_FILTER mask to add another EXT4_GET_BLOCKS_QUERY needed to lookup in on-disk extent tree. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/51f05d0ba286372eb8693af95bd4b10194b53141.1747677758.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Simplify last in leaf check in ext4_map_query_blocksRitesh Harjani (IBM)
This simplifies the check for last in leaf in ext4_map_query_blocks() and fixes this cocci warning. cocci warnings: (new ones prefixed by >>) >> fs/ext4/inode.c:573:49-51: WARNING !A || A && B is equivalent to !A || B Fixes: 5bb12b1837c0 ("ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505191524.auftmOwK-lkp@intel.com/ Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/5fd5c806218c83f603c578c95997cf7f6da29d74.1747677758.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Unwritten to written conversion requires EXT4_EX_NOCACHERitesh Harjani (IBM)
This fixes the atomic write patch series after it was rebased on top of extent status cache cleanup series i.e. 'commit 402e38e6b71f57 ("ext4: prevent stale extent cache entries caused by concurrent I/O writeback")' After the above series, EXT4_GET_BLOCKS_IO_CONVERT_EXT flag which has EXT4_GET_BLOCKS_IO_SUBMIT flag set, requires that the io submit context of any kind should pass EXT4_EX_NOCACHE to avoid caching unncecessary extents in the extent status cache. This patch fixes that by adding the EXT4_EX_NOCACHE flag in ext4_convert_unwritten_extents_atomic() for unwritten to written conversion calls to ext4_map_blocks(). Fixes: b86629c2b299 ("ext4: Add multi-fsblock atomic write support with bigalloc") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/ea0ad9378ff6d31d73f4e53f87548e3a20817689.1747677758.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20selftests: ublk: add test for covering UBLK_AUTO_BUF_REG_FALLBACKMing Lei
Add test for covering UBLK_AUTO_BUF_REG_FALLBACK: - pass '--auto_zc_fallback' to null target, which requires both F_AUTO_BUF_REG and F_SUPPORT_ZERO_COPY for handling UBLK_AUTO_BUF_REG_FALLBACK - add ->buf_index() method for returning invalid buffer index to trigger UBLK_AUTO_BUF_REG_FALLBACK - add generic_09 for running the test - add --auto_zc_fallback test in stress_03/stress_04/stress_05 Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20selftests: ublk: support UBLK_F_AUTO_BUF_REGMing Lei
Enable UBLK_F_AUTO_BUF_REG support for ublk utility by argument `--auto_zc`, meantime support this feature in null, loop and stripe target code. Add function test generic_08 for covering basic UBLK_F_AUTO_BUF_REG feature. Also cover UBLK_F_AUTO_BUF_REG in stress_03, stress_04 and stress_05 test too. 'fio/t/io_uring -p0 /dev/ublkb0' shows that F_AUTO_BUF_REG can improve IOPS by 50% compared with F_SUPPORT_ZERO_COPY in my test VM. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20ublk: support UBLK_AUTO_BUF_REG_FALLBACKMing Lei
For UBLK_F_AUTO_BUF_REG, buffer is registered to uring_cmd context automatically with the provided buffer index. User may provide one wrong buffer index, or the specified buffer is registered by application already. Add UBLK_AUTO_BUF_REG_FALLBACK for supporting to auto buffer registering fallback by completing the uring_cmd and telling ublk server the register failure via UBLK_AUTO_BUF_REG_FALLBACK, then ublk server still can register the buffer from userspace. So we can provide reliable way for supporting auto buffer register. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20ublk: register buffer to local io_uring with provided buf index via ↵Ming Lei
UBLK_F_AUTO_BUF_REG Add UBLK_F_AUTO_BUF_REG for supporting to register buffer automatically to local io_uring context with provided buffer index. Add UAPI structure `struct ublk_auto_buf_reg` for holding user parameter to register request buffer automatically, one 'flags' field is defined, and there is still 32bit available for future extension, such as, adding one io_ring FD field for registering buffer to external io_uring. `struct ublk_auto_buf_reg` is populated from ublk uring_cmd's sqe->addr, and all existing ublk commands are data-less, so it is just fine to reuse sqe->addr for this purpose. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20ublk: prepare for supporting to register request buffer automaticallyMing Lei
UBLK_F_SUPPORT_ZERO_COPY requires ublk server to issue explicit buffer register/unregister uring_cmd for each IO, this way is not only inefficient, but also introduce dependency between buffer consumer and buffer register/ unregister uring_cmd, please see tools/testing/selftests/ublk/stripe.c in which backing file IO has to be issued one by one by IOSQE_IO_LINK. Prepare for adding feature UBLK_F_AUTO_BUF_REG for addressing the existing zero copy limitation: - register request buffer automatically to ublk uring_cmd's io_uring context before delivering io command to ublk server - unregister request buffer automatically from the ublk uring_cmd's io_uring context when completing the request - io_uring will unregister the buffer automatically when uring is exiting, so we needn't worry about accident exit For using this feature, ublk server has to create one sparse buffer table Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20ublk: convert to refcount_tMing Lei
Convert to refcount_t and prepare for supporting to register bvec buffer automatically, which needs to initialize reference counter as 2, and kref doesn't provide this interface, so convert to refcount_t. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Suggested-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250520045455.515691-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20selftests: ublk: make IO & device removal test more stressfulMing Lei
__run_io_and_remove() is used in several stress tests for running heavy IO vs. removing device meantime. However, sequential `readwrite` is taken in the fio script, which isn't correct, we should take random IO for saturating ublk device. Also turns out '--num_jobs=4' isn't stressful enough, so change it to '--num_jobs=$(nproc)'. Finally we don't cover single queue test in `test_stress_02.sh`, so add single queue test which can trigger request tag recycling easier. With above change the issue in #1 can be reproduced reliably in stress_02.sh. Link:https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/ #1 Cc: Jared Holzman <jholzman@nvidia.com> Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250519031620.245749-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20Merge tag 'nvme-6.16-2025-05-20' of git://git.infradead.org/nvme into ↵Jens Axboe
for-6.16/block Pull NVMe updates from Christoph: "nvme updates for Linux 6.16 - add per-node DMA pools and use them for PRP/SGL allocations (Caleb Sander Mateos, Keith Busch) - nvme-fcloop refcounting fixes (Daniel Wagner) - support delayed removal of the multipath node and optionally support the multipath node for private namespaces (Nilay Shroff) - support shared CQs in the PCI endpoint target code (Wilfred Mallawa) - support admin-queue only authentication (Hannes Reinecke) - use the crc32c library instead of the crypto API (Eric Biggers) - misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes Reinecke, Leon Romanovsky, Gustavo A. R. Silva)" * tag 'nvme-6.16-2025-05-20' of git://git.infradead.org/nvme: (42 commits) nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk nvme: introduce multipath_always_on module param nvme-multipath: introduce delayed removal of the multipath head node nvme-pci: derive and better document max segments limits nvme-pci: use struct_size for allocation struct nvme_dev nvme-pci: add a symolic name for the small pool size nvme-pci: use a better encoding for small prp pool allocations nvme-pci: rename the descriptor pools nvme-pci: remove struct nvme_descriptor nvme-pci: store aborted state in flags variable nvme-pci: don't try to use SGLs for metadata on the admin queue nvme-pci: make PRP list DMA pools per-NUMA-node nvme-pci: factor out a nvme_init_hctx_common() helper dmapool: add NUMA affinity support nvme-fc: do not reference lsrsp after failure nvmet-fcloop: don't wait for lport cleanup nvmet-fcloop: add missing fcloop_callback_host_done nvmet-fc: take tgtport refs for portentry nvmet-fc: free pending reqs on tgtport unregister nvmet-fcloop: drop response if targetport is gone ...
2025-05-20spi: dt-bindings: Add rk3528-spi compatibleChukun Pan
This adds a compatible string for the SPI controller on RK3528. Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20250520100102.1226725-2-amadeus@jmu.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-20Merge tag 'for-linus-6.15-ofs2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs fix from Mike Marshall: "Fix for orangefs page writeout counting" * tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: adjust counting code to recover from 665575cf
2025-05-20wifi: ath9k_htc: Abort software beacon handling if disabledToke Høiland-Jørgensen
A malicious USB device can send a WMI_SWBA_EVENTID event from an ath9k_htc-managed device before beaconing has been enabled. This causes a device-by-zero error in the driver, leading to either a crash or an out of bounds read. Prevent this by aborting the handling in ath9k_htc_swba() if beacons are not enabled. Reported-by: Robert Morris <rtm@csail.mit.edu> Closes: https://lore.kernel.org/r/88967.1743099372@localhost Fixes: 832f6a18fc2a ("ath9k_htc: Add beacon slots") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20250402112217.58533-1-toke@toke.dk Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-05-20loop: don't require ->write_iter for writable files in loop_configureChristoph Hellwig
Block devices can be opened read-write even if they can't be written to for historic reasons. Remove the check requiring file->f_op->write_iter when the block devices was opened in loop_configure. The call to loop_check_backing_file just below ensures the ->write_iter is present for backing files opened for writing, which is the only check that is actually needed. Fixes: f5c84eff634b ("loop: Add sanity check for read/write_iter") Reported-by: Christian Hesse <mail@eworm.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250520135420.1177312-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-20orangefs: adjust counting code to recover from 665575cfMike Marshall
A late commit to 6.14-rc7! broke orangefs. 665575cf seems like a good change, but maybe should have been introduced during the merge window. This patch adjusts the counting code associated with writing out pages so that orangefs works in a 665575cf world. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2025-05-20wifi: ath12k: remove redundant regulatory rules intersection logic in hostAishwarya R
Whenever there is a change in the country code settings from the user, driver does an intersection of the regulatory rules for this new country with the original regulatory rules which were reported during initialization time. There is also similar logic running in firmware with a difference that the intersection in firmware is only done when the country code is configuration during boot up time (BDF/OTP). Firmware logic does not kick in when no country code is configured during device bring up time as the device is always expected to have the country code configured properly in the deployment. There is a debug/test use case that requires absolute regulatory rules to be used for a user configured country code when the device is not configured with a particular country code during boot up time. To support the above test use case, remove the redundant regulatory rules intersection logic in the host driver. Depend on the intersection logic in firmware when the device comes up with pre-configured country code. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00209-QCAHKSWPL_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Signed-off-by: Aishwarya R <quic_aisr@quicinc.com> Signed-off-by: Rajat Soni <quic_rajson@quicinc.com> Link: https://patch.msgid.link/20250505034351.1365914-1-quic_rajson@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-05-20wifi: ath12k: Send MCS15 support to firmware during peer assocMohan Kumar G
As per IEEE 802.11be-2024 - 9.4.2.321, EHT operation element contains MCS15 Disable subfield as the sixth bit, which is set when MCS15 support is not enabled. During association, firmware will use this MCS15 flag to enable or disable the reception of PPDU with EHT-MCS15 capability. Send MCS15 support to firmware through WMI command during peer assoc. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Co-developed-by: Dhanavandhana Kannan <quic_dhanavan1@quicinc.com> Signed-off-by: Dhanavandhana Kannan <quic_dhanavan1@quicinc.com> Signed-off-by: Mohan Kumar G <quic_mkumarg@quicinc.com> Link: https://patch.msgid.link/20250505153536.3275145-1-quic_mkumarg@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-05-20ext4: only dirty folios when data journaling regular filesBrian Foster
fstest generic/388 occasionally reproduces a crash that looks as follows: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... Call Trace: <TASK> ext4_block_zero_page_range+0x30c/0x380 [ext4] ext4_truncate+0x436/0x440 [ext4] ext4_process_orphan+0x5d/0x110 [ext4] ext4_orphan_cleanup+0x124/0x4f0 [ext4] ext4_fill_super+0x262d/0x3110 [ext4] get_tree_bdev_flags+0x132/0x1d0 vfs_get_tree+0x26/0xd0 vfs_cmd_create+0x59/0xe0 __do_sys_fsconfig+0x4ed/0x6b0 do_syscall_64+0x82/0x170 ... This occurs when processing a symlink inode from the orphan list. The partial block zeroing code in the truncate path calls ext4_dirty_journalled_data() -> folio_mark_dirty(). The latter calls mapping->a_ops->dirty_folio(), but symlink inodes are not assigned an a_ops vector in ext4, hence the crash. To avoid this problem, update the ext4_dirty_journalled_data() helper to only mark the folio dirty on regular files (for which a_ops is assigned). This also matches the journaling logic in the ext4_symlink() creation path, where ext4_handle_dirty_metadata() is called directly. Fixes: d84c9ebdac1e ("ext4: Mark pages with journalled data dirty") Signed-off-by: Brian Foster <bfoster@redhat.com> Link: https://patch.msgid.link/20250516173800.175577-1-bfoster@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org
2025-05-20ext4: Add atomic block write documentationRitesh Harjani (IBM)
Add an initial documentation around atomic writes support in ext4. Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/d3893b9f5ad70317abae72046e81e4c180af91bf.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Enable support for ext4 multi-fsblock atomic write using bigallocRitesh Harjani (IBM)
Last couple of patches added the needed support for multi-fsblock atomic writes using bigalloc. This patch ensures that filesystem advertizes the needed atomic write unit min and max values for enabling multi-fsblock atomic write support with bigalloc. Acked-by: Darrick J. Wong <djwong@kernel.org> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/5e45d7ed24499024b9079436ba6698dae5298e29.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Add multi-fsblock atomic write support with bigallocRitesh Harjani (IBM)
EXT4 supports bigalloc feature which allows the FS to work in size of clusters (group of blocks) rather than individual blocks. This patch adds atomic write support for bigalloc so that systems with bs = ps can also create FS using - mkfs.ext4 -F -O bigalloc -b 4096 -C 16384 <dev> With bigalloc ext4 can support multi-fsblock atomic writes. We will have to adjust ext4's atomic write unit max value to cluster size. This can then support atomic write of size anywhere between [blocksize, clustersize]. This patch adds the required changes to enable multi-fsblock atomic write support using bigalloc in the next patch. In this patch for block allocation: we first query the underlying region of the requested range by calling ext4_map_blocks() call. Here are the various cases which we then handle depending upon the underlying mapping type: 1. If the underlying region for the entire requested range is a mapped extent, then we don't call ext4_map_blocks() to allocate anything. We don't need to even start the jbd2 txn in this case. 2. For an append write case, we create a mapped extent. 3. If the underlying region is entirely a hole, then we create an unwritten extent for the requested range. 4. If the underlying region is a large unwritten extent, then we split the extent into 2 unwritten extent of required size. 5. If the underlying region has any type of mixed mapping, then we call ext4_map_blocks() in a loop to zero out the unwritten and the hole regions within the requested range. This then provide a single mapped extent type mapping for the requested range. Note: We invoke ext4_map_blocks() in a loop with the EXT4_GET_BLOCKS_ZERO flag only when the underlying extent mapping of the requested range is not entirely a hole, an unwritten extent, or a fully mapped extent. That is, if the underlying region contains a mix of hole(s), unwritten extent(s), and mapped extent(s), we use this loop to ensure that all the short mappings are zeroed out. This guarantees that the entire requested range becomes a single, uniformly mapped extent. It is ok to do so because we know this is being done on a bigalloc enabled filesystem where the block bitmap represents the entire cluster unit. Note having a single contiguous underlying region of type mapped, unwrittn or hole is not a problem. But the reason to avoid writing on top of mixed mapping region is because, atomic writes requires all or nothing should get written for the userspace pwritev2 request. So if at any point in time during the write if a crash or a sudden poweroff occurs, the region undergoing atomic write should read either complete old data or complete new data. But it should never have a mix of both old and new data. So, we first convert any mixed mapping region to a single contiguous mapped extent before any data gets written to it. This is because normally FS will only convert unwritten extents to written at the end of the write in ->end_io() call. And if we allow the writes over a mixed mapping and if a sudden power off happens in between, we will end up reading mix of new data (over mapped extents) and old data (over unwritten extents), because unwritten to written conversion never went through. So to avoid this and to avoid writes getting torned due to mixed mapping, we first allocate a single contiguous block mapping and then do the write. Acked-by: Darrick J. Wong <djwong@kernel.org> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/c4965ac3407cbc773f0bc954d0966d9696f5038a.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKSRitesh Harjani (IBM)
There can be a case where there are contiguous extents on the adjacent leaf nodes of on-disk extent trees. So when someone tries to write to this contiguous range, ext4_map_blocks() call will split by returning 1 extent at a time if this is not already cached in extent_status tree cache (where if these extents when cached can get merged since they are contiguous). This is fine for a normal write however in case of atomic writes, it can't afford to break the write into two. Now this is also something that will only happen in the slow write case where we call ext4_map_blocks() for each of these extents spread across different leaf nodes. However, there is no guarantee that these extent status cache cannot be reclaimed before the last call to ext4_map_blocks() in ext4_map_blocks_atomic_write_slow(). Hence this patch adds support of EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS. This flag checks if the requested range can be fully found in extent status cache and return. If not, it looks up in on-disk extent tree via ext4_map_query_blocks(). If the found extent is the last entry in the leaf node, then it goes and queries the next lblk to see if there is an adjacent contiguous extent in the adjacent leaf node of the on-disk extent tree. Even though there can be a case where there are multiple adjacent extent entries spread across multiple leaf nodes. But we only read an adjacent leaf block i.e. in total of 2 extent entries spread across 2 leaf nodes. The reason for this is that we are mostly only going to support atomic writes with upto 64KB or maybe max upto 1MB of atomic write support. Acked-by: Darrick J. Wong <djwong@kernel.org> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/6bb563e661f5fbd80e266a9e6ce6e29178f555f6.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Make ext4_meta_trans_blocks() non-static for later useRitesh Harjani (IBM)
Let's make ext4_meta_trans_blocks() non-static for use in later functions during ->end_io conversion for atomic writes. We will need this function to estimate journal credits for a special case. Instead of adding another wrapper around it, let's make this non-static. Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/23ce80d4286f792831ce99d13558182ee228fedb.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Check if inode uses extents in ext4_inode_can_atomic_write()Ritesh Harjani (IBM)
EXT4 only supports doing atomic write on inodes which uses extents, so add a check in ext4_inode_can_atomic_write() which gets called during open. Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/86bb502c979398a736ab371d8f35f6866a477f6c.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20ext4: Document an edge case for overwritesRitesh Harjani (IBM)
ext4_iomap_overwrite_begin() clears the flag for IOMAP_WRITE before calling ext4_iomap_begin(). Document this above ext4_map_blocks() call as it is easy to miss it when focusing on write paths alone. Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/fd50ba05440042dff77d555e463a620a79f8d0e9.1747337952.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20jbd2: remove journal_t argument from jbd2_superblock_csum()Eric Biggers
Since jbd2_superblock_csum() no longer uses its journal_t argument, remove it. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250513053809.699974-5-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-05-20jbd2: remove journal_t argument from jbd2_chksum()Eric Biggers
Since jbd2_chksum() no longer uses its journal_t argument, remove it. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://patch.msgid.link/20250513053809.699974-4-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>