Age | Commit message (Collapse) | Author |
|
Fix the handling of err_wq_init and err_reg_netdev paths in
enetc4_pf_netdev_create() function.
Fixes: 6c5bafba347b ("net: enetc: add MAC filtering for i.MX95 ENETC PF")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250516052734.3624191-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Alan Maguire says:
====================
libbpf: support multi-split BTF
In discussing handling of inlines in BTF [1], one area which we may need
support for in the future is multiple split BTF, where split BTF sits
atop another split BTF which sits atop base BTF. This two-patch series
fixes one issue discovered when testing multi-split BTF and extends the
split BTF test to cover multi-split BTF also.
[1] https://lore.kernel.org/dwarves/20250416-btf_inline-v1-0-e4bd2f8adae5@meta.com/
====================
Link: https://patch.msgid.link/20250519165935.261614-1-alan.maguire@oracle.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Extend split BTF test to cover case where we create split BTF on top of
existing split BTF and add info to it; ensure that such BTF can be
created and handled by searching within it, dumping/comparing to expected.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250519165935.261614-3-alan.maguire@oracle.com
|
|
libbpf handling of split BTF has been written largely with the
assumption that multiple splits are possible, i.e. split BTF on top of
split BTF on top of base BTF. One area where this does not quite work
is string handling in split BTF; the start string offset should be the
base BTF string section length + the base BTF string offset. This
worked in the past because for a single split BTF with base the start
string offset was always 0.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250519165935.261614-2-alan.maguire@oracle.com
|
|
As discussed in the thread containing
https://lore.kernel.org/linux-crypto/20250510053308.GB505731@sol/, the
Power10-optimized Poly1305 code is currently not safe to call in softirq
context. Disable it for now. It can be re-enabled once it is fixed.
Fixes: ba8f8624fde2 ("crypto: poly1305-p10 - Glue code for optmized Poly1305 implementation for ppc64le")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This reverts commit c66d7ebbe2fa14e41913adb421090a7426f59786.
Link: https://lore.kernel.org/all/02c22854-eebf-4ad1-b89e-8c2b65ab8236@csgroup.eu/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Error-handling paths in msm_pinctrl_probe() don't call
a function required to unroll restart handler registration,
unregister_restart_handler(). Instead of adding calls to this function,
switch the msm pinctrl code into using devm_register_sys_off_handler().
Fixes: cf1fc1876289 ("pinctrl: qcom: use restart_notifier mechanism for ps_hold")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-2-249999af0fc1@oss.qualcomm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
On Qualcomm platforms if the board uses GPIO hogs msm_pinmux_request()
calls gpiochip_line_is_valid(). After commit 8015443e24e7 ("gpio: Hide
valid_mask from direct assignments") gpiochip_line_is_valid() uses
gc->gpiodev, which is NULL when GPIO hog pins are being processed.
Thus after this commit using GPIO hogs causes the following crash. In
order to fix this, verify that gc->gpiodev is not NULL.
Note: it is not possible to reorder calls (e.g. by calling
msm_gpio_init() before pinctrl registration or by splitting
pinctrl_register() into _and_init() and pinctrl_enable() and calling the
latter function after msm_gpio_init()) because GPIO chip registration
would fail with EPROBE_DEFER if pinctrl is not enabled at the time of
registration.
pc : gpiochip_line_is_valid+0x4/0x28
lr : msm_pinmux_request+0x24/0x40
sp : ffff8000808eb870
x29: ffff8000808eb870 x28: 0000000000000000 x27: 0000000000000000
x26: 0000000000000000 x25: ffff726240f9d040 x24: 0000000000000000
x23: ffff7262438c0510 x22: 0000000000000080 x21: ffff726243ea7000
x20: ffffab13f2c4e698 x19: 0000000000000080 x18: 00000000ffffffff
x17: ffff726242ba6000 x16: 0000000000000100 x15: 0000000000000028
x14: 0000000000000000 x13: 0000000000002948 x12: 0000000000000003
x11: 0000000000000078 x10: 0000000000002948 x9 : ffffab13f50eb5e8
x8 : 0000000003ecb21b x7 : 000000000000002d x6 : 0000000000000b68
x5 : 0000007fffffffff x4 : ffffab13f52f84a8 x3 : ffff8000808eb804
x2 : ffffab13f1de8190 x1 : 0000000000000080 x0 : 0000000000000000
Call trace:
gpiochip_line_is_valid+0x4/0x28 (P)
pin_request+0x208/0x2c0
pinmux_enable_setting+0xa0/0x2e0
pinctrl_commit_state+0x150/0x26c
pinctrl_enable+0x6c/0x2a4
pinctrl_register+0x3c/0xb0
devm_pinctrl_register+0x58/0xa0
msm_pinctrl_probe+0x2a8/0x584
sdm845_pinctrl_probe+0x20/0x88
platform_probe+0x68/0xc0
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x3c/0x160
__device_attach_driver+0xb8/0x138
bus_for_each_drv+0x84/0xe0
__device_attach+0x9c/0x188
device_initial_probe+0x14/0x20
bus_probe_device+0xac/0xb0
deferred_probe_work_func+0x8c/0xc8
process_one_work+0x208/0x5e8
worker_thread+0x1b4/0x35c
kthread+0x144/0x220
ret_from_fork+0x10/0x20
Code: b5fffba0 17fffff2 9432ec27 f9400400 (f9428800)
Fixes: 8015443e24e7 ("gpio: Hide valid_mask from direct assignments")
Reported-by: Doug Anderson <dianders@chromium.org>
Closes: https://lore.kernel.org/r/CAD=FV=Vg8_ZOLgLoC4WhFPzhVsxXFC19NrF38W6cW_W_3nFjbw@mail.gmail.com
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-1-249999af0fc1@oss.qualcomm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
io_uring_cmd_import_fixed_vec() is declared in both
include/linux/io_uring/cmd.h and io_uring/uring_cmd.h. The declarations
are identical (if redundant) for CONFIG_IO_URING=y. But if
CONFIG_IO_URING=N, include/linux/io_uring/cmd.h declares the function as
static inline while io_uring/uring_cmd.h declares it as extern. This
causes linker errors if the declaration in io_uring/uring_cmd.h is used.
Remove the declaration in io_uring/uring_cmd.h to avoid linker errors
and prevent the declarations getting out of sync.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: ef4902752972 ("io_uring/cmd: introduce io_uring_cmd_import_fixed_vec")
Link: https://lore.kernel.org/r/20250520193337.1374509-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Update the allowed_cpus selftest to include a check to validate the
behavior of scx_bpf_select_cpu_and() when invoked via a BPF test_run
call.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Allow scx_bpf_select_cpu_and() to be used from an unlocked context, in
addition to ops.enqueue() or ops.select_cpu().
This enables schedulers, including user-space ones, to implement a
consistent idle CPU selection policy and helps reduce code duplication.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Validate locking correctness when accessing p->nr_cpus_allowed and
p->cpus_ptr inside scx_bpf_select_cpu_and(): if the rq lock is held,
access is safe; otherwise, require that p->pi_lock is held.
This allows to catch potential unsafe calls to scx_bpf_select_cpu_and().
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Relocate the scx_kf_allowed_if_unlocked(), so it can be used from other
source files (e.g., ext_idle.c).
No functional change.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Fix misspelling reported by codespell
Signed-off-by: Sumanth Gavini <sumanth.gavini@yahoo.com>
Link: https://lore.kernel.org/r/20250517011725.1149510-1-sumanth.gavini@yahoo.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/nova into drm-next
Nova changes for v6.16
auxiliary:
- bus abstractions
- implementation for driver registration
- add sample driver
drm:
- implement __drm_dev_alloc()
- DRM core infrastructure Rust abstractions
- device, driver and registration
- DRM IOCTL
- DRM File
- GEM object
- IntoGEMObject rework
- generically implement AlwaysRefCounted through IntoGEMObject
- refactor unsound from_gem_obj() into as_ref()
- refactor into_gem_obj() into as_raw()
driver-core:
- merge topic/device-context-2025-04-17 from driver-core tree
- implement Devres::access()
- fix: doctest build under `!CONFIG_PCI`
- accessor for Device::parent()
- fix: conditionally expect `dead_code` for `parent()`
- impl TryFrom<&Device> bus devices (PCI, platform)
nova-core:
- remove completed Vec extentions from task list
- register auxiliary device for nova-drm
- derive useful traits for Chipset
- add missing GA100 chipset
- take &Device<Bound> in Gpu::new()
- infrastructure to generate register definitions
- fix register layout of NV_PMC_BOOT_0
- move Firmware into own (Rust) module
- fix: select AUXILIARY_BUS
nova-drm:
- initial driver skeleton (depends on drm and auxiliary bus
abstractions)
- fix: select AUXILIARY_BUS
Rust (dependencies):
- implement Opaque::zeroed()
- implement Revocable::try_access_with()
- implement Revocable::access()
From: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/aCxAf3RqQAXLDhAj@cassiopeiae
|
|
Axel Forsman <axfo@kvaser.com> says:
This patch series fixes a couple of race conditions in the
kvaser_pciefd driver surfaced by enabling MSI interrupts and the new
Kvaser PCIe 8xCAN.
Changes since version 2:
* Rebase onto linux-can/main to resolve del_timer()/timer_delete()
merge conflict.
* Reword 2nd commit message slightly.
Changes since version 1:
* Change type of srb_cmd_reg from "__le32 __iomem *" to
"void __iomem *".
* Maintain TX FIFO count in driver instead of querying HW.
* Stop queue at end of .start_xmit() if full.
Link: https://patch.msgid.link/20250520114332.8961-1-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Going bus-off on a channel doing RX could result in dropped packets.
As netif_running() gets cleared before the channel abort procedure,
the handling of any last RDATA packets would see netif_rx() return
non-zero to signal a dropped packet. kvaser_pciefd_read_buffer() dealt
with this "error" by breaking out of processing the remaining DMA RX
buffer.
Only return an error from kvaser_pciefd_read_buffer() due to packet
corruption, otherwise handle it internally.
Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-4-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The functions kvaser_pciefd_start_xmit() and
kvaser_pciefd_handle_ack_packet() raced to stop/wake TX queues and
get/put echo skbs, as kvaser_pciefd_can->echo_lock was only ever taken
when transmitting and KCAN_TX_NR_PACKETS_CURRENT gets decremented
prior to handling of ACKs. E.g., this caused the following error:
can_put_echo_skb: BUG! echo_skb 5 is occupied!
Instead, use the synchronization helpers in netdev_queues.h. As those
piggyback on BQL barriers, start updating in-flight packets and bytes
counts as well.
Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-3-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Avoid the driver missing IRQs by temporarily masking IRQs in the ISR
to enforce an edge even if a different IRQ is signalled before handled
IRQs are cleared.
Fixes: 48f827d4f48f ("can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR")
Cc: stable@vger.kernel.org
Signed-off-by: Axel Forsman <axfo@kvaser.com>
Tested-by: Jimmy Assarsson <extja@kvaser.com>
Reviewed-by: Jimmy Assarsson <extja@kvaser.com>
Link: https://patch.msgid.link/20250520114332.8961-2-axfo@kvaser.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
We added the documentation in ext4_map_blocks() for usage of
EXT4_GET_BLOCKS_QUERY_LAST_IN_LEAF flag. But It's better to add
a WARN_ON_ONCE in case if anyone tries using this flag with CREATE to
avoid a random issue later. Since depth can change with CREATE and it
needs to be re-calculated before using it in there.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/ee6e82a224c50b432df9ce1ce3333c50182d8473.1747677758.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Now that we have EXT4_EX_QUERY_FILTER mask, let's use that to simplify
the filtering of flags for passing to ext4_ext_map_blocks() in
ext4_map_query_blocks() function. This allows us to kill the query_flags
local variable which is not needed anymore.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/4ae735e83e6f43341e53e2d289e59156a8360134.1747677758.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Rename EXT4_EX_FILTER to EXT4_EX_QUERY_FILTER to better describe its
purpose as a filter mask used specifically in ext4_map_query_blocks().
Add a comment explaining that this macro is used to filter flags needed
when querying the on-disk extent tree.
We will later use EXT4_EX_QUERY_FILTER mask to add another
EXT4_GET_BLOCKS_QUERY needed to lookup in on-disk extent tree.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/51f05d0ba286372eb8693af95bd4b10194b53141.1747677758.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This simplifies the check for last in leaf in ext4_map_query_blocks()
and fixes this cocci warning.
cocci warnings: (new ones prefixed by >>)
>> fs/ext4/inode.c:573:49-51: WARNING !A || A && B is equivalent to !A || B
Fixes: 5bb12b1837c0 ("ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505191524.auftmOwK-lkp@intel.com/
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/5fd5c806218c83f603c578c95997cf7f6da29d74.1747677758.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This fixes the atomic write patch series after it was rebased on top of
extent status cache cleanup series i.e.
'commit 402e38e6b71f57 ("ext4: prevent stale extent cache entries caused by
concurrent I/O writeback")'
After the above series, EXT4_GET_BLOCKS_IO_CONVERT_EXT flag which has
EXT4_GET_BLOCKS_IO_SUBMIT flag set, requires that the io submit context
of any kind should pass EXT4_EX_NOCACHE to avoid caching unncecessary
extents in the extent status cache.
This patch fixes that by adding the EXT4_EX_NOCACHE flag in
ext4_convert_unwritten_extents_atomic() for unwritten to written
conversion calls to ext4_map_blocks().
Fixes: b86629c2b299 ("ext4: Add multi-fsblock atomic write support with bigalloc")
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/ea0ad9378ff6d31d73f4e53f87548e3a20817689.1747677758.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Add test for covering UBLK_AUTO_BUF_REG_FALLBACK:
- pass '--auto_zc_fallback' to null target, which requires both F_AUTO_BUF_REG
and F_SUPPORT_ZERO_COPY for handling UBLK_AUTO_BUF_REG_FALLBACK
- add ->buf_index() method for returning invalid buffer index to trigger
UBLK_AUTO_BUF_REG_FALLBACK
- add generic_09 for running the test
- add --auto_zc_fallback test in stress_03/stress_04/stress_05
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Enable UBLK_F_AUTO_BUF_REG support for ublk utility by argument `--auto_zc`,
meantime support this feature in null, loop and stripe target code.
Add function test generic_08 for covering basic UBLK_F_AUTO_BUF_REG feature.
Also cover UBLK_F_AUTO_BUF_REG in stress_03, stress_04 and stress_05 test too.
'fio/t/io_uring -p0 /dev/ublkb0' shows that F_AUTO_BUF_REG can improve
IOPS by 50% compared with F_SUPPORT_ZERO_COPY in my test VM.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-6-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For UBLK_F_AUTO_BUF_REG, buffer is registered to uring_cmd context
automatically with the provided buffer index. User may provide one wrong
buffer index, or the specified buffer is registered by application already.
Add UBLK_AUTO_BUF_REG_FALLBACK for supporting to auto buffer registering
fallback by completing the uring_cmd and telling ublk server the
register failure via UBLK_AUTO_BUF_REG_FALLBACK, then ublk server still
can register the buffer from userspace.
So we can provide reliable way for supporting auto buffer register.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
UBLK_F_AUTO_BUF_REG
Add UBLK_F_AUTO_BUF_REG for supporting to register buffer automatically
to local io_uring context with provided buffer index.
Add UAPI structure `struct ublk_auto_buf_reg` for holding user parameter
to register request buffer automatically, one 'flags' field is defined, and
there is still 32bit available for future extension, such as, adding one
io_ring FD field for registering buffer to external io_uring.
`struct ublk_auto_buf_reg` is populated from ublk uring_cmd's sqe->addr,
and all existing ublk commands are data-less, so it is just fine to reuse
sqe->addr for this purpose.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
UBLK_F_SUPPORT_ZERO_COPY requires ublk server to issue explicit buffer
register/unregister uring_cmd for each IO, this way is not only inefficient,
but also introduce dependency between buffer consumer and buffer register/
unregister uring_cmd, please see tools/testing/selftests/ublk/stripe.c
in which backing file IO has to be issued one by one by IOSQE_IO_LINK.
Prepare for adding feature UBLK_F_AUTO_BUF_REG for addressing the existing
zero copy limitation:
- register request buffer automatically to ublk uring_cmd's io_uring
context before delivering io command to ublk server
- unregister request buffer automatically from the ublk uring_cmd's
io_uring context when completing the request
- io_uring will unregister the buffer automatically when uring is
exiting, so we needn't worry about accident exit
For using this feature, ublk server has to create one sparse buffer table
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Convert to refcount_t and prepare for supporting to register bvec buffer
automatically, which needs to initialize reference counter as 2, and
kref doesn't provide this interface, so convert to refcount_t.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250520045455.515691-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
__run_io_and_remove() is used in several stress tests for running heavy
IO vs. removing device meantime.
However, sequential `readwrite` is taken in the fio script, which isn't
correct, we should take random IO for saturating ublk device.
Also turns out '--num_jobs=4' isn't stressful enough, so change it to
'--num_jobs=$(nproc)'.
Finally we don't cover single queue test in `test_stress_02.sh`, so add
single queue test which can trigger request tag recycling easier.
With above change the issue in #1 can be reproduced reliably in stress_02.sh.
Link:https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/ #1
Cc: Jared Holzman <jholzman@nvidia.com>
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250519031620.245749-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
for-6.16/block
Pull NVMe updates from Christoph:
"nvme updates for Linux 6.16
- add per-node DMA pools and use them for PRP/SGL allocations
(Caleb Sander Mateos, Keith Busch)
- nvme-fcloop refcounting fixes (Daniel Wagner)
- support delayed removal of the multipath node and optionally support
the multipath node for private namespaces (Nilay Shroff)
- support shared CQs in the PCI endpoint target code (Wilfred Mallawa)
- support admin-queue only authentication (Hannes Reinecke)
- use the crc32c library instead of the crypto API (Eric Biggers)
- misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes Reinecke,
Leon Romanovsky, Gustavo A. R. Silva)"
* tag 'nvme-6.16-2025-05-20' of git://git.infradead.org/nvme: (42 commits)
nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk
nvme: introduce multipath_always_on module param
nvme-multipath: introduce delayed removal of the multipath head node
nvme-pci: derive and better document max segments limits
nvme-pci: use struct_size for allocation struct nvme_dev
nvme-pci: add a symolic name for the small pool size
nvme-pci: use a better encoding for small prp pool allocations
nvme-pci: rename the descriptor pools
nvme-pci: remove struct nvme_descriptor
nvme-pci: store aborted state in flags variable
nvme-pci: don't try to use SGLs for metadata on the admin queue
nvme-pci: make PRP list DMA pools per-NUMA-node
nvme-pci: factor out a nvme_init_hctx_common() helper
dmapool: add NUMA affinity support
nvme-fc: do not reference lsrsp after failure
nvmet-fcloop: don't wait for lport cleanup
nvmet-fcloop: add missing fcloop_callback_host_done
nvmet-fc: take tgtport refs for portentry
nvmet-fc: free pending reqs on tgtport unregister
nvmet-fcloop: drop response if targetport is gone
...
|
|
This adds a compatible string for the SPI controller on RK3528.
Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250520100102.1226725-2-amadeus@jmu.edu.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs fix from Mike Marshall:
"Fix for orangefs page writeout counting"
* tag 'for-linus-6.15-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: adjust counting code to recover from 665575cf
|
|
A malicious USB device can send a WMI_SWBA_EVENTID event from an
ath9k_htc-managed device before beaconing has been enabled. This causes
a device-by-zero error in the driver, leading to either a crash or an
out of bounds read.
Prevent this by aborting the handling in ath9k_htc_swba() if beacons are
not enabled.
Reported-by: Robert Morris <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/r/88967.1743099372@localhost
Fixes: 832f6a18fc2a ("ath9k_htc: Add beacon slots")
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20250402112217.58533-1-toke@toke.dk
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Block devices can be opened read-write even if they can't be written to
for historic reasons. Remove the check requiring file->f_op->write_iter
when the block devices was opened in loop_configure. The call to
loop_check_backing_file just below ensures the ->write_iter is present
for backing files opened for writing, which is the only check that is
actually needed.
Fixes: f5c84eff634b ("loop: Add sanity check for read/write_iter")
Reported-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250520135420.1177312-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A late commit to 6.14-rc7! broke orangefs. 665575cf seems like a
good change, but maybe should have been introduced during the merge
window. This patch adjusts the counting code associated with
writing out pages so that orangefs works in a 665575cf world.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
Whenever there is a change in the country code settings from the
user, driver does an intersection of the regulatory rules for this
new country with the original regulatory rules which were reported
during initialization time.
There is also similar logic running in firmware with a difference
that the intersection in firmware is only done when the country code
is configuration during boot up time (BDF/OTP). Firmware logic does
not kick in when no country code is configured during device bring
up time as the device is always expected to have the country code
configured properly in the deployment.
There is a debug/test use case that requires absolute regulatory
rules to be used for a user configured country code when the device
is not configured with a particular country code during boot up time.
To support the above test use case, remove the redundant regulatory
rules intersection logic in the host driver. Depend on the
intersection logic in firmware when the device comes up with
pre-configured country code.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00209-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Aishwarya R <quic_aisr@quicinc.com>
Signed-off-by: Rajat Soni <quic_rajson@quicinc.com>
Link: https://patch.msgid.link/20250505034351.1365914-1-quic_rajson@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
As per IEEE 802.11be-2024 - 9.4.2.321, EHT operation element
contains MCS15 Disable subfield as the sixth bit, which is set when
MCS15 support is not enabled.
During association, firmware will use this MCS15 flag to enable or
disable the reception of PPDU with EHT-MCS15 capability.
Send MCS15 support to firmware through WMI command during peer assoc.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Co-developed-by: Dhanavandhana Kannan <quic_dhanavan1@quicinc.com>
Signed-off-by: Dhanavandhana Kannan <quic_dhanavan1@quicinc.com>
Signed-off-by: Mohan Kumar G <quic_mkumarg@quicinc.com>
Link: https://patch.msgid.link/20250505153536.3275145-1-quic_mkumarg@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
fstest generic/388 occasionally reproduces a crash that looks as
follows:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
Call Trace:
<TASK>
ext4_block_zero_page_range+0x30c/0x380 [ext4]
ext4_truncate+0x436/0x440 [ext4]
ext4_process_orphan+0x5d/0x110 [ext4]
ext4_orphan_cleanup+0x124/0x4f0 [ext4]
ext4_fill_super+0x262d/0x3110 [ext4]
get_tree_bdev_flags+0x132/0x1d0
vfs_get_tree+0x26/0xd0
vfs_cmd_create+0x59/0xe0
__do_sys_fsconfig+0x4ed/0x6b0
do_syscall_64+0x82/0x170
...
This occurs when processing a symlink inode from the orphan list. The
partial block zeroing code in the truncate path calls
ext4_dirty_journalled_data() -> folio_mark_dirty(). The latter calls
mapping->a_ops->dirty_folio(), but symlink inodes are not assigned an
a_ops vector in ext4, hence the crash.
To avoid this problem, update the ext4_dirty_journalled_data() helper to
only mark the folio dirty on regular files (for which a_ops is
assigned). This also matches the journaling logic in the ext4_symlink()
creation path, where ext4_handle_dirty_metadata() is called directly.
Fixes: d84c9ebdac1e ("ext4: Mark pages with journalled data dirty")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://patch.msgid.link/20250516173800.175577-1-bfoster@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
|
|
Add an initial documentation around atomic writes support in ext4.
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/d3893b9f5ad70317abae72046e81e4c180af91bf.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Last couple of patches added the needed support for multi-fsblock atomic
writes using bigalloc. This patch ensures that filesystem advertizes the
needed atomic write unit min and max values for enabling multi-fsblock
atomic write support with bigalloc.
Acked-by: Darrick J. Wong <djwong@kernel.org>
Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/5e45d7ed24499024b9079436ba6698dae5298e29.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
EXT4 supports bigalloc feature which allows the FS to work in size of
clusters (group of blocks) rather than individual blocks. This patch
adds atomic write support for bigalloc so that systems with bs = ps can
also create FS using -
mkfs.ext4 -F -O bigalloc -b 4096 -C 16384 <dev>
With bigalloc ext4 can support multi-fsblock atomic writes. We will have to
adjust ext4's atomic write unit max value to cluster size. This can then support
atomic write of size anywhere between [blocksize, clustersize]. This
patch adds the required changes to enable multi-fsblock atomic write
support using bigalloc in the next patch.
In this patch for block allocation:
we first query the underlying region of the requested range by calling
ext4_map_blocks() call. Here are the various cases which we then handle
depending upon the underlying mapping type:
1. If the underlying region for the entire requested range is a mapped extent,
then we don't call ext4_map_blocks() to allocate anything. We don't need to
even start the jbd2 txn in this case.
2. For an append write case, we create a mapped extent.
3. If the underlying region is entirely a hole, then we create an unwritten
extent for the requested range.
4. If the underlying region is a large unwritten extent, then we split the
extent into 2 unwritten extent of required size.
5. If the underlying region has any type of mixed mapping, then we call
ext4_map_blocks() in a loop to zero out the unwritten and the hole regions
within the requested range. This then provide a single mapped extent type
mapping for the requested range.
Note: We invoke ext4_map_blocks() in a loop with the EXT4_GET_BLOCKS_ZERO
flag only when the underlying extent mapping of the requested range is
not entirely a hole, an unwritten extent, or a fully mapped extent. That
is, if the underlying region contains a mix of hole(s), unwritten
extent(s), and mapped extent(s), we use this loop to ensure that all the
short mappings are zeroed out. This guarantees that the entire requested
range becomes a single, uniformly mapped extent. It is ok to do so
because we know this is being done on a bigalloc enabled filesystem
where the block bitmap represents the entire cluster unit.
Note having a single contiguous underlying region of type mapped,
unwrittn or hole is not a problem. But the reason to avoid writing on
top of mixed mapping region is because, atomic writes requires all or
nothing should get written for the userspace pwritev2 request. So if at
any point in time during the write if a crash or a sudden poweroff
occurs, the region undergoing atomic write should read either complete
old data or complete new data. But it should never have a mix of both
old and new data.
So, we first convert any mixed mapping region to a single contiguous
mapped extent before any data gets written to it. This is because
normally FS will only convert unwritten extents to written at the end of
the write in ->end_io() call. And if we allow the writes over a mixed
mapping and if a sudden power off happens in between, we will end up
reading mix of new data (over mapped extents) and old data (over
unwritten extents), because unwritten to written conversion never went
through.
So to avoid this and to avoid writes getting torned due to mixed
mapping, we first allocate a single contiguous block mapping and then
do the write.
Acked-by: Darrick J. Wong <djwong@kernel.org>
Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/c4965ac3407cbc773f0bc954d0966d9696f5038a.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
There can be a case where there are contiguous extents on the adjacent
leaf nodes of on-disk extent trees. So when someone tries to write to
this contiguous range, ext4_map_blocks() call will split by returning
1 extent at a time if this is not already cached in extent_status tree
cache (where if these extents when cached can get merged since they are
contiguous).
This is fine for a normal write however in case of atomic writes, it
can't afford to break the write into two. Now this is also something
that will only happen in the slow write case where we call
ext4_map_blocks() for each of these extents spread across different leaf
nodes. However, there is no guarantee that these extent status cache
cannot be reclaimed before the last call to ext4_map_blocks() in
ext4_map_blocks_atomic_write_slow().
Hence this patch adds support of EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS.
This flag checks if the requested range can be fully found in extent
status cache and return. If not, it looks up in on-disk extent
tree via ext4_map_query_blocks(). If the found extent is the last entry
in the leaf node, then it goes and queries the next lblk to see if there
is an adjacent contiguous extent in the adjacent leaf node of the
on-disk extent tree.
Even though there can be a case where there are multiple adjacent extent
entries spread across multiple leaf nodes. But we only read an adjacent
leaf block i.e. in total of 2 extent entries spread across 2 leaf nodes.
The reason for this is that we are mostly only going to support atomic
writes with upto 64KB or maybe max upto 1MB of atomic write support.
Acked-by: Darrick J. Wong <djwong@kernel.org>
Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/6bb563e661f5fbd80e266a9e6ce6e29178f555f6.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Let's make ext4_meta_trans_blocks() non-static for use in later
functions during ->end_io conversion for atomic writes.
We will need this function to estimate journal credits for a special
case. Instead of adding another wrapper around it, let's make this
non-static.
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/23ce80d4286f792831ce99d13558182ee228fedb.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
EXT4 only supports doing atomic write on inodes which uses extents, so
add a check in ext4_inode_can_atomic_write() which gets called during
open.
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/86bb502c979398a736ab371d8f35f6866a477f6c.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ext4_iomap_overwrite_begin() clears the flag for IOMAP_WRITE before
calling ext4_iomap_begin(). Document this above ext4_map_blocks() call
as it is easy to miss it when focusing on write paths alone.
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://patch.msgid.link/fd50ba05440042dff77d555e463a620a79f8d0e9.1747337952.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Since jbd2_superblock_csum() no longer uses its journal_t argument,
remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://patch.msgid.link/20250513053809.699974-5-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Since jbd2_chksum() no longer uses its journal_t argument, remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://patch.msgid.link/20250513053809.699974-4-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|