summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-01selftests: ublk: kublk: use ioctl-encoded opcodesUday Shankar
There are a couple of places in the kublk selftests ublk server which use the legacy ublk opcodes. These operations fail (with -EOPNOTSUPP) on a kernel compiled without CONFIG_BLKDEV_UBLK_LEGACY_OPCODES set. We could easily require it to be set as a prerequisite for these selftests, but since new applications should not be using the legacy opcodes, use the ioctl-encoded opcodes everywhere in kublk. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250401-ublk_selftests-v1-1-98129c9bc8bb@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-01io_uring/zcrx: return early from io_zcrx_recv_skb if readlen is 0David Wei
When readlen is set for a recvzc request, tcp_read_sock() will call io_zcrx_recv_skb() one final time with len == desc->count == 0. This is caused by the !desc->count check happening too late. The offset + 1 != skb->len happens earlier and causes the while loop to continue. Fix this in io_zcrx_recv_skb() instead of tcp_read_sock(). Return early if len is 0 i.e. the read is done. Fixes: 6699ec9a23f8 ("io_uring/zcrx: add a read limit to recvzc requests") Signed-off-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20250401195355.1613813-1-dw@davidwei.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring/net: avoid import_ubuf for regvec sendPavel Begunkov
With registered buffers we set up iterators in helpers like io_import_fixed(), and there is no need for a import_ubuf() before that. It was fine as we used real pointers for offset calculation, but that's not the case anymore since introduction of ublk kernel buffers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9b2de1a50844f848f62c8de609b494971033a6b9.1743437358.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring/rsrc: check size when importing reg bufferPavel Begunkov
We're relying on callers to verify the IO size, do it inside of io_import_fixed() instead. It's safer, easier to deal with, and more consistent as now it's done close to the iter init site. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f9c2c75ec4d356a0c61289073f68d98e8a9db190.1743446271.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring: cleanup {g,s]etsockopt sqe readingPavel Begunkov
Add a local variable for the sqe pointer to avoid repetition. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8dbac0f9acda2d3842534eeb7ce10d9276b021ae.1743357108.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring: hide caches sqes from driversPavel Begunkov
There is now an io_uring private part of cmd async_data, move saved sqe into it. Drivers are accessing it via struct io_uring_cmd::cmd. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ecbe078dd57acefdbc4366d083327086c0879378.1743357121.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring: make zcrx depend on CONFIG_IO_URINGPavel Begunkov
Reflect in the kconfig that zcrx requires io_uring compiled. Fixes: 6f377873cb239 ("io_uring/zcrx: add interface queue and refill queue") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8047135a344e79dbd04ee36a7a69cc260aabc2ca.1743356260.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31io_uring: add req flag invariant build assertionPavel Begunkov
We're caching some of file related request flags in a tricky way, put a build check to make sure flags don't get reshuffled. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9877577b83c25dd78224a8274f799187e7ec7639.1743407551.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-31Documentation: ublk: remove dead footnoteJens Axboe
A previous commit removed the use of this footnote, delete it. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 3fdf2ec7da1c ("Documentation: ublk: Drop Stefan Hajnoczi's message footnote") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-29selftests: ublk: specify io_cmd_buf pointer typeCaleb Sander Mateos
Matching the ublk driver, change the type of io_cmd_buf from char * to struct ublksrv_io_desc *. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250328194230.2726862-3-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-29ublk: specify io_cmd_buf pointer typeCaleb Sander Mateos
io_cmd_buf points to an array of ublksrv_io_desc structs but its type is char *. Indexing the array requires an explicit multiplication and cast. The compiler also can't check the pointer types. Change io_cmd_buf's type to struct ublksrv_io_desc * so it can be indexed directly and the compiler can type-check the code. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250328194230.2726862-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring: don't pass ctx to tw add remote helperPavel Begunkov
Unlike earlier versions, io_msg_remote_post() creates a valid request with a proper context, so don't pass a context to io_req_task_work_add_remote() explicitly but derive it from the request. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/721f51cf34996d98b48f0bfd24ad40aa2730167e.1743190078.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/msg: initialise msg request opcodePavel Begunkov
It's risky to have msg request opcode set to garbage, so at least initialise it to nop. Later we might want to add a user inaccessible opcode for such cases. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9afe650fcb348414a4529d89f52eb8969ba06efd.1743190078.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/msg: rename io_double_lock_ctx()Pavel Begunkov
io_double_lock_ctx() doesn't lock both rings. Rename it to prevent any future confusion. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9e5defa000efd9b0f5e169cbb6bad4994d46ec5c.1743190078.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: import zc ubuf earlierPavel Begunkov
io_send_setup() already sets up the iterator for IORING_OP_SEND_ZC, we don't need repeating that at issue time. Move it all together with mem accounting at prep time, which is more consistent with how the non-zc version does that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/eb54f007c493ad9f4ca89aa8e715baf30d83fb88.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: set sg_from_iter in advancePavel Begunkov
In preparation to the next patch, set ->sg_from_iter callback at request prep time. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5fe2972701df3bacdb3d760bce195fa640bee201.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: clusterise send vs msghdr branchesPavel Begunkov
We have multiple branches at prep for send vs sendmsg handling, put them together so that the variant handling is more localised. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/33abf666d9ded74cba4da2f0d9fe58e88520dffe.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: unify sendmsg setup with zcPavel Begunkov
io_sendmsg_zc_setup() duplicates parts of io_sendmsg_setup(), and the only difference between them is that the former support vectored registered buffers with nothing zerocopy specific. Merge them together, we want regular sendmsg to eventually support fixed buffers either way. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e5ec40f9dc93355399dc6fa0cbc8b31f0b20ac5.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: combine sendzc flags writesPavel Begunkov
Save an instruction / trip to the cache and assign some of sendzc flags together. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c519d6f406776c3be3ef988a8339a88e45d1ffd9.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: open code io_net_vec_assign()Pavel Begunkov
Get rid of io_net_vec_assign() by open coding it into its only caller. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/19191c34b5cfe1161f7eeefa6e785418ea9ad56d.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: open code io_sendmsg_copy_hdr()Pavel Begunkov
io_sendmsg_setup() is trivial and io_sendmsg_copy_hdr() doesn't add any good abstraction, open code one into another. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/565318ce585665e88053663eeee5178d2c15692f.1743202294.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: store req in ublk_uring_cmd_pdu for ublk_cmd_tw_cb()Caleb Sander Mateos
Pass struct request *rq to ublk_cmd_tw_cb() through ublk_uring_cmd_pdu, mirroring how it works for ublk_cmd_list_tw_cb(). This saves some pointer dereferences, as well as the bounds check in blk_mq_tag_to_rq(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250328180411.2696494-6-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: avoid redundant io->cmd in ublk_queue_cmd_list()Caleb Sander Mateos
ublk_queue_cmd_list() loads io->cmd twice. The intervening stores prevent the compiler from combining the loads. Since struct ublk_io *io is only used to compute io->cmd, replace the variable with io->cmd. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250328180411.2696494-5-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: get ubq from pdu in ublk_cmd_list_tw_cb()Caleb Sander Mateos
Save a few pointer dereferences by obtaining struct ublk_queue *ubq from the ublk_uring_cmd_pdu instead of the request's mq_hctx. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250328180411.2696494-4-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: skip 1 NULL check in ublk_cmd_list_tw_cb() loopCaleb Sander Mateos
ublk_cmd_list_tw_cb() is always performed on a non-empty request list. So don't check whether rq is NULL on the first iteration of the loop, just on subsequent iterations. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250328180411.2696494-3-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: remove unused cmd argument to ublk_dispatch_req()Caleb Sander Mateos
ublk_dispatch_req() never uses its struct io_uring_cmd *cmd argument. Drop it so callers don't have to pass a value. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250328180411.2696494-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28selftests: ublk: add test for checking zero copy related parameterMing Lei
ublk zero copy usually requires to set dma and segment parameter correctly, so hard-code null target's dma & segment parameter in non-default value, and verify if they are setup correctly by ublk driver. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-12-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28selftests: ublk: add more tests for covering MQMing Lei
Add test test_generic_02.sh for covering IO dispatch order in case of MQ. Especially we just support ->queue_rqs() which may affect IO dispatch order. Add test_loop_05.sh and test_stripe_03.sh for covering MQ. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-11-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: rename ublk_rq_task_work_cb as ublk_cmd_tw_cbMing Lei
The new name is aligned with ublk_cmd_list_tw_cb(), and looks more readable. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-10-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: implement ->queue_rqs()Ming Lei
Implement ->queue_rqs() for improving perf in case of MQ. In this way, we just need to call io_uring_cmd_complete_in_task() once for whole IO batch, then both io_uring and ublk server can get exact batch from ublk frontend. Follows IOPS improvement: - tests tools/testing/selftests/ublk/kublk add -t null -q 2 [-z] fio/t/io_uring -p0 /dev/ublkb0 - results: more than 10% IOPS boost observed Pass all ublk selftests, especially the io dispatch order test. Cc: Uday Shankar <ushankar@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-9-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: document zero copy featureMing Lei
Add words to explain how zero copy feature works, and why it has to be trusted for handling IO read command. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-8-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: add segment parameterMing Lei
IO split is usually bad in io_uring world, since -EAGAIN is caused and IO handling may have to fallback to io-wq, this way does hurt performance. ublk starts to support zero copy recently, for avoiding unnecessary IO split, ublk driver's segment limit should be aligned with backend device's segment limit. Another reason is that io_buffer_register_bvec() needs to allocate bvecs, which number is aligned with ublk request segment number, so that big memory allocation can be avoided by setting reasonable max_segments limit. So add segment parameter for providing ublk server chance to align segment limit with backend, and keep it reasonable from implementation viewpoint. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: call io_uring_cmd_to_pdu to get uring_cmd pduMing Lei
Call io_uring_cmd_to_pdu() to get uring_cmd pdu, and one big benefit is the automatic pdu size build check. Suggested-by: Uday Shankar <ushankar@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250327095123.179113-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: add helper of ublk_need_map_io()Ming Lei
ublk_need_map_io() is more readable. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: remove two unused fields from 'struct ublk_queue'Ming Lei
Remove two unused fields(`io_addr` & `max_io_sz`) from `struct ublk_queue`. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: comment on ubq->canceling handling in ublk_queue_rq()Ming Lei
In ublk_queue_rq(), ubq->canceling has to be handled after ->fail_io and ->force_abort are dealt with, otherwise the request may not be failed when deleting disk. Add comment on this usage. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28ublk: make sure ubq->canceling is set when queue is frozenMing Lei
Now ublk driver depends on `ubq->canceling` for deciding if the request can be dispatched via uring_cmd & io_uring_cmd_complete_in_task(). Once ubq->canceling is set, the uring_cmd can be done via ublk_cancel_cmd() and io_uring_cmd_done(). So set ubq->canceling when queue is frozen, this way makes sure that the flag can be observed from ublk_queue_rq() reliably, and avoids use-after-free on uring_cmd. Fixes: 216c8f5ef0f2 ("ublk: replace monitor with cancelable uring_cmd") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250327095123.179113-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28io_uring/net: account memory for zc sendmsgPavel Begunkov
Account pinned pages for IORING_OP_SENDMSG_ZC, just as we for IORING_OP_SEND_ZC and net/ does for MSG_ZEROCOPY. Fixes: 493108d95f146 ("io_uring/net: zerocopy sendmsg") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4f00f67ca6ac8e8ed62343ae92b5816b1e0c9c4b.1743086313.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-28Merge tag 'for-6.15/io_uring-reg-vec-20250327' of git://git.kernel.dk/linuxLinus Torvalds
Pull more io_uring updates from Jens Axboe: "Final separate updates for io_uring. This started out as a series of cleanups improvements and improvements for registered buffers, but as the last series of the io_uring changes for 6.15, it also collected a few fixes for the other branches on top: - Add support for vectored fixed/registered buffers. Previously only single segments have been supported for commands, now vectored variants are supported as well. This series includes networking and file read/write support. - Small series unifying return codes across multi and single shot. - Small series cleaning up registerd buffer importing. - Adding support for vectored registered buffers for uring_cmd. - Fix for io-wq handling of command reissue. - Various little fixes and tweaks" * tag 'for-6.15/io_uring-reg-vec-20250327' of git://git.kernel.dk/linux: (25 commits) io_uring/net: fix io_req_post_cqe abuse by send bundle io_uring/net: use REQ_F_IMPORT_BUFFER for send_zc io_uring: move min_events sanitisation io_uring: rename "min" arg in io_iopoll_check() io_uring: open code __io_post_aux_cqe() io_uring: defer iowq cqe overflow via task_work io_uring: fix retry handling off iowq io_uring/net: only import send_zc buffer once io_uring/cmd: introduce io_uring_cmd_import_fixed_vec io_uring/cmd: add iovec cache for commands io_uring/cmd: don't expose entire cmd async data io_uring: rename the data cmd cache io_uring: rely on io_prep_reg_vec for iovec placement io_uring: introduce io_prep_reg_iovec() io_uring: unify STOP_MULTISHOT with IOU_OK io_uring: return -EAGAIN to continue multishot io_uring: cap cached iovec/bvec size io_uring/net: implement vectored reg bufs for zctx io_uring/net: convert to struct iou_vec io_uring/net: pull vec alloc out of msghdr import ...
2025-03-28Merge tag 'for-6.15/io_uring-epoll-wait-20250325' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring epoll support from Jens Axboe: "This adds support for reading epoll events via io_uring. While this may seem counter-intuitive (and/or productive), the reasoning here is that quite a few existing epoll event loops can easily do a partial conversion to a completion based model, but are still stuck with one (or few) event types that remain readiness based. For that case, they then need to add the io_uring fd to the epoll context, and continue to rely on epoll_wait(2) for waiting on events. This misses out on the finer grained waiting that io_uring can do, to reduce context switches and wait for multiple events in one batch reliably. With adding support for reaping epoll events via io_uring, the whole legacy readiness based event types can still be reaped via epoll, with the overall waiting in the loop be driven by io_uring" * tag 'for-6.15/io_uring-epoll-wait-20250325' of git://git.kernel.dk/linux: io_uring/epoll: add support for IORING_OP_EPOLL_WAIT io_uring/epoll: remove CONFIG_EPOLL guards
2025-03-28Merge tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring zero-copy receive support from Jens Axboe: "This adds support for zero-copy receive with io_uring, enabling fast bulk receive of data directly into application memory, rather than needing to copy the data out of kernel memory. While this version only supports host memory as that was the initial target, other memory types are planned as well, with notably GPU memory coming next. This work depends on some networking components which were queued up on the networking side, but have now landed in your tree. This is the work of Pavel Begunkov and David Wei. From the v14 posting: 'We configure a page pool that a driver uses to fill a hw rx queue to hand out user pages instead of kernel pages. Any data that ends up hitting this hw rx queue will thus be dma'd into userspace memory directly, without needing to be bounced through kernel memory. 'Reading' data out of a socket instead becomes a _notification_ mechanism, where the kernel tells userspace where the data is. The overall approach is similar to the devmem TCP proposal This relies on hw header/data split, flow steering and RSS to ensure packet headers remain in kernel memory and only desired flows hit a hw rx queue configured for zero copy. Configuring this is outside of the scope of this patchset. We share netdev core infra with devmem TCP. The main difference is that io_uring is used for the uAPI and the lifetime of all objects are bound to an io_uring instance. Data is 'read' using a new io_uring request type. When done, data is returned via a new shared refill queue. A zero copy page pool refills a hw rx queue from this refill queue directly. Of course, the lifetime of these data buffers are managed by io_uring rather than the networking stack, with different refcounting rules. This patchset is the first step adding basic zero copy support. We will extend this iteratively with new features e.g. dynamically allocated zero copy areas, THP support, dmabuf support, improved copy fallback, general optimisations and more' In a local setup, I was able to saturate a 200G link with a single CPU core, and at netdev conf 0x19 earlier this month, Jamal reported 188Gbit of bandwidth using a single core (no HT, including soft-irq). Safe to say the efficiency is there, as bigger links would be needed to find the per-core limit, and it's considerably more efficient and faster than the existing devmem solution" * tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linux: io_uring/zcrx: add selftest case for recvzc with read limit io_uring/zcrx: add a read limit to recvzc requests io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap io_uring: Rename KConfig to Kconfig io_uring/zcrx: fix leaks on failed registration io_uring/zcrx: recheck ifq on shutdown io_uring/zcrx: add selftest net: add documentation for io_uring zcrx io_uring/zcrx: add copy fallback io_uring/zcrx: throttle receive requests io_uring/zcrx: set pp memory provider for an rx queue io_uring/zcrx: add io_recvzc request io_uring/zcrx: dma-map area for the device io_uring/zcrx: implement zerocopy receive pp memory provider io_uring/zcrx: grab a net device io_uring/zcrx: add io_zcrx_area io_uring/zcrx: add interface queue and refill queue
2025-03-28Merge tag 'tpmdd-next-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "This contains a new driver: a TPM FF-A driver. FF comes from Firmware Framework, and A comes from Arm's A-profile. FF-A is essentially a standard mechanism to communicate with TrustZone apps such as TPM. Other than that, this includes a pile of fixes and small improvments" * tag 'tpmdd-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Make chip->{status,cancel,req_canceled} opt MAINTAINERS: TPM DEVICE DRIVER: add missing includes tpm: End any active auth session before shutdown Documentation: tpm: Add documentation for the CRB FF-A interface tpm_crb: Add support for the ARM FF-A start method ACPICA: Add start method for ARM FF-A tpm_crb: Clean-up and refactor check for idle support tpm_crb: ffa_tpm: Implement driver compliant to CRB over FF-A tpm/tpm_ftpm_tee: fix struct ftpm_tee_private documentation tpm, tpm_tis: Workaround failed command reception on Infineon devices tpm, tpm_tis: Fix timeout handling when waiting for TPM status tpm: Convert warn to dbg in tpm2_start_auth_session() tpm: Lazily flush auth session when getting random data tpm: ftpm_tee: remove incorrect of_match_ptr annotation tpm: do not start chip while suspended
2025-03-28Merge tag 'crc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC fixes from Eric Biggers: "Fix out-of-scope array bugs in arm and arm64's crc_t10dif_arch()" * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: arm64/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch() arm/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch()
2025-03-28Merge tag 'landlock-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This brings two main changes to Landlock: - A signal scoping fix with a new interface for user space to know if it is compatible with the running kernel. - Audit support to give visibility on why access requests are denied, including the origin of the security policy, missing access rights, and description of object(s). This was designed to limit log spam as much as possible while still alerting about unexpected blocked access. With these changes come new and improved documentation, and a lot of new tests" * tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits) landlock: Add audit documentation selftests/landlock: Add audit tests for network selftests/landlock: Add audit tests for filesystem selftests/landlock: Add audit tests for abstract UNIX socket scoping selftests/landlock: Add audit tests for ptrace selftests/landlock: Test audit with restrict flags selftests/landlock: Add tests for audit flags and domain IDs selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags selftests/landlock: Add test for invalid ruleset file descriptor samples/landlock: Enable users to log sandbox denials landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags landlock: Log scoped denials landlock: Log TCP bind and connect denials landlock: Log truncate and IOCTL denials landlock: Factor out IOCTL hooks landlock: Log file-related denials landlock: Log mount-related denials landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials ...
2025-03-28Merge tag 'caps-pr-20250327' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux Pull capabilities update from Serge Hallyn: "This contains just one patch that removes a helper function whose last user (smack) stopped using it in 2018" * tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux: capability: Remove unused has_capability
2025-03-28Merge tag 'integrity-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull ima updates from Mimi Zohar: "Two performance improvements, which minimize the number of integrity violations" * tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: limit the number of ToMToU integrity violations ima: limit the number of open-writers integrity violations
2025-03-28Merge tag 'ipe-pr-20250324' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull ipe update from Fan Wu: "This contains just one commit from Randy Dunlap, which fixes kernel-doc warnings in the IPE subsystem" * tag 'ipe-pr-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: ipe: policy_fs: fix kernel-doc warnings
2025-03-28Revert "Merge tag 'irq-msi-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip" This reverts commit 36f5f026df6c1cd8a20373adc4388d2b3401ce91, reversing changes made to 43a7eec035a5b64546c8adefdc9cf96a116da14b. Thomas says: "I just noticed that for some incomprehensible reason, probably sheer incompetemce when trying to utilize b4, I managed to merge an outdated _and_ buggy version of that series. Can you please revert that merge completely?" Done. Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-27Merge tag 'm68knommu-for-v6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu updates from Greg Ungerer: - remove unused include of linux/fb.h - use strscpy() instead of strncpy() * tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: mm: Replace deprecated strncpy() with strscpy() m68k: Do not include <linux/fb.h>
2025-03-27Merge tag 'powerpc-6.15-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - Remove support for IBM Cell Blades - SMP support for microwatt platform - Support for inline static calls on PPC32 - Enable pmu selftests for power11 platform - Enable hardware trace macro (HTM) hcall support - Support for limited address mode capability - Changes to RMA size from 512 MB to 768 MB to handle fadump - Misc fixes and cleanups Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann, Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav Jain, and Venkat Rao Bagalkote. * tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits) powerpc/kexec: fix physical address calculation in clear_utlb_entry() crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD powerpc: Fix 'intra_function_call not a direct call' warning powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu' KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 powerpc/microwatt: Add SMP support powerpc: Define config option for processors with broadcast TLBIE powerpc/microwatt: Define an idle power-save function powerpc/microwatt: Device-tree updates powerpc/microwatt: Select COMMON_CLK in order to get the clock framework net: toshiba: Remove reference to PPC_IBM_CELL_BLADE net: spider_net: Remove powerpc Cell driver cpufreq: ppc_cbe: Remove powerpc Cell driver genirq: Remove IRQ_EDGE_EOI_HANDLER docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR powerpc: Remove UDBG_RTAS_CONSOLE powerpc/io: Use standard barrier macros in io.c powerpc/io: Rename _insw_ns() etc. powerpc/io: Use generic raw accessors ...