summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-09io_uring: add lockdep asserts to io_add_aux_cqePavel Begunkov
io_add_aux_cqe() can only be called for rings with uring_lock protected completion queues, add a couple of assertions in regards to that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c010eab7b94a187c00a9d46d8b67bf7fcad18af4.1746788592.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-09io_uring/net: move CONFIG_NET guards to MakefilePavel Begunkov
Instruct Makefile to never try to compile net.c without CONFIG_NET and kill ifdefs in the file. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f466400e20c3f536191bfd559b1f3cd2a2ab5a1e.1746788579.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-09io_uring: update parameter name in io_pin_pages function declarationLong Li
Rename first parameter in io_pin_pages from ubuf to uaddr for consistency between declaration and implementation. Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20250509063015.3799255-1-leo.lilong@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06io_uring: move io_req_put_rsrc_nodes()Pavel Begunkov
It'd be nice to hide details of how rsrc nodes are used by a request from rsrc.c, specifically which request fields store them, and what bits are signifying if there is a node in a request. It rather belong to generic request handling, so move the helper to io_uring.c. While doing so remove clearing of ->buf_node as it's controlled by REQ_F_BUF_NODE and doesn't require zeroing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bb73fb42baf825edb39344365aff48cdfdd4c692.1746533789.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06io_uring: remove io_preinit_req()Pavel Begunkov
Apart from setting ->ctx, io_preinit_req() zeroes a bunch of fields of a request, from which only ->file_node is mandatory. Remove the function and zero the entire request on first allocation. With that, we also need to initialise ->ctx every time, which might be a good thing for performance as now we're likely overwriting the entire cache line, and so it can write combined and avoid RMW. Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ba5485dc913f1e275862ce88f5169d4ac4a33836.1746533807.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06io_uring/timeout: don't export link t-out disarm helperPavel Begunkov
[__]io_disarm_linked_timeout() are only used inside timeout.c. so confine them inside the file. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1eb200911255e643bf252a8e65fb2c787340cf18.1746533800.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-06io_uring/zcrx: dmabuf backed zerocopy receivePavel Begunkov
Add support for dmabuf backed zcrx areas. To use it, the user should pass IORING_ZCRX_AREA_DMABUF in the struct io_uring_zcrx_area_reg flags field and pass a dmabuf fd in the dmabuf_fd field. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20bb1890e60a82ec945ab36370d1fd54be414ab6.1746097431.git.asml.silence@gmail.com Link: https://lore.kernel.org/io-uring/6e37db97303212bbd8955f9501cf99b579f8aece.1746547722.git.asml.silence@gmail.com [axboe: fold in fixup] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-02io_uring/zcrx: split common area map/unmap partsPavel Begunkov
Extract area type depedent parts of io_zcrx_[un]map_area from the generic path. It'll be helpful once there are more area memory types and not only user pages. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/50f6e893e2d20f937e628196cbf528d15f81c289.1746097431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-02io_uring/zcrx: split out memory holders from areaPavel Begunkov
In the data path users of struct io_zcrx_area don't need to know what kind of memory it's backed by. Only keep there generic bits in there and and split out memory type dependent fields into a new structure. It also logically separates the step that actually imports the memory, e.g. pinning user pages, from the generic area initialisation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b60fc09c76921bf69e77eb17e07eb4decedb3bf4.1746097431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-02io_uring/zcrx: resolve netdev before area creationPavel Begunkov
Some area types will require a valid struct device to be created, so resolve netdev and struct device before creating an area. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ac8c1482be22acfe9ca788d2c3ce31b7451ce488.1746097431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-02io_uring/zcrx: improve area validationPavel Begunkov
dmabuf backed area will be taking an offset instead of addresses, and io_buffer_validate() is not flexible enough to facilitate it. It also takes an iovec, which may truncate the u64 length zcrx takes. Add a new helper function for validation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0b3b735391a0a8f8971bf0121c19765131fddd3b.1746097431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-28io_uring/cmd: move net cmd into a separate filePavel Begunkov
We keep socket io_uring command implementation in io_uring/uring_cmd.c. Separate it from generic command code into a separate file. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/747d0519a2255bd055ae76b691d38d2b4c311001.1745843119.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-28io_uring: delete misleading comment in io_fill_cqe_aux()Pavel Begunkov
io_fill_cqe_aux() doesn't overflow completions, however it might fail them and lets the caller handle it. Remove the comment, which doesn't make any sense. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/021aa8c1d8f20ef2b66da6aeabb6b511938fd2c5.1745843119.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24io_uring/eventfd: open code io_eventfd_grab()Pavel Begunkov
io_eventfd_grab() doesn't help wit understanding the path, it'll be simpler to keep the helper open coded. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5cb53ce3876c2819db9e8055cf41dca4398521db.1745493845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24io_uring/eventfd: clean up rcu lockingPavel Begunkov
Conditional locking is never welcome if there are better options. Move rcu locking into io_eventfd_signal(), make it unconditional and use guards. It also helps with sparse warnings. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/91a925e708ca8a5aa7fee61f96d29b24ea9adeaf.1745493845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24io_uring/eventfd: dedup signalling helpersPavel Begunkov
Consolidate io_eventfd_flush_signal() and io_eventfd_signal(). Not much of a difference for now, but it prepares it for following changes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5beecd4da65d8d2d83df499196f84b329387f6a2.1745493845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-23io_uring/zcrx: add support for multiple ifqsPavel Begunkov
Allow the user to register multiple ifqs / zcrx contexts. With that we can use multiple interfaces / interface queues in a single io_uring instance. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/668b03bee03b5216564482edcfefbc2ee337dd30.1745141261.git.asml.silence@gmail.com [axboe: fold in fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/zcrx: move zcrx region to struct io_zcrx_ifqPavel Begunkov
Refill queue region is a part of zcrx and should stay in struct io_zcrx_ifq. We can't have multiple queues without it, so move it there. As a result there is no context global zcrx region anymore, and the region is looked up together with its ifq. To protect a concurrent mmap from seeing an inconsistent region we were protecting changes to ->zcrx_region with mmap_lock, but now it protect the publishing of the ifq. Reviewed-by: David Wei <dw@davidwei.uk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/24f1a728fc03d0166f16d099575457e10d9d90f2.1745141261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/zcrx: let zcrx choose region for mmapingPavel Begunkov
In preparation for adding multiple ifqs, add a helper returning a region for mmaping zcrx refill queue. For now it's trivial and returns the same ctx global ->zcrx_region. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d9006e2ef8cd5e5b337c2ba2228ede8409eb60f2.1745141261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/zcrx: remove sqe->file_index checkPavel Begunkov
sqe->file_index and sqe->zcrx_ifq_idx are aliased, so even though io_recvzc_prep() has all necessary handling and checking for ->zcrx_ifq_idx, it's already rejected a couple of lines above. It's not a real problem, however, as we can only have one ifq at the moment, which index is always zero, and it returns correct error codes to the user. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b51acedd326d1a87bf53e33303e6fc87661a3e3f.1745141261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/zcrx: move io_zcrx_iov_pagePavel Begunkov
We'll need io_zcrx_iov_page at the top to keep offset calculations closer together, move it there. Reviewed-by: David Wei <dw@davidwei.uk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/575617033a8b84a5985c7eb760b7121efdbe7e56.1745141261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/zcrx: remove duplicated freelist initPavel Begunkov
Several lines below we already initialise the freelist, don't do it twice. Reviewed-by: David Wei <dw@davidwei.uk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/71b07c4e00d1db65899d1ed8d603d961fe7d1c28.1745141261.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/rsrc: remove null check on importPavel Begunkov
WARN_ON_ONCE() checking imu for NULL in io_import_fixed() is in the hot path and too protective, it's time to get rid of it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3782c01e311dbd474bca45aefaf79a2f2822fafb.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/rsrc: clean up io_coalesce_buffer()Pavel Begunkov
We don't need special handling for the first page in io_coalesce_buffer(), move it inside the loop. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/ad698cddc1eadb3d92a7515e95bb13f79420323d.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/rsrc: use unpin_user_folioPavel Begunkov
We want to have a full folio to be left pinned but with only one reference, for that we "unpin" all but the first page with unpin_user_pages(), which can be confusing. There is a new helper to achieve that called unpin_user_folio(), so use that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/r/e0b2be8f9ea68f6b351ec3bb046f04f437f68491.1745083025.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/rsrc: remove node assignment helpersJens Axboe
There are two helpers here, one assigns and increments the node ref count, and the other is simply a wrapper around that for the buffer node handling. The buffer node assignment benefits from checking and setting REQ_F_BUF_NODE together, otherwise stalls have been observed on setting that flag later in the process. Hence re-do it so that it's set when checked, and cleared in case of (unlikely) failure. With that, the buffer node helper can go, and then drop the generic io_req_assign_rsrc_node() helper as well as there's only a single user of it left at that point. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring: add support for IORING_OP_PIPEJens Axboe
This works just like pipe2(2), except it also supports fixed file descriptors. Used in a similar fashion as for other fd instantiating opcodes (like accept, socket, open, etc), where sqe->file_slot is set appropriately if two direct descriptors are desired rather than a set of normal file descriptors. sqe->addr must be set to a pointer to an array of 2 integers, which is where the fixed/normal file descriptors are copied to. sqe->pipe_flags contains flags, same as what is allowed for pipe2(2). Future expansion of per-op private flags can go in sqe->ioprio, like we do for other opcodes that take both a "syscall" flag set and an io_uring opcode specific flag set. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring: don't store bgid in req->buf_indexPavel Begunkov
Pass buffer group id into the rest of helpers via struct buf_sel_arg and remove all reassignments of req->buf_index back to bgid. Now, it only stores buffer indexes, and the group is provided by callers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3ea9fa08113ecb4d9224b943e7806e80a324bdf9.1743437358.git.asml.silence@gmail.com Link: https://lore.kernel.org/io-uring/0c01d76ff12986c2f48614db8610caff8f78c869.1743500909.git.asml.silence@gmail.com/ [axboe: fold in patch from second link] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/kbuf: pass bgid to io_buffer_select()Pavel Begunkov
The current situation with buffer group id juggling is not ideal. req->buf_index first stores the bgid, then it's overwritten by a buffer id, and then it can get restored back no recycling / etc. It's not so easy to control, and it's not handled consistently across request types with receive requests saving and restoring the bgid it by hand. It's a prep patch that adds a buffer group id argument to io_buffer_select(). The caller will be responsible for stashing a copy somewhere and passing it into the function. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a210d6427cc3f4f42271a6853274cd5a50e56820.1743437358.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring: set IMPORT_BUFFER in generic send setupPavel Begunkov
Move REQ_F_IMPORT_BUFFER to the common send setup. Currently, the only user is send zc, but we'll want for normal sends to support that in the future. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/18b74edfc61d8a1b6c9fed3b78a9276fe80f8ced.1743437358.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/net: don't use io_do_buffer_select at prepPavel Begunkov
Prep code is interested whether it's a selected buffer request, not whether a buffer has already been selected like what io_do_buffer_select() returns. Check for REQ_F_BUFFER_SELECT directly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4488a029ac698554bebf732263fe19d7734affa6.1743437358.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-21io_uring/wq: avoid indirect do_work/free_work callsCaleb Sander Mateos
struct io_wq stores do_work and free_work function pointers which are called on each work item. But these function pointers are always set to io_wq_submit_work and io_wq_free_work, respectively. So remove these function pointers and just call the functions directly. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://lore.kernel.org/r/20250329161527.3281314-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-20gcc-15: disable '-Wunterminated-string-initialization' entirely for nowLinus Torvalds
I had left the warning around but as a non-fatal error to get my gcc-15 builds going, but fixed up some of the most annoying warning cases so that it wouldn't be *too* verbose. Because I like the _concept_ of the warning, even if I detested the implementation to shut it up. It turns out the implementation to shut it up is even more broken than I thought, and my "shut up most of the warnings" patch just caused fatal errors on gcc-14 instead. I had tested with clang, but when I upgrade my development environment, I try to do it on all machines because I hate having different systems to maintain, and hadn't realized that gcc-14 now had issues. The ACPI case is literally why I wanted to have a *type* that doesn't trigger the warning (see commit d5d45a7f2619: "gcc-15: make 'unterminated string initialization' just a warning"), instead of marking individual places as "__nonstring". But gcc-14 doesn't like that __nonstring location that shut gcc-15 up, because it's on an array of char arrays, not on one single array: drivers/acpi/tables.c:399:1: error: 'nonstring' attribute ignored on objects of type 'const char[][4]' [-Werror=attributes] 399 | static const char table_sigs[][ACPI_NAMESEG_SIZE] __initconst __nonstring = { | ^~~~~~ and my attempts to nest it properly with a type had failed, because of how gcc doesn't like marking the types as having attributes, only symbols. There may be some trick to it, but I was already annoyed by the bad attribute design, now I'm just entirely fed up with it. I wish gcc had a proper way to say "this type is a *byte* array, not a string". The obvious thing would be to distinguish between "char []" and an explicitly signed "unsigned char []" (as opposed to an implicitly unsigned char, which is typically an architecture-specific default, but for the kernel is universal thanks to '-funsigned-char'). But any "we can typedef a 8-bit type to not become a string just because it's an array" model would be fine. But "__attribute__((nonstring))" is sadly not that sane model. Reported-by: Chris Clayton <chris2553@googlemail.com> Fixes: 4b4bd8c50f48 ("gcc-15: acpi: sprinkle random '__nonstring' crumbles around") Fixes: d5d45a7f2619 ("gcc-15: make 'unterminated string initialization' just a warning") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-20Linux 6.15-rc3v6.15-rc3Linus Torvalds
2025-04-20gcc-15: work around sequence-point warningLinus Torvalds
The C sequence points are complicated things, and gcc-15 has apparently added a warning for the case where an object is both used and modified multiple times within the same sequence point. That's a great warning. Or rather, it would be a great warning, except gcc-15 seems to not really be very exact about it, and doesn't notice that the modification are to two entirely different members of the same object: the array counter and the array entries. So that seems kind of silly. That said, the code that gcc complains about is unnecessarily complicated, so moving the array counter update into a separate statement seems like the most straightforward fix for these warnings: drivers/net/wireless/intel/iwlwifi/mld/d3.c: In function ‘iwl_mld_set_netdetect_info’: drivers/net/wireless/intel/iwlwifi/mld/d3.c:1102:66: error: operation on ‘netdetect_info->n_matches’ may be undefined [-Werror=sequence-point] 1102 | netdetect_info->matches[netdetect_info->n_matches++] = match; | ~~~~~~~~~~~~~~~~~~~~~~~~~^~ drivers/net/wireless/intel/iwlwifi/mld/d3.c:1120:58: error: operation on ‘match->n_channels’ may be undefined [-Werror=sequence-point] 1120 | match->channels[match->n_channels++] = | ~~~~~~~~~~~~~~~~~^~ side note: the code at that second warning is actively buggy, and only works on little-endian machines that don't do strict alignment checks. The code casts an array of integers into an array of unsigned long in order to use our bitmap iterators. That happens to work fine on any sane architecture, but it's still wrong. This does *not* fix that more serious problem. This only splits the two assignments into two statements and fixes the compiler warning. I need to get rid of the new warnings in order to be able to actually do any build testing. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-20gcc-15: add '__nonstring' markers to byte arraysLinus Torvalds
All of these cases are perfectly valid and good traditional C, but hit by the "you're not NUL-terminating your byte array" warning. And none of the cases want any terminating NUL character. Mark them __nonstring to shut up gcc-15 (and in the case of the ak8974 magnetometer driver, I just removed the explicit array size and let gcc expand the 3-byte and 6-byte arrays by one extra byte, because it was the simpler change). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-20gcc-15: get rid of misc extra NUL character paddingLinus Torvalds
This removes two cases of explicit NUL padding that now causes warnings because of '-Wunterminated-string-initialization' being part of -Wextra in gcc-15. Gcc is being silly in this case when it says that it truncates a NUL terminator, because in these cases there were _multiple_ NUL characters. But we can get rid of the warning by just simplifying the two initializers that trigger the warning for me, so this does exactly that. I'm not sure why the power supply code did that odd .attr_name = #_name "\0", pattern: it was introduced in commit 2cabeaf15129 ("power: supply: core: Cleanup power supply sysfs attribute list"), but that 'attr_name[]' field is an explicitly sized character array in a statically initialized variable, and a string initializer always has a terminating NUL _and_ statically initialized character arrays are zero-padded anyway, so it really seems to be rather extraneous belt-and-suspenders. The zero_uuid[16] initialization in drivers/md/bcache/super.c makes perfect sense, but it isn't necessary for the same reasons, and not worth the new gcc warning noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-20gcc-15: acpi: sprinkle random '__nonstring' crumbles aroundLinus Torvalds
This is not great: I'd much rather introduce a typedef that is a "ACPI name byte buffer", and use that to mark these special 4-byte ACPI names that do not use NUL termination. But as noted in the previous commit ("gcc-15: make 'unterminated string initialization' just a warning") gcc doesn't actually seem to support that notion, so instead you have to just mark every single array declaration individually. So this is not pretty, but this gets rid of the bulk of the annoying warnings during an allmodconfig build for me. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-20gcc-15: make 'unterminated string initialization' just a warningLinus Torvalds
gcc-15 enabling -Wunterminated-string-initialization in -Wextra by default was done with the best intentions, but the warning is still quite broken. What annoys me about the warning is that this is a very traditional AND CORRECT way to initialize fixed byte arrays in C: unsigned char hex[16] = "0123456789abcdef"; and we use this all over the kernel. And the warning is fine, but gcc developers apparently never made a reasonable way to disable it. As is (sadly) tradition with these things. Yes, there's "__attribute__((nonstring))", and we have a macro to make that absolutely disgusting syntax more palatable (ie the kernel syntax for that monstrosity is just "__nonstring"). But that attribute is misdesigned. What you'd typically want to do is tell the compiler that you are using a type that isn't a string but a byte array, but that doesn't work at all: warning: ‘nonstring’ attribute does not apply to types [-Wattributes] and because of this fundamental mis-design, you then have to mark each instance of that pattern. This is particularly noticeable in our ACPI code, because ACPI has this notion of a 4-byte "type name" that gets used all over, and is exactly this kind of byte array. This is a sad oversight, because the warning is useful, but really would be so much better if gcc had also given a sane way to indicate that we really just want a byte array type at a type level, not the broken "each and every array definition" level. So now instead of creating a nice "ACPI name" type using something like typedef char acpi_name_t[4] __nonstring; we have to do things like char name[ACPI_NAMESEG_SIZE] __nonstring; in every place that uses this concept and then happens to have the typical initializers. This is annoying me mainly because I think the warning _is_ a good warning, which is why I'm not just turning it off in disgust. But it is hampered by this bad implementation detail. [ And obviously I'm doing this now because system upgrades for me are something that happen in the middle of the release cycle: don't do it before or during travel, or just before or during the busy merge window period. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-19Merge tag 'mm-hotfixes-stable-2025-04-19-21-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "16 hotfixes. 2 are cc:stable and the remainder address post-6.14 issues or aren't considered necessary for -stable kernels. All patches are basically for MM although five are alterations to MAINTAINERS" [ Basic counting skills are clearly not a strictly necessary requirement for kernel maintainers. - Linus ] * tag 'mm-hotfixes-stable-2025-04-19-21-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add section for locking of mm's and VMAs mm: vmscan: fix kswapd exit condition in defrag_mode mm: vmscan: restore high-cpu watermark safety in kswapd MAINTAINERS: add Pedro as reviewer to the MEMORY MAPPING section mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization mm, hugetlb: increment the number of pages to be reset on HVO writeback: fix false warning in inode_to_wb() docs: ABI: replace mcroce@microsoft.com with new Meta address mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() MAINTAINERS: add memory advice section MAINTAINERS: add mmap trace events to MEMORY MAPPING mm: memcontrol: fix swap counter leak from offline cgroup MAINTAINERS: add MM subsection for the page allocator MAINTAINERS: update SLAB ALLOCATOR maintainers fs/dax: fix folio splitting issue by resetting old folio order + _nr_pages mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page()
2025-04-19Merge tag 'vfs-6.15-rc3.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Revert the hfs{plus} deprecation warning that's also included in this pull request. The commit introducing the deprecation warning resides rather early in this branch. So simply dropping it would've rebased all other commits which I decided to avoid. Hence the revert in the same branch [ Background - the deprecation warning discussion resulted in people stepping up, and so hfs{plus} will have a maintainer taking care of it after all.. - Linus ] - Switch CONFIG_SYSFS_SYCALL default to n and decouple from CONFIG_EXPERT - Fix an audit bug caused by changes to our kernel path lookup helpers this cycle. Audit needs the parent path even if the dentry it tried to look up is negative - Ensure that the kernel path lookup helpers leave the passed in path argument clean when they return an error. This is consistent with all our other helpers - Ensure that vfs_getattr_nosec() calls bdev_statx() so the relevant information is available to kernel consumers as well - Don't set a timer and call schedule() if the timer will expire immediately in epoll - Make netfs lookup tables with __nonstring * tag 'vfs-6.15-rc3.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: Revert "hfs{plus}: add deprecation warning" fs: move the bdex_statx call to vfs_getattr_nosec netfs: Mark __nonstring lookup tables eventpoll: Set epoll timeout if it's in the future fs: ensure that *path_locked*() helpers leave passed path pristine fs: add kern_path_locked_negative() hfs{plus}: add deprecation warning Kconfig: switch CONFIG_SYSFS_SYCALL default to n
2025-04-19Merge tag 'i2c-for-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - Address translator: fix wrong include - ChromeOS EC tunnel: fix potential NULL pointer dereference * tag 'i2c-for-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: atr: Fix wrong include i2c: cros-ec-tunnel: defer probe if parent EC is not present
2025-04-19Revert "hfs{plus}: add deprecation warning"Christian Brauner
This reverts commit ddee68c499f76ae47c011549df5be53db0057402. There's ongoing discussion about better maintenance of at least hfsplus. Rever the deprecation warning for now. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-19Merge tag 'trace-v6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Initialize hash variables in ftrace subops logic The fix that simplified the ftrace subops logic opened a path where some variables could be used without being initialized, and done subtly where the compiler did not catch it. Initialize those variables to the EMPTY_HASH, which is the default hash. - Reinitialize the hash pointers after they are freed Some of the hash pointers in the subop logic were freed but may still be referenced later. To prevent use-after-free bugs, initialize them back to the EMPTY_HASH. - Free the ftrace hashes when they are replaced The fix that simplified the subops logic updated some hash pointers, but left the original hash that they were pointing to where they are no longer used. This caused a memory leak. Free the hashes that are pointed to by the pointers when they are replaced. - Fix size initialization of ftrace direct function hash The ftrace direct function hash used by BPF initialized the hash size incorrectly. It checked the size of items to a hard coded 32, which made the hash bit size of 5. The hash size is supposed to be limited by the bit size of the hash, as the bitmask is allowed to be greater than 5. Rework the size check to first pass the number of elements to fls() and then compare that to FTRACE_HASH_MAX_BITS before allocating the hash. - Fix format output of ftrace_graph_ent_entry event The field depth of the ftrace_graph_ent_entry event is of size 4 but the output showed it as unsigned long and use "%lu". Change it to unsigned int and use "%u" in the print format that is displayed to user space. - Fix the trace event filter on strings Events can be filtered on numbers or string values. The return value checked from strncpy_from_kernel_nofault() and strncpy_from_user_nofault() was used to determine if reading the strings would fault or not. It would return fault if the value was non zero, which is basically meant that it was always considering the read as a fault. - Add selftest to test trace event string filtering In order to catch the breakage of the string filtering, add a self test to make sure that it continues to work. * tag 'trace-v6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: selftests: Add testing a user string to filters tracing: Fix filter string testing ftrace: Fix type of ftrace_graph_ent_entry.depth ftrace: fix incorrect hash size in register_ftrace_direct() ftrace: Free ftrace hashes after they are replaced in the subops code ftrace: Reinitialize hash to EMPTY_HASH after freeing ftrace: Initialize variables for ftrace_startup/shutdown_subops()
2025-04-19Merge tag 'nfsd-6.15-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - v6.15 libcrc clean-up makes invalid configurations possible - Fix a potential deadlock introduced during the v6.15 merge window * tag 'nfsd-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: decrease sc_count directly if fail to queue dl_recall nfs: add missing selections of CONFIG_CRC32
2025-04-19Merge tag 'rust-fixes-6.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull rust fixes from Miguel Ojeda: "Toolchain and infrastructure: - Fix missing KASAN LLVM flags on first build (and fix spurious rebuilds) by skipping '--target' - Fix Make < 4.3 build error by using '$(pound)' - Fix UML build error by removing 'volatile' qualifier from io helpers - Fix UML build error by adding 'dma_{alloc,free}_attrs()' helpers - Clean gendwarfksyms warnings by avoiding to export '__pfx' symbols - Clean objtool warning by adding a new 'noreturn' function for 1.86.0 - Disable 'needless_continue' Clippy lint due to new 1.86.0 warnings - Add missing 'ffi' crate to 'generate_rust_analyzer.py' 'pin-init' crate: - Import a couple fixes from upstream" * tag 'rust-fixes-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: helpers: Add dma_alloc_attrs() and dma_free_attrs() rust: helpers: Remove volatile qualifier from io helpers rust: kbuild: use `pound` to support GNU Make < 4.3 objtool/rust: add one more `noreturn` Rust function for Rust 1.86.0 rust: kasan/kbuild: fix missing flags on first build rust: disable `clippy::needless_continue` rust: kbuild: Don't export __pfx symbols rust: pin-init: use Markdown autolinks in Rust comments rust: pin-init: alloc: restrict `impl ZeroableOption` for `Box` to `T: Sized` scripts: generate_rust_analyzer: Add ffi crate
2025-04-19Merge tag 'drm-fixes-2025-04-19' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Easter rc3 pull request, fixes in all the usuals, amdgpu, xe, msm, with some i915/ivpu/mgag200/v3d fixes, then a couple of bits in dma-buf/gem. Hopefully has no easter eggs in it. dma-buf: - Correctly decrement refcounter on errors gem: - Fix test for imported buffers amdgpu: - Cleaner shader sysfs fix - Suspend fix - Fix doorbell free ordering - Video caps fix - DML2 memory allocation optimization - HDP fix i915: - Fix DP DSC configurations that require 3 DSC engines per pipe xe: - Fix LRC address being written too late for GuC - Fix notifier vs folio deadlock - Fix race betwen dma_buf unmap and vram eviction - Fix debugfs handling PXP terminations unconditionally msm: - Display: - Fix to call dpu_plane_atomic_check_pipe() for both SSPPs in case of multi-rect - Fix to validate plane_state pointer before using it in dpu_plane_virtual_atomic_check() - Fix to make sure dereferencing dpu_encoder_phys happens after making sure it is valid in _dpu_encoder_trigger_start() - Remove the remaining intr_tear_rd_ptr which we initialized to -1 because NO_IRQ indices start from 0 now - GPU: - Fix IB_SIZE overflow ivpu: - Fix debugging - Fixes to frequency - Support firmware API 3.28.3 - Flush jobs upon reset mgag200: - Set vblank start to correct values v3d: - Fix Indirect Dispatch" * tag 'drm-fixes-2025-04-19' of https://gitlab.freedesktop.org/drm/kernel: (26 commits) drm/msm/a6xx+: Don't let IB_SIZE overflow drm/xe/pxp: do not queue unneeded terminations from debugfs drm/xe/dma_buf: stop relying on placement in unmap drm/xe/userptr: fix notifier vs folio deadlock drm/xe: Set LRC addresses before guc load drm/mgag200: Fix value in <VBLKSTR> register drm/gem: Internally test import_attach for imported objects drm/amdgpu: Use the right function for hdp flush drm/amd/display/dml2: use vzalloc rather than kzalloc drm/amdgpu: Add back JPEG to video caps for carrizo and newer drm/amdgpu: fix warning of drm_mm_clean drm/amd: Forbid suspending into non-default suspend states drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v4 drm/i915/dp: Check for HAS_DSC_3ENGINES while configuring DSC slices drm/i915/display: Add macro for checking 3 DSC engines dma-buf/sw_sync: Decrement refcount on error in sw_sync_ioctl_get_deadline() accel/ivpu: Add cmdq_id to job related logs accel/ivpu: Show NPU frequency in sysfs accel/ivpu: Fix the NPU's DPU frequency calculation accel/ivpu: Update FW Boot API to version 3.28.3 ...
2025-04-19Merge tag 'drm-msm-fixes-2025-04-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.15-rc3 Display: - Fix to call dpu_plane_atomic_check_pipe() for both SSPPs in case of multi-rect - Fix to validate plane_state pointer before using it in dpu_plane_virtual_atomic_check() - Fix to make sure dereferencing dpu_encoder_phys happens after making sure it is valid in _dpu_encoder_trigger_start() - Remove the remaining intr_tear_rd_ptr which we initialized to -1 because NO_IRQ indices start from 0 now GPU: - Fix IB_SIZE overflow Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/CAF6AEGtVKXEVdzUzFWmQE8JmK3nx_hp+ynOd-5j3vnfcU-sgOA@mail.gmail.com
2025-04-19Merge tag 'drm-xe-fixes-2025-04-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix LRC address being written too late for GuC - Fix notifier vs folio deadlock - Fix race betwen dma_buf unmap and vram eviction - Fix debugfs handling PXP terminations unconditionally Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/ndinq644zenywaaycxyfqqivsb2xer4z7err3dlpalbz33jfkm@ttabzsg6wnet
2025-04-18Merge tag '6.15-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Fix hard link lease key problem when close is deferred - Revert the socket lockdep/refcount workarounds done in cifs.ko now that it is fixed at the socket layer * tag '6.15-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: Revert "smb: client: fix TCP timers deadlock after rmmod" Revert "smb: client: Fix netns refcount imbalance causing leaks and use-after-free" smb3 client: fix open hardlink on deferred close file error