summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-10-19io_uring: warning about unused-but-set parameterArnd Bergmann
When enabling -Wunused warnings by building with W=1, I get an instance of the -Wunused-but-set-parameter warning in the io_uring code: fs/io_uring.c: In function 'io_queue_async_work': fs/io_uring.c:1445:61: error: parameter 'locked' set but not used [-Werror=unused-but-set-parameter] 1445 | static void io_queue_async_work(struct io_kiocb *req, bool *locked) | ~~~~~~^~~~~~ There are very few warnings of this type, so it would be nice to enable this by default and fix all the existing instances. As the assignment serves no purpose by itself other than to prevent developers from using the variable, an easy workaround is to remove the assignment and just rename the argument to "dont_use". Fixes: f237c30a5610 ("io_uring: batch task work locking") Link: https://lore.kernel.org/lkml/20210920121352.93063-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211019153507.348480-1-arnd@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19erofs: lzma compression supportGao Xiang
Add MicroLZMA support in order to maximize compression ratios for specific scenarios. For example, it's useful for low-end embedded boards and as a secondary algorithm in a file for specific access patterns. MicroLZMA is a new container format for raw LZMA1, which was created by Lasse Collin aiming to minimize old LZMA headers and get rid of unnecessary EOPM (end of payload marker) as well as to enable fixed-sized output compression, especially for 4KiB pclusters. Similar to LZ4, inplace I/O approach is used to minimize runtime memory footprint when dealing with I/O. Overlapped decompression is handled with 1) bounced buffer for data under processing or 2) extra short-lived pages from the on-stack pagepool which will be shared in the same read request (128KiB for example). Link: https://lore.kernel.org/r/20211010213145.17462-8-xiang@kernel.org Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-10-19erofs: rename some generic methods in decompressorGao Xiang
Previously, some LZ4 methods were named with `generic'. However, while evaluating the effective LZMA approach, it seems they aren't quite generic at all (e.g. no need preparing dstpages for most LZMA cases.) Avoid such naming instead. Link: https://lore.kernel.org/r/20211010213145.17462-7-xiang@kernel.org Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-10-19erofs: introduce readmore decompression strategyGao Xiang
Previously, the readahead window was strictly followed by EROFS decompression strategy in order to minimize extra memory footprint. However, it could become inefficient if just reading the partial requested data for much big LZ4 pclusters and the upcoming LZMA implementation. Let's try to request the leading data in a pcluster without triggering memory reclaiming instead for the LZ4 approach first to boost up 100% randread of large big pclusters, and it has no real impact on low memory scenarios. It also introduces a way to expand read lengths in order to decompress the whole pcluster, which is useful for LZMA since the algorithm itself is relatively slow and causes CPU bound, but LZ4 is not. Link: https://lore.kernel.org/r/20211008200839.24541-4-xiang@kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-10-19erofs: introduce the secondary compression headGao Xiang
Previously, for each HEAD lcluster, it can be either HEAD or PLAIN lcluster to indicate whether the whole pcluster is compressed or not. In this patch, a new HEAD2 head type is introduced to specify another compression algorithm other than the primary algorithm for each compressed file, which can be used for upcoming LZMA compression and LZ4 range dictionary compression for various data patterns. It has been stayed in the EROFS roadmap for years. Complete it now! Link: https://lore.kernel.org/r/20211017165721.2442-1-xiang@kernel.org Reviewed-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-10-19NFSD:fix boolreturn.cocci warningChangcheng Deng
./fs/nfsd/nfssvc.c: 1072: 8-9: :WARNING return of 0/1 in function 'nfssvc_decode_voidarg' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-19io_uring: inform block layer of how many requests we are submittingJens Axboe
The block layer can use this knowledge to make smarter decisions on how to handle the request, if it knows that N more may be coming. Switch to using blk_start_plug_nr_ios() to pass in that information. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: simplify io_file_supports_nowait()Pavel Begunkov
Make sure that REQ_F_SUPPORT_NOWAIT is always set io_prep_rw(), and so we can stop caring about setting it down the line simplifying io_file_supports_nowait(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/60c8f1f5e2cb45e00f4897b2cec10c5b3669da91.1634425438.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: combine REQ_F_NOWAIT_{READ,WRITE} flagsPavel Begunkov
Merge REQ_F_NOWAIT_READ and REQ_F_NOWAIT_WRITE into one flag, i.e. REQ_F_SUPPORT_NOWAIT. First it gets rid of dependence on CONFIG_64BIT but also simplifies the code. One thing to consider is when we don't have ->{read,write}_iter and go through loop_rw_iter(). Just fail it with -EAGAIN if we expect nowait behaviour but not sure whether it supports it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f832a20e5186c2e79c6519280c238f559a1d2bbc.1634425438.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: arm poll for non-nowait filesPavel Begunkov
Don't check if we can do nowait before arming apoll, there are several reasons for that. First, we don't care much about files that don't support nowait. Second, it may be useful -- we don't want to be taking away extra workers from io-wq when it can go in some async. Even if it will go through io-wq eventually, it make difference in the numbers of workers actually used. And the last one, it's needed to clean nowait in future commits. [kernel test robot: fix unused-var] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9d06f3cb2c8b686d970269a87986f154edb83043.1634425438.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19fs/io_uring: Prioritise checking faster conditions first in io_writeNoah Goldstein
This commit reorders the conditions in a branch in io_write. The reorder to check 'ret2 == -EAGAIN' first as checking '(req->ctx->flags & IORING_SETUP_IOPOLL)' will likely be more expensive due to 2x memory derefences. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> Link: https://lore.kernel.org/r/20211017013229.4124279-1-goldstein.w.n@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: clean io_prep_rw()Pavel Begunkov
We already store req->file in a variable in io_prep_rw(), just use it instead of a couple of left references to kicob->ki_filp. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2f5889fc7ab670daefd5ccaedd99416d8355f0ad.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise fixed rw rsrc node settingPavel Begunkov
Move fixed rw io_req_set_rsrc_node() from rw prep into io_import_fixed(), if we're using fixed buffers it will always be called during submission as we save the state in advance, Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/68c06f66d5aa9661f1e4b88d08c52d23528297ec.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: return iovec from __io_import_iovecPavel Begunkov
We pass iovec** into __io_import_iovec(), which should keep it, initialise and modify accordingly. It's expensive, return it directly from __io_import_iovec encoding errors with ERR_PTR if needed. io_import_iovec keeps the old interface, but it's inline and so everything is optimised nicely. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6230e9769982f03a8f86fa58df24666088c44d3e.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise io_import_iovec fixed pathPavel Begunkov
Delay loading req->rw.{addr,len} in io_import_iovec until it's really needed, so removing extra loads for the fixed path, which doesn't use them. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3cc48dd0c4f1a37c4ce9aab5784281a2d83ad8be.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: kill io_wq_current_is_worker() in iopollPavel Begunkov
Don't decide about locking based on io_wq_current_is_worker(), it's not consistent with all other code and is expensive, use issue_flags. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7546d5a58efa4360173541c6fe02ee6b8c7b4ea7.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise req->ctx reloadsPavel Begunkov
Don't load req->ctx in advance, it takes an extra register and the field stays valid even after opcode handlers. It also optimises out req->ctx load in io_iopoll_req_issued() once it's inlined. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1e45ff671c44be0eb904f2e448a211734893fa0b.1634314022.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: rearrange io_read()/write()Pavel Begunkov
Combine force_nonblock branches (which is already optimised by compiler), flip branches so the most hot/common path is the first, e.g. as with non on-stack iov setup, and add extra likely/unlikely attributions for errror paths. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2c2536c5896d70994de76e387ea09a0402173a3f.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: clean up io_import_iovecPavel Begunkov
Make io_import_iovec taking struct io_rw_state instead of an iter pointer. First it takes care of initialising iovec pointer, which can be forgotten. Even more, we can not init it if not needed, e.g. in case of IORING_OP_READ_FIXED or IORING_OP_READ. Also hide saving iter_state inside of it by splitting out an inline function of it to avoid extra ifs. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b1bbc213a95e5272d4da5867bb977d9acb6f2109.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise io_import_iovec nonblock passingPavel Begunkov
First, change IO_URING_F_NONBLOCK to take sign bit of the int, so checking for it can be turned into test + sign-based-jump, makes the binary smaller and may be faster. Then, instead of passing need_lock boolean into io_import_iovec() just give it issue_flags, which is already stored somewhere. Saves some space on stack, a couple of test + cmov operations and other conversions. note: we still leave force_nonblock = issue_flags & IO_URING_F_NONBLOCK variable, but it's optimised out by the compiler into testing issue_flags directly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ee96547e692f6c975c229cd82fc721679571a734.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise read/write iov state storingPavel Begunkov
Currently io_read() and io_write() keep separate pointers to an iter and to struct iov_iter_state, which is not great for register spilling and requires more on-stack copies. They are both either on-stack or in req->async_data at the same time, so use struct io_rw_state and keep a pointer only to it, so having all the state with just one pointer. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5c5e7ffd7dc25fc35075c70411ba99df72f237fa.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: encapsulate rw statePavel Begunkov
Add a new struct io_rw_state storing all iov related bits: fast iov, iterator and iterator state. Not much changes here, simply convert struct io_async_rw to use it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e8245ffcb568b228a009ec1eb79c993c813679f1.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise rw comletion handlersPavel Begunkov
Don't override req->result in io_complete_rw_iopoll() when it's already of the same value, we have an if just above it, so move the assignment there. Also, add one simle unlikely() in __io_complete_rw_common(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8dfeb4f84026a20172bcf82c05010abe955874ae.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: prioritise read success path over failsPavel Begunkov
Rearrange io_read return handling so first we expect it completing successfully and only then checking for errors, which is a colder path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c91c7c2da11815ec8b04b5d872f60dc4cde662c5.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: consistent typing for issue_flagsPavel Begunkov
Some of the functions keep issue_flags as int, change those to unsigned. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/04ad43797783bc9cc7567f287ab545518f8e8cf2.1634144845.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise rsrc referencingPavel Begunkov
Apparently, percpu_ref_put/get() are expensive enough if done per request, get them in a batch and cache on the submission side to avoid getting it over and over again. Also, if we're completing under uring_lock, return refs back into the cache instead of perfcpu_ref_put(). Pretty similar to how we do tctx->cached_refs accounting, but fall back to normal putting when we already changed a rsrc node by the time of free. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b40d8c5bc77d3c9550df8a319117a374ac85f8f4.1633817310.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise io_req_set_rsrc_node()Pavel Begunkov
io_req_set_rsrc_node() reloads loads req->ctx, however it's already in registers in all use cases, so better to pass it as a parameter. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/67a25557b8a51e90bfd578447a6f1671911b05ae.1633817310.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: fix io_free_batch_list racesPavel Begunkov
[ 158.514382] WARNING: CPU: 5 PID: 15251 at fs/io_uring.c:1141 io_free_batch_list+0x269/0x360 [ 158.514426] RIP: 0010:io_free_batch_list+0x269/0x360 [ 158.514437] Call Trace: [ 158.514440] __io_submit_flush_completions+0xde/0x180 [ 158.514444] tctx_task_work+0x14a/0x220 [ 158.514447] task_work_run+0x64/0xa0 [ 158.514448] __do_sys_io_uring_enter+0x7c/0x970 [ 158.514450] __x64_sys_io_uring_enter+0x22/0x30 [ 158.514451] do_syscall_64+0x43/0x90 [ 158.514453] entry_SYSCALL_64_after_hwframe+0x44/0xae We should not touch request internals including req->comp_list.next after putting our ref if it's not final, e.g. we can start freeing requests from the free cache. Fixed: 62ca9cb93e7f8 ("io_uring: optimise io_free_batch_list()") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b1f4df38fbb8f111f52911a02fd418d0283a4e6f.1634047298.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: remove extra io_ring_exit_work wake upPavel Begunkov
task_work_add() takes care of waking up the thread, remove useless wake_up_process(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/de9a71ee255112dcaed3b5d426be24934e74722c.1633532552.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise out req->opcode reloadingPavel Begunkov
Looking at the assembly, the compiler decided to reload req->opcode in io_op_defs[opcode].needs_file instead of one it had in a register, so store it in a temp variable so it can be optimised out. Also move the personality block later, it's better for spilling/etc. as it only depends on @sqe, which we're keeping anyway. By the way, zero req->opcode if it over IORING_OP_LAST, not a problem, at the moment but is safer. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6ba869f5f8b7b0f991c87fdf089f0abf87cbe06b.1633532552.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: reshuffle io_submit_state bitsPavel Begunkov
struct io_submit_state's ->free_list and ->link are hotter and smaller than ->plug, place them first. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6ad3c15849f50b27ad012c042c73e6e069d22df7.1633532552.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: safer fallback_work freePavel Begunkov
Add extra wq flushing for fallback_work, that's not necessary but safer if invariants of io_fallback_req_func() change. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/24179419d6748516299600bc914f50b9e0b02275.1633532552.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise pluggingPavel Begunkov
Plugging is only needed with requests that also need a file, so hide plugging under a ->needs_file check. Also, place ->needs_file and ->plug bits into the same byte of io_op_defs, it may matter for compilers, e.g. only with the change a tested one decided to optimise two memory testb into a more with two register testb. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1600d1287bb7d16451d4ef3343252787a5314927.1633532552.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: correct fill events helpers typesPavel Begunkov
CQE result is a 32-bit integer, so the functions generating CQEs are better to accept not long but ints. Convert io_cqring_fill_event() and other helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7ca6f15255e9117eae28adcac272744cae29b113.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: inline io_poll_completePavel Begunkov
Inline io_poll_complete(), it's simple and doesn't have any particular purpose. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/933d7ee3e4450749a2d892235462c8f18d030293.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: inline io_req_needs_clean()Pavel Begunkov
There is only a single user of io_req_needs_clean() inline it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6111d0221ef4b439cad401e135dd6a5f990a0501.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: remove struct io_completionPavel Begunkov
We keep struct io_completion only as a temporal storage of cflags, Place it in io_kiocb, it's cleaner, removes extra bits and even might be used for future optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5299bd5c223204065464bd87a515d0e405316086.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: control ->async_data with a REQ_F flagPavel Begunkov
->async_data is a slow path, so it won't matter much if we do the clean up inside io_clean_op(). Moreover, in many cases it's allocated together with setting one or more of IO_REQ_CLEAN_FLAGS flags, so it'd go through io_clean_op() anyway. Control ->async_data allocation with a new flag REQ_F_ASYNC_DATA, so we can do all the maintainence under io_req_needs_clean() fast check. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6892cf5883c459f36bda26f30ceb16742b20b84b.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise io_free_batch_list()Pavel Begunkov
Delay reading the next node in io_free_batch_list(), allows the compiler to load the value a bit later improving register spilling in some cases. With gcc 11.1 it helped to move @task_refs variable from the stack to a register and optimises out a couple of per request instructions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cc9fdfb6f72a4e8bc9918a5e9f2d97869a263ae4.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: mark cold functionsPavel Begunkov
Attribute cold functions so compilers can optimise them for size. It shrinks the binary by 2.5-3% text data bss dec hex filename 90670 14002 8 104680 198e8 ./fs/io_uring.o 88053 14002 8 102063 18eaf ./fs/io_uring.o Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b53d385f91dca45170b67d7f11c7abd787e821f6.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise ctx referencing by requestsPavel Begunkov
Currenlty, we allocate one ctx reference per request at submission time and put them at free. It's batched and not so expensive but it still bloats the kernel, adds 2 function calls for rcu and adds some overhead for request counting in io_free_batch_list(). Always keep one reference with a request, even when it's freed and in io_uring request caches. There is extra work at ring exit / quiesce paths, which now need to put all cached requests. io_ring_exit_work() is already looping, so it's not a problem. Add hybrid-busy waiting to io_ctx_quiesce() as well for now. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/99613fbe396e80777228cde39bbda1aa8938554e.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: merge CQ and poll waitqueuesPavel Begunkov
->cq_wait and ->poll_wait and waken up in the same manner, use a single waitqueue for both of them. CQ waiters are queued exclusively, so wake up should first go over all pollers and that's what we need. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/00fe603e50000365774cf8435ef5fe03f049c1c9.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: don't wake sqpoll in io_cqring_ev_postedPavel Begunkov
io_cqring_ev_posted() doesn't need to wake SQPOLL, it's either done by userspace or with task_work, but no action is required on request completion. Rip off bits waking it up in io_cqring_ev_posted(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b49dab27b64cf11f4c50f2f90dcaac123430e05d.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise INIT_WQ_LISTPavel Begunkov
The invariant of io_wq_work_list is that it's empty IFF ->first is NULL, so no need to initially set ->last. With now having more users of the list it may play a role, i.e. used in each tw iteration and on every completion flushing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c464ab5cab6e46a858c6d39c107e92b3b5291f13.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise request allocationPavel Begunkov
Even after fully inlining io_alloc_req() my compiler does a NULL check in the path of successful allocation, no hacks like an empty dereference help it. Restructure io_alloc_req() by splitting out refilling part, so the compiler generate a slightly better binary. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/eda17571bdc7248d8e617b23e7132a5416e4680b.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: delay req queueing into compl-batch listPavel Begunkov
io_req_complete_state() is inlined and used in lots of places, so we want to keep it concise. Move adding a request into a completion batch list from io_req_complete_state() into the consumer, i.e. __io_queue_sqe(). before vs after text data bss dec hex filename 91894 14002 8 105904 19db0 ./fs/io_uring.o 91046 14002 8 105056 19a60 ./fs/io_uring.o Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4afca4e11abfd4cc8e99777fdcaf4d34cf4d022d.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: add more likely/unlikely() annotationsPavel Begunkov
Add two extra unlikely() in io_submit_sqes() and one around io_req_needs_clean() to help the compiler to avoid extra jumps in hot paths. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/88e087afe657e7660194353aada9b00f11d480f9.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: optimise kiocb layoutPavel Begunkov
We want ->comp_list in the second cacheline, which is hotter comparing to the 3rd. Swap the field with ->link, which is not as hot and controlled by flags and so not accessed unless there is a link. By the way add a couple of comments for io_kiocb fields. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9d9dde31f8f62279a5f48c575bbc27b8290edc0c.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: add flag to not fail link after timeoutPavel Begunkov
For some reason non-off IORING_OP_TIMEOUT always fails links, it's pretty inconvenient and unnecessary limits chaining after it to hard linking, which is far from ideal, e.g. doesn't pair well with timeout cancellation. Add a flag forcing it to not fail links on -ETIME. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/17c7ec0fb7a6113cc6be8cdaedcada0ba836ac0e.1633199723.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19io_uring: clean up buffer selectPavel Begunkov
Hiding a pointer to a struct io_buffer in rw.addr is error prone. We have some place in io_kiocb, so keep kbuf's in a separate field without aliasing and risks of it being misused. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3e63a6a953b04cad81d9ea827b12344dd57b37b4.1633107393.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>