path: root/include
Age Commit message Author
2020-01-22 Merge tag 'icc-5.6-rc1' of https://git.linaro.org/people/georgi.djakov/linux into char-misc-next Greg Kroah-Hartman
Georgi writes: interconnect patches for 5.6
Here are the interconnect patches for the 5.6-rc1 merge window.
- New core helper functions for some common functionalities in drivers.
- Improvements in the information exposed via debugfs.
- Basic tracepoints support.
- New interconnect driver for msm8916 platforms.
- Misc fixes.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
* tag 'icc-5.6-rc1' of https://git.linaro.org/people/georgi.djakov/linux:
  interconnect: qcom: Add MSM8916 interconnect provider driver
  dt-bindings: interconnect: Add Qualcomm MSM8916 DT bindings
  interconnect: Check for valid path in icc_set_bw()
  interconnect: Print the tag in the debugfs summary
  interconnect: Add interconnect_graph file to debugfs
  interconnect: qcom: Use the standard aggregate function
  interconnect: Add a common standard aggregate function
  interconnect: Add basic tracepoints
  interconnect: Add a name to struct icc_path
  interconnect: Move internal structs into a separate file
  interconnect: qcom: Use the new common helper for node removal
  interconnect: Add a common helper for removing all nodes
2020-01-22crypto: atmel-{aes,sha,tdes} - Retire crypto_platform_dataTudor Ambarus
These drivers no longer need it as they are only probed via DT. crypto_platform_data was allocated but unused, so remove it. This is a follow up for: commit 45a536e3a7e0 ("crypto: atmel-tdes - Retire dma_request_slave_channel_compat()") commit db28512f48e2 ("crypto: atmel-sha - Retire dma_request_slave_channel_compat()") commit 62f72cbdcf02 ("crypto: atmel-aes - Retire dma_request_slave_channel_compat()") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-22Merge 5.5-rc7 into char-misc-nextGreg Kroah-Hartman
We need the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22Merge 5.5-rc7 into staging-nextGreg Kroah-Hartman
We want the staging fixes in here as well Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22xsk, net: Make sock_def_readable() have external linkageBjörn Töpel
XDP sockets use the default implementation of struct sock's sk_data_ready callback, which is sock_def_readable(). This function is called in the XDP socket fast-path, and involves a retpoline. By letting sock_def_readable() have external linkage, and being called directly, the retpoline can be avoided. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200120092917.13949-1-bjorn.topel@gmail.com
2020-01-21Merge 5.5-rc7 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-21Merge branch 'regmap-5.6' into regmap-nextMark Brown
2020-01-21ASoC: dapm: add snd_soc_dapm_put_enum_double_lockedTzung-Bi Shih
Adds snd_soc_dapm_put_enum_double_locked() for use cases where dapm_mutex is already locked. Signed-off-by: Tzung-Bi Shih <tzungbi@google.com> Link: https://lore.kernel.org/r/20200117073814.82441-3-tzungbi@google.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-21ASoC: soc-core: remove bus_controlKuninori Morimoto
snd_soc_dai_driver::bus_control currently selects how a DAI is resumed, but no driver that sets bus_control has DAI driver suspend/resume support. This patch removes the pointless bus_control from ALSA SoC. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87pnffx7i4.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-21ASoC: soc-core: remove DAI suspend/resumeKuninori Morimoto
Historically, CPU and Codec were implemented differently, but they are now merged as Component. ALSA SoC supports suspend/resume at both the DAI and the Component level, in this order:
1) Suspend/Resume all CPU DAIs if bus-control was 0
2) Suspend/Resume all Components
3) Suspend/Resume all CPU DAIs if bus-control was 1
Historically, 2) was a Codec-specific operation. Because CPU and Codec were merged into Component, CPU suspend/resume has three chances to run (1, 2 and 3), while Codec suspend/resume has one (2). The DAI-side suspend/resume honors bus-control, but no driver that supports suspend/resume sets bus-control, so 3) was never used. The parameters used for suspend/resume, component->dev and dai->dev, are the same pointer, so DAI and Component suspend/resume can be merged. Note that 2) is the one to keep, because it takes care of the BIAS level. This patch removes 1) and 3).
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87r1zvx7i8.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-21Merge tag 'rds-odp-for-5.5' into rdma.git for-nextJason Gunthorpe
From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma
Leon Romanovsky says:
====================
Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created; such MRs were not possible to create prior to this series.
The first part of this patchset extends RDMA to have a special verb, ib_reg_user_mr(). The common use case for this function is a userspace application that allocates memory for HCA access, while the responsibility to register the memory at the HCA lies with a kernel ULP. This ULP acts as an agent for the userspace application.
The second part provides advise MR functionality for ULPs. This is an integral part of ODP flows and is used to trigger page faults in advance, to prepare memory before running the working set.
The third part is the actual user of those in-kernel APIs.
====================
* tag 'rds-odp-for-5.5':
  net/rds: Use prefetch for On-Demand-Paging MR
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Detect need of On-Demand-Paging memory registration
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
  RDMA/mlx5: Don't fake udata for kernel path
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/core: Add interface to advise_mr for kernel users
  IB/core: Introduce ib_reg_user_mr
  IB: Allow calls to ib_umem_get from kernel ULPs
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21kvm: Refactor handling of VM debugfs filesMilan Pandurov
We can store reference to kvm_stats_debugfs_item instead of copying its values to kvm_stat_data. This allows us to remove duplicated code and usage of temporary kvm_stat_data inside vm_stat_get et al. Signed-off-by: Milan Pandurov <milanpa@amazon.de> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-21sparc/console: kill off obsolete declarationsArvind Sankar
commit 09d3f3f0e02c ("sparc: Kill PROM console driver.") missed removing the declarations of the deleted prom_con structure and prom_con_init function from console.h. Kill them off now. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next David S. Miller
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2020-01-21
1) Add support for TCP encapsulation of IKE and ESP messages, as defined by RFC 8229. Patchset from Sabrina Dubroca.
Please note that there is a merge conflict in: net/unix/af_unix.c
between commit: 3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo") from the net-next tree
and commit: b50b0580d27b ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram") from the ipsec-next tree.
The conflict can be solved as done in linux-next.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21wan/hdlc_x25: make lapb params configurableMartin Schiller
This enables you to configure mode (DTE/DCE), Modulo, Window, T1, T2, N2 via sethdlc (which needs to be patched as well). Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: phy: add new version of phy_do_ioctlHeiner Kallweit
Add a new version of phy_do_ioctl that doesn't check whether net_device is running. It will typically be used if suitable drivers attach the PHY in probe already. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21net: phy: rename phy_do_ioctl to phy_do_ioctl_runningHeiner Kallweit
We just added phy_do_ioctl, but it turned out that we need another version of this function that doesn't check whether net_device is running. So rename phy_do_ioctl to phy_do_ioctl_running. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
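The split between the two helpers can be pictured with a small user-space model. This is only a sketch: the struct names, the `running` field, and the choice of `-ENODEV` as the error code are assumptions made for illustration, not the phylib definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PHY and the net_device. */
struct phy_sketch {
	int last_cmd;            /* records the last ioctl command handled */
};

struct netdev_sketch {
	bool running;            /* models "the interface is up" */
	struct phy_sketch *phy;  /* NULL until a PHY is attached */
};

/* Variant without the running check: only needs an attached PHY. */
static int do_ioctl(struct netdev_sketch *dev, int cmd)
{
	if (!dev->phy)
		return -ENODEV;
	dev->phy->last_cmd = cmd;
	return 0;
}

/* Variant with the running check: refuses to act on a stopped device. */
static int do_ioctl_running(struct netdev_sketch *dev, int cmd)
{
	if (!dev->running)
		return -ENODEV;
	return do_ioctl(dev, cmd);
}
```

In this model a driver that attaches its PHY at probe time would pick `do_ioctl()`, since the PHY is usable before the interface is brought up.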
2020-01-21dmaengine: Move dma_get_{,any_}slave_channel() to private dmaengine.hGeert Uytterhoeven
The functions dma_get_slave_channel() and dma_get_any_slave_channel() are called from DMA engine drivers only. Hence move their declarations from the public header file <linux/dmaengine.h> to the private header file drivers/dma/dmaengine.h. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200121093311.28639-4-geert+renesas@glider.be Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21dmaengine: Remove dma_request_slave_channel_compat() wrapperGeert Uytterhoeven
At its original introduction, dma_request_slave_channel_compat() used a wrapper, to accommodate filter functions that modify the mask passed. Filter functions can no longer modify masks, and the mask parameter was made const in commit a53e28da574a40bc ("dma: Make the 'mask' parameter of __dma_request_channel const") consecutively. Hence remove the wrapper, and rename __dma_request_slave_channel_compat() to dma_request_slave_channel_compat(), to get rid of one more function name starting with a double underscore. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200121093311.28639-3-geert+renesas@glider.be Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21 Merge tag 'rds-odp-for-5.5' of https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma David S. Miller
Leon Romanovsky says:
====================
Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created; such MRs were not possible to create prior to this series.
The first part of this patchset extends RDMA to have a special verb, ib_reg_user_mr(). The common use case for this function is a userspace application that allocates memory for HCA access, while the responsibility to register the memory at the HCA lies with a kernel ULP. This ULP acts as an agent for the userspace application.
The second part provides advise MR functionality for ULPs. This is an integral part of ODP flows and is used to trigger page faults in advance, to prepare memory before running the working set.
The third part is the actual user of those in-kernel APIs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21ALSA: pcm: Set per-card upper limit of PCM buffer allocationsTakashi Iwai
Currently, the available buffer allocation size for a PCM stream depends on the preallocated size; when a buffer has been preallocated, the max buffer size is set to that size, so that applications won't re-allocate too much memory. OTOH, when no preallocation is done, each substream may allocate buffers of arbitrary size as long as snd_pcm_hardware.buffer_bytes_max allows -- which can be quite high; HD-audio sets 1GB there. It means that the system may consume a large number of pages for PCM buffers, and they are pinned and never swapped out. This can easily lead to OOM. To avoid such a situation, this patch adds an upper limit per card. Each snd_pcm_lib_malloc_pages() and _free_pages() call is tracked, and an allocation returns an error if the total amount of buffers goes over the defined upper limit. The default value is set to 32MB, which should be large enough for usual operations. If larger buffers are needed for any specific usage, it can be adjusted (also dynamically) via the snd_pcm.max_alloc_per_card option. Setting zero there means no check is performed, and again, an unlimited amount of buffers is allowed. Link: https://lore.kernel.org/r/20200120124423.11862-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
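The accounting described above can be sketched as a small user-space model. The struct, the function names, and the use of `-ENOMEM` as the error code are assumptions for illustration; the ALSA implementation may differ in all of them.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-card accounting state. */
struct card_alloc {
	size_t total_bytes;  /* bytes currently allocated for PCM buffers */
	size_t max_bytes;    /* per-card upper limit; 0 disables the check */
};

/* Account for a new buffer; refuse it if the card would exceed its limit. */
static int card_account_alloc(struct card_alloc *card, size_t bytes)
{
	/* Written as a subtraction so the comparison cannot overflow. */
	if (card->max_bytes &&
	    (bytes > card->max_bytes ||
	     card->total_bytes > card->max_bytes - bytes))
		return -ENOMEM;
	card->total_bytes += bytes;
	return 0;
}

/* Return a buffer's bytes to the card's budget. */
static void card_account_free(struct card_alloc *card, size_t bytes)
{
	card->total_bytes -= bytes;
}
```

With a 32MB limit, two 16MB buffers fit and any further allocation is refused until one of them is freed; with a limit of zero every request is accepted.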
2020-01-21dmaengine: ti: k3-udma: Add glue layer for non DMAengine usersGrygorii Strashko
Certain users cannot use the DMAengine API right now due to missing features in the core. The prime example is networking. These users can use the glue layer interface to avoid misuse of the DMAengine API, and when the core gains the needed features they can be converted to use the generic API.
The most prominent features the glue layer clients depend on:
- most PSI-L native peripherals use extra rflow ranges on a receive channel and, depending on the peripheral's configuration, packets from a single free descriptor ring are going to be received to different receive rings
- it is also possible to have different free descriptor rings per rflow, and an rflow can also support 4 additional free descriptor rings based on the size of the incoming packet
- out of order completion of descriptors on a channel
  - when we have several queues to handle different priority packets, the descriptors will be completed 'out-of-order'
- the notion of prep_slave_sg does not match what the streaming type of operation demands for networking
- Streaming type of operation
  - ability to fill the free descriptor ring with descriptors in anticipation of incoming traffic; when a packet arrives, UDMAP forms a packet and gives it to the client driver
  - the descriptors are not backed with exact-size data buffers, as we don't know the size of the packet we will receive, but with a generic pool of buffers to be used by the receive channel
- NAPI type of operation (polling instead of interrupt driven transfer)
  - without this we cannot sustain gigabit speeds, and we need to support NAPI
  - not to limit this to networking, but other high performance operations
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-12-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21dmaengine: ti: k3 PSI-L remote endpoint configurationPeter Ujfalusi
In the K3 architecture the DMA operates within threads. One end of the thread is UDMAP, the other is on the peripheral side. The UDMAP channel configuration depends on the needs of the remote endpoint and it can differ from peripheral to peripheral. This patch adds a database for am654 and j721e and a small API to fetch the PSI-L endpoint configuration from the database, which should only be used by the DMA driver(s). Another API is added for native peripherals so they can pass a new configuration for the threads they are using, which is needed to handle changes caused by, for example, different firmware being loaded for the peripheral. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-9-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21dmaengine: ti: Add cppi5 header for K3 NAVSS/UDMAPeter Ujfalusi
The K3 DMA architecture uses CPPI5 (Communications Port Programming Interface) specified descriptors over PSI-L bus within NAVSS. The header provides helpers, macros to work with these descriptors in a consistent way. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-8-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21dmaengine: Add helper function to convert direction value to textPeter Ujfalusi
dmaengine_get_direction_text() can be useful when the direction is printed out. The text is easier to comprehend than the number. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-7-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
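A helper of this kind is typically a plain switch over the enum. The sketch below uses a local enum and made-up names (`xfer_dir`, `xfer_dir_text`, `log_direction`) rather than the kernel's types, and the returned strings are assumptions, so treat it as an illustration of the idea and not as the dmaengine code.

```c
#include <stdio.h>

/* Local stand-in for a DMA transfer direction enum. */
enum xfer_dir {
	XFER_MEM_TO_MEM,
	XFER_MEM_TO_DEV,
	XFER_DEV_TO_MEM,
	XFER_DEV_TO_DEV,
};

/* Map a direction value to a human-readable name. */
static const char *xfer_dir_text(enum xfer_dir dir)
{
	switch (dir) {
	case XFER_MEM_TO_MEM:
		return "MEM_TO_MEM";
	case XFER_MEM_TO_DEV:
		return "MEM_TO_DEV";
	case XFER_DEV_TO_MEM:
		return "DEV_TO_MEM";
	case XFER_DEV_TO_DEV:
		return "DEV_TO_DEV";
	default:
		return "invalid";
	}
}

/* Example of the intended use: log the name next to the raw number. */
static void log_direction(enum xfer_dir dir)
{
	printf("direction: %s (%d)\n", xfer_dir_text(dir), (int)dir);
}
```

Returning a fixed string for unknown values keeps the helper safe to call from debug prints even when the direction field holds garbage.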
2020-01-21dmaengine: Add support for reporting DMA cached data amountPeter Ujfalusi
DMA hardware can have a big cache or FIFO, and the amount of data sitting in the DMA fabric can be of interest to clients. For example, in audio we want to know the delay in the data flow, and if the DMA has a significantly large FIFO/cache, it can affect the latency/delay. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Tero Kristo <t-kristo@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-6-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21dmaengine: Add metadata_ops for dma_async_tx_descriptorPeter Ujfalusi
The metadata is best described as side band data or parameters traveling alongside the data DMAd by the DMA engine. It is data which is understood by the peripheral and the peripheral driver only; the DMA engine sees it only as a data block and does not interpret it in any way. The metadata can be different per descriptor, as it is a parameter for the data being transferred. If the DMA supports per-descriptor metadata, it can implement the attach and get_ptr/set_len callbacks. Client drivers must use only one of attach or get_ptr/set_len, to avoid misconfiguration. A client driver can check at probe time whether a given metadata mode is supported by the channel with
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_CLIENT);
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_ENGINE);
and based on this information can use either mode. Wrappers are also added for the metadata_ops.
To be used in DESC_METADATA_CLIENT mode: dmaengine_desc_attach_metadata()
To be used in DESC_METADATA_ENGINE mode: dmaengine_desc_get_metadata_ptr() dmaengine_desc_set_metadata_len()
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Tero Kristo <t-kristo@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Link: https://lore.kernel.org/r/20191223110458.30766-5-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21Merge TI ringacc driver from SantoshVinod Koul
This is for dependency of new TI ringacc dmaengine drivers Merge tag 'drivers_soc_for_5.6' into topic/ti SOC: TI Keystone Ring Accelerator driver The Ring Accelerator (RINGACC or RA) provides hardware acceleration to enable straightforward passing of work between a producer and a consumer. There is one RINGACC module per NAVSS on TI AM65x SoCs. Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-01-21drm/exynos: Rename Exynos to lowercaseKrzysztof Kozlowski
Fix up inconsistent usage of upper and lowercase letters in "Exynos" name. "EXYNOS" is not an abbreviation but a regular trademarked name. Therefore it should be written with lowercase letters starting with capital letter. The lowercase "Exynos" name is promoted by its manufacturer Samsung Electronics Co., Ltd., in advertisement materials and on website. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2020-01-20io_uring: optimise sqe-to-req flags translationPavel Begunkov
For each IOSQE_* flag there is a corresponding REQ_F_* flag. And there is a repetitive pattern of their translation: e.g. if (sqe->flags & SQE_FLAG*) req->flags |= REQ_F_FLAG* Use same numeric values/bits for them and copy instead of manual handling. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
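The pattern can be shown with a toy example. The flag names below are hypothetical stand-ins (the real IOSQE_* and REQ_F_* sets are larger); the point is only that once both sides share bit positions, a mask-and-copy replaces one branch per flag.

```c
#include <stdint.h>

/* Hypothetical submission-side flags (stand-ins for IOSQE_*). */
#define SQE_FLAG_A (1u << 0)
#define SQE_FLAG_B (1u << 1)
#define SQE_FLAG_C (1u << 2)

/* Request-side flags deliberately reuse the same bit positions. */
#define REQ_FLAG_A SQE_FLAG_A
#define REQ_FLAG_B SQE_FLAG_B
#define REQ_FLAG_C SQE_FLAG_C

#define SQE_VALID_FLAGS (SQE_FLAG_A | SQE_FLAG_B | SQE_FLAG_C)

/* Before: one test-and-set per flag. */
static uint32_t translate_branchy(uint8_t sqe_flags)
{
	uint32_t req_flags = 0;

	if (sqe_flags & SQE_FLAG_A)
		req_flags |= REQ_FLAG_A;
	if (sqe_flags & SQE_FLAG_B)
		req_flags |= REQ_FLAG_B;
	if (sqe_flags & SQE_FLAG_C)
		req_flags |= REQ_FLAG_C;
	return req_flags;
}

/* After: identical bit values allow a single masked copy. */
static uint32_t translate_copy(uint8_t sqe_flags)
{
	return sqe_flags & SQE_VALID_FLAGS;
}
```

Both functions return the same result for every input byte; the second one simply has no branches to maintain when a new flag is added.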
2020-01-20io_uring: add support for probing opcodesJens Axboe
The application currently has no way of knowing if a given opcode is supported or not without having to try and issue one and see if we get -EINVAL or not. And even this approach is fraught with peril, as maybe we're getting -EINVAL due to some fields being missing, or maybe it's just not that easy to issue that particular command without doing some other leg work in terms of setup first. This adds IORING_REGISTER_PROBE, which fills in a structure with info on what is supported or not. This will work even with sparse opcode fields, which may happen in the future or even today if someone backports specific features to older kernels. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add opcode to issue trace eventJens Axboe
For some test apps at least, user_data is just zeroes. So it's not a good way to tell what the command actually is. Add the opcode to the issue trace point. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_OPENAT2Jens Axboe
Add support for the new openat2(2) system call. It's trivial to do, as we can have openat(2) just be wrapped around it. Suggested-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: enable option to only trigger eventfd for async completionsJens Axboe
If an application is using eventfd notifications with poll to know when new SQEs can be issued, it's expecting the following read/writes to complete inline. And with that, it knows that there are events available, and doesn't want spurious wakeups on the eventfd for those requests. This adds IORING_REGISTER_EVENTFD_ASYNC, which works just like IORING_REGISTER_EVENTFD, except it only triggers notifications for events that happen from async completions (IRQ, or io-wq worker completions). Any completions inline from the submission itself will not trigger notifications. Suggested-by: Mark Papadakis <markuspapadakis@icloud.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
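The notification rule can be written as a small predicate. This is a user-space model under assumed names (`ring_sketch`, `should_signal_eventfd`); how the kernel decides that a completion is asynchronous is reduced here to a boolean argument.

```c
#include <stdbool.h>

/* Hypothetical per-ring state relevant to eventfd notification. */
struct ring_sketch {
	bool has_eventfd;    /* an eventfd has been registered */
	bool eventfd_async;  /* registered with the async-only variant */
};

/*
 * Decide whether a completion should signal the eventfd.
 * completed_async is true for completions posted from IRQ or worker
 * context, false for completions posted inline during submission.
 */
static bool should_signal_eventfd(const struct ring_sketch *ring,
				  bool completed_async)
{
	if (!ring->has_eventfd)
		return false;
	if (!ring->eventfd_async)
		return true;
	return completed_async;
}
```

With the async-only registration, an inline completion leaves the eventfd untouched, so a poll loop that already knows about those events is not woken a second time.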
2020-01-20io_uring: add support for send(2) and recv(2)Jens Axboe
This adds IORING_OP_SEND for send(2) support, and IORING_OP_RECV for recv(2) support. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_SETUP_CLAMPJens Axboe
Some applications like to start small in terms of ring size, and then ramp up as needed. This is a bit tricky to do currently, since we don't advertise the max ring size. This adds IORING_SETUP_CLAMP. If set, and the values for SQ or CQ ring size exceed what we support, then clamp them at the max values instead of returning -EINVAL. Since we return the chosen ring sizes after setup, no further changes are needed on the application side. io_uring already changes the ring sizes if the application doesn't ask for power-of-two sizes, for example. Signed-off-by: Jens Axboe <axboe@kernel.dk>
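The clamp-or-reject decision, together with the power-of-two rounding mentioned above, can be sketched as follows. `MAX_ENTRIES`, the function names, and the boolean `clamp` parameter are assumptions for this example and do not reproduce the io_uring setup code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENTRIES 32768u  /* assumed maximum ring size for this sketch */

/* Round n up to the next power of two (n must be >= 1 and <= MAX_ENTRIES). */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Pick the ring size for a request. Without clamp, an oversized request
 * fails with -EINVAL; with clamp, it is reduced to the maximum instead.
 * The chosen size is reported back through *chosen.
 */
static int pick_ring_size(uint32_t requested, bool clamp, uint32_t *chosen)
{
	if (requested == 0)
		return -EINVAL;
	if (requested > MAX_ENTRIES) {
		if (!clamp)
			return -EINVAL;
		requested = MAX_ENTRIES;
	}
	*chosen = roundup_pow2(requested);
	return 0;
}
```

Because the chosen size is handed back to the caller, an application that asks for too much learns the effective size from the result rather than from an error.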
2020-01-20pcpu_ref: add percpu_ref_tryget_many()Pavel Begunkov
Add percpu_ref_tryget_many(), which works the same way as percpu_ref_tryget(), but grabs specified number of refs. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
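The semantics of "try to grab several references at once" can be modelled with C11 atomics. This sketch uses a single shared counter and hypothetical names; it does not model the per-CPU counters that give percpu_ref its speed, only the rule that a reference count which has already dropped to zero cannot be revived.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified reference counter: one shared atomic, no per-CPU state. */
struct ref_sketch {
	atomic_long count;
};

/* Take nr references unless the count has already reached zero. */
static bool ref_tryget_many(struct ref_sketch *r, long nr)
{
	long old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + nr))
			return true;
	}
	return false;
}

/* Drop nr references; returns true if this dropped the count to zero. */
static bool ref_put_many(struct ref_sketch *r, long nr)
{
	return atomic_fetch_sub(&r->count, nr) == nr;
}
```

Taking a batch in one call is useful when a caller is about to start several operations that each need their own reference, since it replaces several atomic updates with one.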
2020-01-20io_uring: add IORING_OP_MADVISEJens Axboe
This adds support for doing madvise(2) through io_uring. We assume that any operation can block, and hence punt everything async. This could be improved, but hard to make bullet proof. The async punt ensures it's safe. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20mm: make do_madvise() available internallyJens Axboe
This is in preparation for enabling this functionality through io_uring. Add a helper that is just exporting what sys_madvise() does, and have the system call use it. No functional changes in this patch. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add IORING_OP_FADVISEJens Axboe
This adds support for doing fadvise through io_uring. We assume that WILLNEED doesn't block, but that DONTNEED may block. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: allow use of offset == -1 to mean file positionJens Axboe
This behaves like preadv2/pwritev2 with offset == -1: it'll use (and update) the current file position. This obviously comes with the caveat that if the application has multiple read/writes in flight, then the end result will not be as expected. This is similar to threads sharing a file descriptor and doing IO using the current file position. Since this feature isn't easily detectable by doing a read or write, add a feature flag, IORING_FEAT_RW_CUR_POS, to allow applications to detect presence of this feature. Reported-by: 李通洲 <carter.li@eoitek.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add non-vectored read/write commandsJens Axboe
For use cases that don't already naturally have an iovec, it's easier (or more convenient) to just use a buffer address + length. This is particularly true if the use case is from languages that want to create a memory safe abstraction on top of io_uring, and where introducing the need for the iovec may impose an ownership issue. For those cases, they currently need an indirection buffer, which means allocating data just for this purpose. Add basic read/write that don't require the iovec. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add IOSQE_ASYNCJens Axboe
io_uring defaults to always doing inline submissions, if at all possible. But for larger copies, even if the data is fully cached, that can take a long time. Add an IOSQE_ASYNC flag that the application can set on the SQE - if set, it'll ensure that we always go async for those kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we get the concurrency we desire for this case. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_STATXJens Axboe
This provides support for async statx(2) through io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: avoid ring quiesce for fixed file set unregister and updateJens Axboe
We currently fully quiesce the ring before an unregister or update of the fixed fileset. This is very expensive, and we can be a bit smarter about this. Add a percpu refcount for the file tables as a whole. Grab a percpu ref when we use a registered file, and put it on completion. This is cheap to do. Upon removal of a file from a set, switch the ref count to atomic mode. When we hit zero ref on the completion side, then we know we can drop the previously registered files. When the old files have been dropped, switch the ref back to percpu mode for normal operation. Since there's a period between doing the update and the kernel being done with it, add a IORING_OP_FILES_UPDATE opcode that can perform the same action. The application knows the update has completed when it gets the CQE for it. Between doing the update and receiving this completion, the application must continue to use the unregistered fd if submitting IO on this particular file. This takes the runtime of test/file-register from liburing from 14s to about 0.7s. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_CLOSEJens Axboe
This works just like close(2), unsurprisingly. We remove the file descriptor and post the completion inline, then offload the actual (potential) last file put to async context. Mark the async part of this work as uncancellable, as we really must guarantee that the latter part of the close is run. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for IORING_OP_OPENATJens Axboe
This works just like openat(2), except it can be performed async. For the normal case of a non-blocking path lookup this will complete inline. If we have to do IO to perform the open, it'll be done from async context. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20io_uring: add support for fallocate()Jens Axboe
This exposes fallocate(2) through io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20Merge branch 'io_uring-5.5' into for-5.6/io_uring-vfsJens Axboe
Pull in compatibility fix for the files_update command. * io_uring-5.5: io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
2020-01-20io_uring: fix compat for IORING_REGISTER_FILES_UPDATEEugene Syromiatnikov
fds field of struct io_uring_files_update is problematic with regards to compat user space, as pointer size is different in 32-bit, 32-on-64-bit, and 64-bit user space. In order to avoid custom handling of compat in the syscall implementation, make fds __u64 and use u64_to_user_ptr in order to retrieve it. Also, align the field naturally and check that no garbage is passed there. Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
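The layout idea can be illustrated in user space. The struct below is a stand-in with assumed field names, and the conversion helpers are simplified analogues of what the commit describes, so this shows the technique and not the UAPI header itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for an update argument. Carrying the pointer as a 64-bit
 * integer, with explicit padding in front of it, gives the struct the
 * same size and field offsets for 32-bit and 64-bit callers.
 */
struct files_update_sketch {
	uint32_t offset;
	uint32_t resv;   /* explicit padding; callers must pass zero */
	uint64_t fds;    /* user pointer to an array of fds, as an integer */
};

/* Caller side: store a pointer in the 64-bit field. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Receiver side: recover the pointer from the 64-bit field. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}

/* Reject arguments that carry garbage in the reserved field. */
static int check_update(const struct files_update_sketch *up)
{
	return up->resv ? -EINVAL : 0;
}
```

Checking that the reserved field is zero keeps it available for later use, since no existing caller can have been relying on arbitrary values there.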