path: root/include/uapi/linux
Age  Commit message  Author
2022-06-27  net: dsa: add Renesas RZ/N1 switch tag driver  Clément Léger
The switch that is present on the Renesas RZ/N1 SoC uses a specific VLAN value followed by 6 bytes which contain forwarding configuration. Signed-off-by: Clément Léger <clement.leger@bootlin.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-24  Bonding: add per-port priority for failover re-selection  Hangbin Liu
Add per port priority support for bonding active slave re-selection during failover. A higher number means higher priority in selection. The primary slave still has the highest priority. This option also follows the primary_reselect rules. This option could only be configured via netlink. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-24  KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis  Ben Gardon
In some cases, the NX hugepage mitigation for iTLB multihit is not needed for all guests on a host. Allow disabling the mitigation on a per-VM basis to avoid the performance hit of NX hugepages on trusted workloads. In order to disable NX hugepages on a VM, ensure that the userspace actor has permission to reboot the system. Since disabling NX hugepages would allow a guest to crash the system, it is similar to reboot permissions. Ideally, KVM would require userspace to prove it has access to KVM's nx_huge_pages module param, e.g. so that userspace can opt out without needing full reboot permissions. But getting access to the module param file info is difficult because it is buried in layers of sysfs and module glue. Requiring CAP_SYS_BOOT is sufficient for all known use cases. Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20220613212523.3436117-9-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-23  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-20  media: uapi: Add some RGB bus formats for i.MX8qm/qxp pixel combiner  Liu Ying
This patch adds RGB666_1X30_CPADLO, RGB888_1X30_CPADLO, RGB666_1X36_CPADLO and RGB888_1X36_CPADLO bus formats used by i.MX8qm/qxp pixel combiner. The RGB pixels with padding low per component are transmitted on a 30-bit input bus(10-bit per component) from a display controller or a 36-bit output bus(12-bit per component) to a pixel link. Reviewed-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Robert Foss <robert.foss@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220611141421.718743-2-victor.liu@nxp.com
2022-06-20  Merge drm/drm-next into drm-misc-next  Thomas Zimmermann
Backmerging to get new regmap APIs of v5.19-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-06-20  cfg80211: Indicate MLO connection info in connect and roam callbacks  Veerendranath Jakkam
The MLO links used for connection with an MLD AP are decided by the driver when SME is offloaded to the driver. Add support for the drivers to indicate the information of links used for MLO connection in connect and roam callbacks, and update the connected links information in wdev from the connect/roam result sent by the driver. Also, send the connected links information to userspace. Add a netlink flag attribute to indicate that userspace supports handling of MLO connection. Drivers must not do MLO connection when this flag is not set. This is to maintain backwards compatibility with older supplicant versions which don't have support for MLO connection. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-06-20  wifi: nl80211: support MLO in auth/assoc  Johannes Berg
For authentication, we need the BSS, the link_id and the AP MLD address to create the link and station, (for now) the driver assigns a link address and sends the frame, the MLD address needs to be the address of the interface. For association, pass the list of BSSes that were selected for the MLO connection, along with extra per-STA profile elements, the AP MLD address and the link ID on which the association request should be sent. Note that for now we don't have a proper way to pass the link address(es) and so the driver/mac80211 will select one, but depending on how that selection works it means that assoc w/o auth data still being around (mac80211 implementation detail) the association won't necessarily work - so this will need to be extended in the future to sort out the link addressing. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-06-20  wifi: cfg80211: do some rework towards MLO link APIs  Johannes Berg
In order to support multi-link operation with multiple links, start adding some APIs. The notable addition here is to have the link ID in a new nl80211 attribute, that will be used to differentiate the links in many nl80211 operations. So far, this patch adds the netlink NL80211_ATTR_MLO_LINK_ID attribute (as well as the NL80211_ATTR_MLO_LINKS attribute) and plugs it through the system in some places, checking the validity etc. along with other infrastructure needed for it. For now, I've decided to include only the over-the-air link ID in the API. I know we discussed that we eventually need to have other ways of identifying a link, but for local AP mode and auth/assoc commands as well as set_key etc. we'll use the OTA ID. Also included in this patch is some refactoring of the data structures in struct wireless_dev, splitting for the first time the data into type dependent pieces, to make reasoning about these things easier. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-06-20  media: Add P010 video format  Benjamin Gaignard
P010 is a YUV format with 10-bits per component with interleaved UV. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Acked-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
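As a rough illustration of what "10 bits per component with interleaved UV" implies for buffer handling, the sketch below computes the size of one tightly packed frame and packs a single sample. It assumes the commonly documented P010 layout (4:2:0 subsampling, each 10-bit sample stored in the upper bits of a 16-bit little-endian word, no line padding); the helper names `p010_pack_sample` and `p010_frame_bytes` are hypothetical and are not part of the V4L2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the V4L2 API. */

/* Store a 10-bit sample in the upper bits of a 16-bit word
 * (assumed layout: the low 6 bits are zero padding). */
static uint16_t p010_pack_sample(uint16_t sample10)
{
    assert(sample10 <= 0x3ff);
    return (uint16_t)(sample10 << 6);
}

/* Bytes needed for one tightly packed frame: a full-resolution Y plane
 * followed by a half-resolution plane of interleaved Cb/Cr pairs,
 * with 2 bytes per sample and no stride padding (an assumption). */
static size_t p010_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    size_t y_bytes  = width * height * 2;
    size_t uv_bytes = (width / 2) * (height / 2) * 2 * 2;
    return y_bytes + uv_bytes;
}
```

Real drivers report their own `bytesperline` and `sizeimage`, which may include padding, so a calculation like this is only a lower bound.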
2022-06-17  Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  Jakub Kicinski
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-06-17
We've added 72 non-merge commits during the last 15 day(s) which contain a total of 92 files changed, 4582 insertions(+), 834 deletions(-).
The main changes are:
1) Add 64 bit enum value support to BTF, from Yonghong Song.
2) Implement support for sleepable BPF uprobe programs, from Delyan Kratunov.
3) Add new BPF helpers to issue and check TCP SYN cookies without binding to a socket especially useful in synproxy scenarios, from Maxim Mikityanskiy.
4) Fix libbpf's internal USDT address translation logic for shared libraries as well as uprobe's symbol file offset calculation, from Andrii Nakryiko.
5) Extend libbpf to provide an API for textual representation of the various map/prog/attach/link types and use it in bpftool, from Daniel Müller.
6) Provide BTF line info for RV64 and RV32 JITs, and fix a put_user bug in the core seen in 32 bit when storing BPF function addresses, from Pu Lehui.
7) Fix libbpf's BTF pointer size guessing by adding a list of various aliases for 'long' types, from Douglas Raillard.
8) Fix bpftool to readd setting rlimit since probing for memcg-based accounting has been unreliable and caused a regression on COS, from Quentin Monnet.
9) Fix UAF in BPF cgroup's effective program computation triggered upon BPF link detachment, from Tadeusz Struk.
10) Fix bpftool build bootstrapping during cross compilation which was pointing to the wrong AR process, from Shahab Vahedi.
11) Fix logic bug in libbpf's is_pow_of_2 implementation, from Yuze Chi.
12) BPF hash map optimization to avoid grabbing spinlocks of all CPUs when there is no free element. Also add a benchmark as reproducer, from Feng Zhou.
13) Fix bpftool's codegen to bail out when there's no BTF, from Michael Mullin.
14) Various minor cleanup and improvements all over the place.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
  bpf: Fix bpf_skc_lookup comment wrt. return type
  bpf: Fix non-static bpf_func_proto struct definitions
  selftests/bpf: Don't force lld on non-x86 architectures
  selftests/bpf: Add selftests for raw syncookie helpers in TC mode
  bpf: Allow the new syncookie helpers to work with SKBs
  selftests/bpf: Add selftests for raw syncookie helpers
  bpf: Add helpers to issue and check SYN cookies in XDP
  bpf: Allow helpers to accept pointers with a fixed size
  bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
  selftests/bpf: add tests for sleepable (uk)probes
  libbpf: add support for sleepable uprobe programs
  bpf: allow sleepable uprobe programs to attach
  bpf: implement sleepable uprobes by chaining gps
  bpf: move bpf_prog to bpf.h
  libbpf: Fix internal USDT address translation logic for shared libraries
  samples/bpf: Check detach prog exist or not in xdp_fwd
  selftests/bpf: Avoid skipping certain subtests
  selftests/bpf: Fix test_varlen verification failure with latest llvm
  bpftool: Do not check return value from libbpf_set_strict_mode()
  Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"
  ...
====================
Link: https://lore.kernel.org/r/20220617220836.7373-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-16  bpf: Add helpers to issue and check SYN cookies in XDP  Maxim Mikityanskiy
The new helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} allow an XDP program to generate SYN cookies in response to TCP SYN packets and to check those cookies upon receiving the first ACK packet (the final packet of the TCP handshake). Unlike bpf_tcp_{gen,check}_syncookie, these new helpers don't need a listening socket on the local machine, which allows them to be used together with synproxy to accelerate SYN cookie generation. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-4-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16  bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie  Maxim Mikityanskiy
bpf_tcp_gen_syncookie expects the full length of the TCP header (with all options), and bpf_tcp_check_syncookie accepts lengths bigger than sizeof(struct tcphdr). Fix the documentation that says these lengths should be exactly sizeof(struct tcphdr). While at it, fix a typo in the name of struct ipv6hdr. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-2-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
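To make the "full length of the TCP header (with all options)" concrete, here is a minimal sketch of how that length can be derived from a raw header. It relies on the TCP data offset field, which occupies the upper four bits of byte 12 and counts 32-bit words. The helper name `tcp_full_header_len` is hypothetical; it is not a kernel or libbpf function, and a BPF program would normally read `doff` from `struct tcphdr` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the kernel or libbpf.
 * Returns the full TCP header length in bytes (fixed 20-byte part plus
 * options), or 0 if the buffer is too short or the data offset is invalid. */
static uint32_t tcp_full_header_len(const uint8_t *tcp, size_t avail)
{
    assert(tcp != NULL);
    if (avail < 20)
        return 0;

    /* Data offset: upper 4 bits of byte 12, in units of 32-bit words. */
    uint32_t len = (uint32_t)(tcp[12] >> 4) * 4;

    if (len < 20 || len > avail)
        return 0;
    return len;
}
```

A value computed this way, rather than a fixed `sizeof(struct tcphdr)`, is the kind of length the corrected documentation describes for `th_len`.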
2022-06-16  include/uapi/linux/swab.h: move explicit cast outside ternary  Justin Stitt
A cast inside __builtin_constant_p doesn't do anything since it should evaluate as constant at compile time irrespective of this cast. Instead, I moved this cast outside the ternary to ensure the return type is as expected. Additionally, if __HAVE_BUILTIN_BSWAP16__ was not defined then __swab16 is actually returning an `int` not a `u16` due to integer promotion. As Al Viro notes: You *can't* get smaller-than-int out of ? :, same as you can't get it out of addition, etc. This also fixes some clang -Wformat warnings involving default argument promotion. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Link: https://lkml.kernel.org/r/20220608223539.470472-1-justinstitt@google.com Signed-off-by: Justin Stitt <jstitt007@gmail.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
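The integer-promotion point can be checked with a small stand-alone example. The macros below are illustrative only and are not the kernel's definitions; they use an ordinary condition in place of `__builtin_constant_p` so the sketch stays portable. The static assertions assume a platform where `int` is wider than 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative byte swap; every use is cast to uint16_t. */
#define SWAP16_EXPR(x) \
    ((uint16_t)((((uint16_t)(x) & 0x00ffu) << 8) | \
                (((uint16_t)(x) & 0xff00u) >> 8)))

/* Casts only inside the arms: the conditional operator applies the
 * usual arithmetic conversions, so the result type is int. */
#define SWAB16_CAST_INSIDE(x) \
    ((x) != 0 ? SWAP16_EXPR(x) : (uint16_t)0)

/* Cast moved outside the ternary: the result type is uint16_t. */
#define SWAB16_CAST_OUTSIDE(x) \
    ((uint16_t)((x) != 0 ? SWAP16_EXPR(x) : (uint16_t)0))

/* sizeof does not evaluate its operand, so these are compile-time checks. */
static_assert(sizeof(SWAB16_CAST_INSIDE((uint16_t)1)) == sizeof(int),
              "ternary result is promoted to int");
static_assert(sizeof(SWAB16_CAST_OUTSIDE((uint16_t)1)) == sizeof(uint16_t),
              "outer cast restores the 16-bit type");
```

Both macros produce the same value; only the type of the expression differs, which is what matters for format-string checks such as clang's -Wformat.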
2022-06-14  drm/amdkfd: Add available memory ioctl  Daniel Phillips
Add a new KFD ioctl to return the largest possible memory size that can be allocated as a buffer object using kfd_ioctl_alloc_memory_of_gpu. It attempts to use exactly the same accept/reject criteria as that function so that allocating a new buffer object of the size returned by this new ioctl is guaranteed to succeed, barring races with other allocating tasks. This IOCTL will be used by libhsakmt: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-14  io_uring: remove IORING_CLOSE_FD_AND_FILE_SLOT  Pavel Begunkov
This partially reverts a7c41b4687f5902af70cd559806990930c8a307b. Even though IORING_CLOSE_FD_AND_FILE_SLOT might save cycles for some users, it tries to do two things at a time, and it's not clear how to handle errors and what to return in a single result field when one part fails and the other completes successfully. Kill it for now. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/837c745019b3795941eee4fcfd7de697886d645b.1655224415.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-10  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10  Merge tag 'wireless-next-2022-06-10' of ↵  Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== wireless-next patches for v5.20 Here's a first set of patches for v5.20. This is just a queue flush, before we get things back from net-next that are causing conflicts, and then can start merging a lot of MLO (multi-link operation, part of 802.11be) code. Lots of cleanups all over. The only notable change is perhaps wilc1000 being the first driver to disable WEP (while enabling WPA3). * tag 'wireless-next-2022-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (29 commits) wifi: mac80211_hwsim: Directly use ida_alloc()/free() wifi: mac80211: refactor some key code wifi: mac80211: remove cipher scheme support wifi: nl80211: fix typo in comment wifi: virt_wifi: fix typo in comment rtw89: add new state to CFO state machine for UL-OFDMA rtw89: 8852c: add trigger frame counter ieee80211: add trigger frame definition wifi: wfx: Remove redundant NULL check before release_firmware() call wifi: rtw89: support MULTI_BSSID and correct BSSID mask of H2C wifi: ray_cs: Drop useless status variable in parse_addr() wifi: ray_cs: Utilize strnlen() in parse_addr() wifi: rtw88: use %*ph to print small buffer wifi: wilc1000: add IGTK support wifi: wilc1000: add WPA3 SAE support wifi: wilc1000: remove WEP security support wifi: wilc1000: use correct sequence of RESET for chip Power-UP/Down wifi: rtlwifi: fix error codes in rtl_debugfs_set_write_h2c() wifi: rtw88: Fix Sparse warning for rtw8821c_hw_spec wifi: rtw88: Fix Sparse warning for rtw8723d_hw_spec ... ==================== Link: https://lore.kernel.org/r/20220610142838.330862-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10  wifi: nl80211: fix typo in comment  Julia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/20220521111145.81697-77-Julia.Lawall@inria.fr Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-06-10  netfilter: xtables: Bring SPDX identifier back  Thomas Gleixner
Commit e2be04c7f995 ("License cleanup: add SPDX license identifier to uapi header files with a license") added the correct SPDX identifier to include/uapi/linux/netfilter/xt_IDLETIMER.h. A subsequent commit removed it for no reason and reintroduced the UAPI license incorrectness as the file is now missing the UAPI exception again. Add it back and remove the GPLv2 boilerplate while at it. Fixes: 68983a354a65 ("netfilter: xtables: Add snapshot of hardidletimer target") Cc: Manoj Basapathi <manojbm@codeaurora.org> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10  xfrm: convert alg_key to flexible array member  Stephen Hemminger
Iproute2 build generates a warning when built with gcc-12. This is because alg_key in the xfrm.h API is a zero-size array element instead of a flexible array.
  CC xfrm_state.o
  In function ‘xfrm_algo_parse’, inlined from ‘xfrm_state_modify.constprop’ at xfrm_state.c:573:5:
  xfrm_state.c:162:32: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  162 | buf[j] = val;
      | ~~~~~~~^~~~~
This patch converts alg_key into a flexible array member. There are other zero-size arrays here that should be converted as well. This patch is RFC only since it is only compile tested and passes trivial iproute2 tests. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
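The difference between a zero-size array and a flexible array member is easiest to see in an allocation helper. The sketch below uses a simplified stand-in struct whose field names mirror `struct xfrm_algo`; the struct `example_algo` and the function `example_algo_alloc` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the UAPI struct; local to this example. */
struct example_algo {
    char         alg_name[64];
    unsigned int alg_key_len;  /* key length in bits */
    char         alg_key[];    /* flexible array member (previously [0]) */
};

/* Allocate the header plus room for the key and copy the key in.
 * With a flexible array member the compiler treats alg_key as having
 * an unknown bound instead of a known size of zero, so writes into it
 * no longer look like overflows. The caller frees the result. */
static struct example_algo *example_algo_alloc(const char *name,
                                               const void *key,
                                               unsigned int key_bits)
{
    size_t key_bytes = (key_bits + 7) / 8;
    struct example_algo *a;

    assert(name != NULL && (key != NULL || key_bytes == 0));

    a = calloc(1, sizeof(*a) + key_bytes);
    if (!a)
        return NULL;

    strncpy(a->alg_name, name, sizeof(a->alg_name) - 1);
    a->alg_key_len = key_bits;
    if (key_bytes)
        memcpy(a->alg_key, key, key_bytes);
    return a;
}
```

The flexible array member contributes nothing to `sizeof(struct example_algo)`, so the allocation size is spelled out as header plus key bytes.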
2022-06-10  fscrypt: Add HCTR2 support for filename encryption  Nathan Huckleberry
HCTR2 is a tweakable, length-preserving encryption mode that is intended for use on CPUs with dedicated crypto instructions. HCTR2 has the property that a bitflip in the plaintext changes the entire ciphertext. This property fixes a known weakness with filename encryption: when two filenames in the same directory share a prefix of >= 16 bytes, with AES-CTS-CBC their encrypted filenames share a common substring, leaking information. HCTR2 does not have this problem. More information on HCTR2 can be found here: "Length-preserving encryption with HCTR2": https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-09  tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX  Maxim Mikityanskiy
To embrace possible future optimizations of TLS, rename zerocopy sendfile definitions to more generic ones:
* setsockopt: TLS_TX_ZEROCOPY_SENDFILE -> TLS_TX_ZEROCOPY_RO
* sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX
RO stands for readonly and emphasizes that the application shouldn't modify the data being transmitted with zerocopy to avoid potential disconnection. Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08  dma-buf: Add an API for importing sync files (v10)  Jason Ekstrand
This patch is analogous to the previous sync file export patch in that it allows you to import a sync_file into a dma-buf. Unlike the previous patch, however, this does add genuinely new functionality to dma-buf. Without this, the only way to attach a sync_file to a dma-buf is to submit a batch to your driver of choice which waits on the sync_file and claims to write to the dma-buf. Even if said batch is a no-op, a submit is typically way more overhead than just attaching a fence. A submit may also imply extra synchronization with other work because it happens on a hardware queue. In the Vulkan world, this is useful for dealing with the out-fence from vkQueuePresent. Current Linux window-systems (X11, Wayland, etc.) all rely on dma-buf implicit sync. Since Vulkan is an explicit sync API, we get a set of fences (VkSemaphores) in vkQueuePresent and have to stash those as an exclusive (write) fence on the dma-buf. We handle it in Mesa today with the above mentioned dummy submit trick. This ioctl would allow us to set it directly without the dummy submit. This may also open up possibilities for GPU drivers to move away from implicit sync for their kernel driver uAPI and instead provide sync files and rely on dma-buf import/export for communicating with other implicit sync clients. We make the explicit choice here to only allow setting RW fences which translates to an exclusive fence on the dma_resv. There's no use for read-only fences for communicating with other implicit sync userspace and any such attempts are likely to be racy at best. When we got to insert the RW fence, the actual fence we set as the new exclusive fence is a combination of the sync_file provided by the user and all the other fences on the dma_resv. This ensures that the newly added exclusive fence will never signal before the old one would have and ensures that we don't break any dma_resv contracts. 
We require userspace to specify RW in the flags for symmetry with the export ioctl and in case we ever want to support read fences in the future. There is one downside here that's worth documenting: If two clients writing to the same dma-buf using this API race with each other, their actions on the dma-buf may happen in parallel or in an undefined order. Both with and without this API, the pattern is the same: Collect all the fences on dma-buf, submit work which depends on said fences, and then set a new exclusive (write) fence on the dma-buf which depends on said work. The difference is that, when it's all handled by the GPU driver's submit ioctl, the three operations happen atomically under the dma_resv lock. If two userspace submits race, one will happen before the other. You aren't guaranteed which but you are guaranteed that they're strictly ordered. If userspace manages the fences itself, then these three operations happen separately and the two render operations may happen genuinely in parallel or get interleaved. However, this is a case of userspace racing with itself. As long as we ensure userspace can't back the kernel into a corner, it should be fine. v2 (Jason Ekstrand): - Use a wrapper dma_fence_array of all fences including the new one when importing an exclusive fence. v3 (Jason Ekstrand): - Lock around setting shared fences as well as exclusive - Mark SIGNAL_SYNC_FILE as a read-write ioctl. 
- Initialize ret to 0 in dma_buf_wait_sync_file v4 (Jason Ekstrand): - Use the new dma_resv_get_singleton helper v5 (Jason Ekstrand): - Rename the IOCTLs to import/export rather than wait/signal - Drop the WRITE flag and always get/set the exclusive fence v6 (Jason Ekstrand): - Split import and export into separate patches - New commit message v7 (Daniel Vetter): - Fix the uapi header to use the right struct in the ioctl - Use a separate dma_buf_import_sync_file struct - Add kerneldoc for dma_buf_import_sync_file v8 (Jason Ekstrand): - Rebase on Christian König's fence rework v9 (Daniel Vetter): - Fix -EINVAL checks for the flags parameter - Add documentation about read/write fences - Add documentation about the expected usage of import/export and specifically call out the possible userspace race. v10 (Simon Ser): - Fix a typo in the docs Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Simon Ser <contact@emersion.fr> Link: https://patchwork.freedesktop.org/patch/msgid/20220608152142.14495-3-jason@jlekstrand.net
2022-06-08  dma-buf: Add an API for exporting sync files (v14)  Jason Ekstrand
Modern userspace APIs like Vulkan are built on an explicit synchronization model. This doesn't always play nicely with the implicit synchronization used in the kernel and assumed by X11 and Wayland. The client -> compositor half of the synchronization isn't too bad, at least on intel, because we can control whether or not i915 synchronizes on the buffer and whether or not it's considered written. The harder part is the compositor -> client synchronization when we get the buffer back from the compositor. We're required to be able to provide the client with a VkSemaphore and VkFence representing the point in time where the window system (compositor and/or display) finished using the buffer. With current APIs, it's very hard to do this in such a way that we don't get confused by the Vulkan driver's access of the buffer. In particular, once we tell the kernel that we're rendering to the buffer again, any CPU waits on the buffer or GPU dependencies will wait on some of the client rendering and not just the compositor. This new IOCTL solves this problem by allowing us to get a snapshot of the implicit synchronization state of a given dma-buf in the form of a sync file. It's effectively the same as a poll() or I915_GEM_WAIT only, instead of CPU waiting directly, it encapsulates the wait operation, at the current moment in time, in a sync_file so we can check/wait on it later. As long as the Vulkan driver does the sync_file export from the dma-buf before we re-introduce it for rendering, it will only contain fences from the compositor or display. This allows to accurately turn it into a VkFence or VkSemaphore without any over-synchronization. By making this an ioctl on the dma-buf itself, it allows this new functionality to be used in an entirely driver-agnostic way without having access to a DRM fd. This makes it ideal for use in driver-generic code in Mesa or in a client such as a compositor where the DRM fd may be hard to reach. 
v2 (Jason Ekstrand):
- Use a wrapper dma_fence_array of all fences including the new one when importing an exclusive fence.
v3 (Jason Ekstrand):
- Lock around setting shared fences as well as exclusive
- Mark SIGNAL_SYNC_FILE as a read-write ioctl.
- Initialize ret to 0 in dma_buf_wait_sync_file
v4 (Jason Ekstrand):
- Use the new dma_resv_get_singleton helper
v5 (Jason Ekstrand):
- Rename the IOCTLs to import/export rather than wait/signal
- Drop the WRITE flag and always get/set the exclusive fence
v6 (Jason Ekstrand):
- Drop the sync_file import as it was all-around sketchy and not nearly as useful as import.
- Re-introduce READ/WRITE flag support for export
- Rework the commit message
v7 (Jason Ekstrand):
- Require at least one sync flag
- Fix a refcounting bug: dma_resv_get_excl() doesn't take a reference
- Use _rcu helpers since we're accessing the dma_resv read-only
v8 (Jason Ekstrand):
- Return -ENOMEM if the sync_file_create fails
- Predicate support on IS_ENABLED(CONFIG_SYNC_FILE)
v9 (Jason Ekstrand):
- Add documentation for the new ioctl
v10 (Jason Ekstrand):
- Go back to dma_buf_sync_file as the ioctl struct name
v11 (Daniel Vetter):
- Go back to dma_buf_export_sync_file as the ioctl struct name
- Better kerneldoc describing what the read/write flags do
v12 (Christian König):
- Document why we chose to make it an ioctl on dma-buf
v13 (Jason Ekstrand):
- Rebase on Christian König's fence rework
v14 (Daniel Vetter & Christian König):
- Use dma_rev_usage_rw to get the properly inverted usage to pass to dma_resv_get_singleton()
- Clean up the sync_file and fd if copy_to_user() fails
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Simon Ser <contact@emersion.fr> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Simon Ser <contact@emersion.fr> Link: https://patchwork.freedesktop.org/patch/msgid/20220608152142.14495-2-jason@jlekstrand.net
2022-06-08  KVM: VMX: Enable Notify VM exit  Tao Xu
There are cases where malicious virtual machines can cause the CPU to get stuck (because event windows don't open up), e.g., an infinite loop in microcode when nested #AC (CVE-2015-5307). No event window means no event (NMI, SMI and IRQ) can be delivered. This leaves the CPU unavailable to the host or other VMs. The VMM can enable notify VM exit, so that a VM exit is generated if no event window occurs in VM non-root mode for a specified amount of time (notify window).
Feature enabling:
- The new vmcs field SECONDARY_EXEC_NOTIFY_VM_EXITING is introduced to enable this feature. The VMM can set the NOTIFY_WINDOW vmcs field to adjust the expected notify window.
- Add a new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT so that user space can query and enable this feature in per-VM scope. The argument is a 64bit value: bits 63:32 are used for notify window, and bits 31:0 are for flags. Currently supported flags:
  - KVM_X86_NOTIFY_VMEXIT_ENABLED: enable the feature with the notify window provided.
  - KVM_X86_NOTIFY_VMEXIT_USER: exit to userspace once the exits happen.
- It's safe to even set notify window to zero since an internal hardware threshold is added to vmcs.notify_window.
VM exit handling:
- Introduce a vcpu state notify_window_exits to record the count of notify VM exits and expose it through debugfs.
- Notify VM exit can happen incident to delivery of a vector event. Allow it in KVM.
- Exit to userspace unconditionally for handling when the VM_CONTEXT_INVALID bit is set.
Nested handling:
- Nested notify VM exits are not supported yet. Keep the same notify window control in vmcs02 as vmcs01, so that L1 can't escape the restriction of notify VM exits through launching L2 VM.
Notify VM exit is defined in latest Intel Architecture Instruction Set Extensions Programming Reference, chapter 9.2.
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Tao Xu <tao3.xu@intel.com> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-5-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
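The 64-bit capability argument described above (notify window in bits 63:32, flags in bits 31:0) can be sketched as a pair of pack/unpack helpers. The `EX_` constants are local copies whose bit positions are assumed to match the KVM_X86_NOTIFY_VMEXIT_* definitions in the UAPI header, and the helper names are hypothetical; the sketch does not issue any ioctl.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies for illustration; bit positions are an assumption and
 * should be taken from <linux/kvm.h> in real code. */
#define EX_NOTIFY_VMEXIT_ENABLED (1ULL << 0)
#define EX_NOTIFY_VMEXIT_USER    (1ULL << 1)

/* Build the capability argument: window in bits 63:32, flags in 31:0. */
static uint64_t notify_vmexit_arg(uint32_t window, uint32_t flags)
{
    /* Reject flag bits this example does not know about. */
    assert((flags & ~(uint32_t)(EX_NOTIFY_VMEXIT_ENABLED |
                                EX_NOTIFY_VMEXIT_USER)) == 0);
    return ((uint64_t)window << 32) | flags;
}

/* Split the argument back into its two halves. */
static uint32_t notify_vmexit_window(uint64_t arg)
{
    return (uint32_t)(arg >> 32);
}

static uint32_t notify_vmexit_flags(uint64_t arg)
{
    return (uint32_t)arg;
}
```

In a VMM, a value built this way would be passed as the first argument of the capability when enabling KVM_CAP_X86_NOTIFY_VMEXIT on a VM.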
2022-06-08  KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault  Chenyi Qiang
For the triple fault synthesized by KVM, e.g. the RSM path or nested_vmx_abort(), if KVM exits to userspace before the request is serviced, userspace could migrate the VM and lose the triple fault. Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault with a new event KVM_VCPUEVENT_VALID_TRIPLE_FAULT so that userspace can save and restore the triple fault event. This extension is guarded by a new KVM capability KVM_CAP_TRIPLE_FAULT_EVENT. Note that in the set_vcpu_events path, userspace is able to set/clear the triple fault request through the triple_fault.pending field. Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20220524135624.22988-2-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-07  bpf: Add btf enum64 support  Yonghong Song
Currently, BTF only supports up to 32bit enum values with BTF_KIND_ENUM. But in the kernel, some enums do have 64bit values, e.g., in uapi bpf.h, we have
  enum { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), };
In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK as 0, which certainly is incorrect. This patch adds a new btf kind, BTF_KIND_ENUM64, which permits 64bit values to cover the above use case. The BTF_KIND_ENUM64 has the following three fields following the common type:
  struct btf_enum64 { __u32 name_off; __u32 val_lo32; __u32 val_hi32; };
Currently, the btf type section has an alignment of 4 as all element types are u32. Representing the value with __u64 will introduce a pad for btf_enum64 and may also introduce misalignment for the 64bit value. Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues. The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate whether the value is signed or unsigned. The kflag intends to provide consistent output of BTF C format with the original source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff. The format C has two choices, printing out 0xffffffff or -1, and current libbpf prints it out as an unsigned value. But if the signedness is preserved in btf, the value can be printed the same as the original source code. The kflag value 0 means unsigned values, which is consistent with the default by libbpf and should also cover most cases as well. The new BTF_KIND_ENUM64 is intended to support the enum value represented as a 64bit value. But it can represent all BTF_KIND_ENUM values as well. The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has to be represented with 64 bits. In addition, a static inline function btf_kind_core_compat() is introduced which will be used later when libbpf relo_core.c is changed. Here the kernel shares the same relo_core.c with libbpf.
[1] https://reviews.llvm.org/D124641 Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07Merge tag 'kvm-s390-next-5.19-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: pvdump and selftest improvements - add an interface to provide a hypervisor dump for secure guests - improve selftests to show tests
2022-06-05Merge tag 'mm-nonmm-stable-2022-06-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull delay-accounting update from Andrew Morton: "A single featurette for delay accounting. Delayed a bit because, unusually, it had dependencies on both the mm-stable and mm-nonmm-stable queues" * tag 'mm-nonmm-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: delayacct: track delays from write-protect copy
2022-06-05Merge tag 'hte/for-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux Pull hardware timestamping subsystem from Thierry Reding: "This contains the new HTE (hardware timestamping engine) subsystem that has been in the works for a couple of months now. The infrastructure provided allows for drivers to register as hardware timestamp providers, while consumers will be able to request events that they are interested in (such as GPIOs and IRQs) to be timestamped by the hardware providers. Note that this currently supports only one provider, but there seems to be enough interest in this functionality and we expect to see more drivers added once this is merged" [ Linus Walleij mentions the Intel PMC in the Elkhart and Tiger Lake platforms as another future timestamp provider ] * tag 'hte/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: dt-bindings: timestamp: Correct id path dt-bindings: Renamed hte directory to timestamp hte: Uninitialized variable in hte_ts_get() hte: Fix off by one in hte_push_ts_ns() hte: Fix possible use-after-free in tegra_hte_test_remove() hte: Remove unused including <linux/version.h> MAINTAINERS: Add HTE Subsystem hte: Add Tegra HTE test driver tools: gpio: Add new hardware clock type gpiolib: cdev: Add hardware timestamp clock type gpio: tegra186: Add HTE support gpiolib: Add HTE support dt-bindings: Add HTE bindings hte: Add Tegra194 HTE kernel provider drivers: Add hardware timestamp engine (HTE) subsystem Documentation: Add HTE subsystem guide
2022-06-03Merge tag 'loongarch-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull initial Loongarch architecture code from Arnd Bergmann: "This is the majority of the loongarch architecture code, including the final system call interface and all core functionality. It still misses three sets of peripheral but vital patches to add support for other subsystems, which have yet to pass review: - The drivers/firmware/efi stub for booting from a standard UEFI firmware implementation. Both the original custom boot interface and a draft implementation of the EFI stub did not make it, so it is currently impossible to boot the kernel, until the loongarch specific portions get accepted into the UEFI subsystem - The drivers/irqchip/irq-loongson-*.c drivers are shared with the MIPS port, but currently lack support for ACPI based booting, which will get merged through the irqchip subsystem. - Similarly, the drivers/pci/controller/pci-loongson.c needs to be modified for ACPI support, which will be merged through the PCI subsystem. While the port cannot actually be used before all the above are merged, having it in 5.19 helps to establish the user space ABI for the libc ports to build on, and to help any treewide changes in the mainline kernel get applied here as well. A gcc-12 based tool chain for build testing is now included in https://mirrors.edge.kernel.org/pub/tools/crosstool/" Original description from Huacai Chen: "LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version (LA64). LoongArch uses ACPI as its boot protocol. LoongArch-specific interrupt controllers (similar to APIC) are already added in the next revision of ACPI Specification (current revision is 6.4). 
This patchset is adding basic LoongArch support in mainline kernel, we can see a complete snapshot here: https://github.com/loongson/linux/tree/loongarch-next https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next Cross-compile tool chain to build kernel: https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-2022-03-03-cross-tools-gcc-glibc.tar.xz A CLFS-based Linux distro: https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-system-2022-03-03.tar.bz2 Open-source tool chain which is under review (Binutils and Gcc are already upstream): https://github.com/loongson/binutils-gdb/tree/upstream_v3.1 https://github.com/loongson/gcc/tree/loongarch_upstream_v6.3 https://github.com/loongson/glibc/tree/loongarch_2_35_dev_v2.2 Loongson and LoongArch documentations: https://github.com/loongson/LoongArch-Documentation LoongArch-specific interrupt controllers: https://mantis.uefi.org/mantis/view.php?id=2203 https://mantis.uefi.org/mantis/view.php?id=2313" Link: https://lore.kernel.org/lkml/20220603072053.35005-1-chenhuacai@loongson.cn/ * tag 'loongarch-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (24 commits) MAINTAINERS: Add maintainer information for LoongArch LoongArch: Add Loongson-3 default config file LoongArch: Add Non-Uniform Memory Access (NUMA) support LoongArch: Add multi-processor (SMP) support LoongArch: Add VDSO and VSYSCALL support LoongArch: Add some library functions LoongArch: Add misc common routines LoongArch: Add ELF and module support LoongArch: Add signal handling support LoongArch: Add system call support LoongArch: Add memory management LoongArch: Add process management LoongArch: Add exception/interrupt handling LoongArch: Add boot and setup routines LoongArch: Add other common headers LoongArch: Add atomic/locking headers LoongArch: Add CPU definition headers LoongArch: Add build infrastructure LoongArch: Add 
writecombine support for drm LoongArch: Add ELF-related definitions ...
2022-06-03Merge tag 'char-misc-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc / other smaller driver subsystem updates from Greg KH: "Here is the large set of char, misc, and other driver subsystem updates for 5.19-rc1. The merge request for this has been delayed as I wanted to get lots of linux-next testing due to some late arrivals of changes for the habanalabs driver. Highlights of this merge are: - habanalabs driver updates for new hardware types and fixes and other updates - IIO driver tree merge which includes loads of new IIO drivers and cleanups and additions - PHY driver tree merge with new drivers and small updates to existing ones - interconnect driver tree merge with fixes and updates - soundwire driver tree merge with some small fixes - coresight driver tree merge with small fixes and updates - mhi bus driver tree merge with lots of updates and new device support - firmware driver updates - fpga driver updates - lkdtm driver updates (with a merge conflict, more on that below) - extcon driver tree merge with small updates - lots of other tiny driver updates and fixes and cleanups, full details in the shortlog. 
All of these have been in linux-next for almost 2 weeks with no reported problems" * tag 'char-misc-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (387 commits) habanalabs: use separate structure info for each error collect data habanalabs: fix missing handle shift during mmap habanalabs: remove hdev from hl_ctx_get args habanalabs: do MMU prefetch as deferred work habanalabs: order memory manager messages habanalabs: return -EFAULT on copy_to_user error habanalabs: use NULL for eventfd habanalabs: update firmware header habanalabs: add support for notification via eventfd habanalabs: add topic to memory manager buffer habanalabs: handle race in driver fini habanalabs: add device memory scrub ability through debugfs habanalabs: use unified memory manager for CB flow habanalabs: unified memory manager new code for CB flow habanalabs/gaudi: set arbitration timeout to a high value habanalabs: add put by handle method to memory manager habanalabs: hide memory manager page shift habanalabs: Add separate poll interval value for protocol habanalabs: use get_task_pid() to take PID habanalabs: add prefetch flag to the MAP operation ...
2022-06-03Merge tag 'io_uring-5.19-2022-06-02' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull more io_uring updates from Jens Axboe: - A small series with some prep patches for the upcoming 5.20 split of the io_uring.c file. No functional changes here, just minor bits that are nice to get out of the way now (me) - Fix for a memory leak in high numbered provided buffer groups, introduced in the merge window (me) - Wire up the new socket opcode for allocated direct descriptors, making it consistent with the other opcodes that can instantiate a descriptor (me) - Fix for the inflight tracking, should go into 5.18-stable as well (me) - Fix for a deadlock for io-wq offloaded file slot allocations (Pavel) - Direct descriptor failure fput leak fix (Xiaoguang) - Fix for the direct descriptor allocation hinting in case of unsuccessful install (Xiaoguang) * tag 'io_uring-5.19-2022-06-02' of git://git.kernel.dk/linux-block: io_uring: reinstate the inflight tracking io_uring: fix deadlock on iowq file slot alloc io_uring: let IORING_OP_FILES_UPDATE support choosing fixed file slots io_uring: defer alloc_hint update to io_file_bitmap_set() io_uring: ensure fput() called correspondingly when direct install fails io_uring: wire up allocated direct descriptors for socket io_uring: fix a memory leak of buffer group list on exit io_uring: move shutdown under the general net section io_uring: unify calling convention for async prep handling io_uring: add io_op_defs 'def' pointer in req init and issue io_uring: make prep and issue side of req handlers named consistently io_uring: make timeout prep handlers consistent with other prep handlers
2022-06-03Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: "vhost,virtio and vdpa features, fixes, and cleanups: - mac vlan filter and stats support in mlx5 vdpa - irq hardening in virtio - performance improvements in virtio crypto - polling i/o support in virtio blk - ASID support in vhost - fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits) vdpa: ifcvf: set pci driver data in probe vdpa/mlx5: Add RX MAC VLAN filter support vdpa/mlx5: Remove flow counter from steering vhost: rename vhost_work_dev_flush vhost-test: drop flush after vhost_dev_cleanup vhost-scsi: drop flush after vhost_dev_cleanup vhost_vsock: simplify vhost_vsock_flush() vhost_test: remove vhost_test_flush_vq() vhost_net: get rid of vhost_net_flush_vq() and extra flush calls vhost: flush dev once during vhost_dev_stop vhost: get rid of vhost_poll_flush() wrapper vhost-vdpa: return -EFAULT on copy_to_user() failure vdpasim: Off by one in vdpasim_set_group_asid() virtio: Directly use ida_alloc()/free() virtio: use WARN_ON() to warning illegal status value virtio: harden vring IRQ virtio: allow to unbreak virtqueue virtio-ccw: implement synchronize_cbs() virtio-mmio: implement synchronize_cbs() virtio-pci: implement synchronize_cbs() ...
2022-06-03LoongArch: Add ELF-related definitionsHuacai Chen
Add ELF-related definitions for LoongArch, including: EM_LOONGARCH, KEXEC_ARCH_LOONGARCH, AUDIT_ARCH_LOONGARCH32, AUDIT_ARCH_LOONGARCH64 and NT_LOONGARCH_*. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-02Merge tag 'asm-generic-fixes-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fixes from Arnd Bergmann: "The header cleanup series from Masahiro Yamada ended up causing some regressions in the ABI because of an ambiguous uid_t type. This was only caught after the original patches got merged, but at least the fixes are trivial and hopefully complete" * tag 'asm-generic-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: binder: fix sender_euid type in uapi header sparc: fix mis-use of __kernel_{uid,gid}_t in uapi/asm/stat.h powerpc: use __kernel_{uid,gid}32_t in uapi/asm/stat.h mips: use __kernel_{uid,gid}32_t in uapi/asm/stat.h
2022-06-02Merge tag 'net-5.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - new code bugs: - af_packet: make sure to pull the MAC header, avoid skb panic in GSO - ptp_clockmatrix: fix inverted logic in is_single_shot() - netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag - dt-bindings: net: adin: fix adi,phy-output-clock description syntax - wifi: iwlwifi: pcie: rename CAUSE macro, avoid MIPS build warning Previous releases - regressions: - Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process" - tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd - nf_tables: disallow non-stateful expression in sets earlier - nft_limit: clone packet limits' cost value - nf_tables: double hook unregistration in netns path - ping6: fix ping -6 with interface name Previous releases - always broken: - sched: fix memory barriers to prevent skbs from getting stuck in lockless qdiscs - neigh: set lower cap for neigh_managed_work rearming, avoid constantly scheduling the probe work - bpf: fix probe read error on big endian in ___bpf_prog_run() - amt: memory leak and error handling fixes Misc: - ipv6: expand & rename accept_unsolicited_na to accept_untracked_na" * tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits) net/af_packet: make sure to pull mac header net: add debug info to __skb_pull() net: CONFIG_DEBUG_NET depends on CONFIG_NET stmmac: intel: Add RPL-P PCI ID net: stmmac: use dev_err_probe() for reporting mdio bus registration failure tipc: check attribute length for bearer name ice: fix access-beyond-end in the switch code nfp: remove padding in nfp_nfdk_tx_desc ax25: Fix ax25 session cleanup problems net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels sfc/siena: fix considering that all channels have TX queues socket: Don't use u8 type in uapi 
socket.h net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6() net: ping6: Fix ping -6 with interface name macsec: fix UAF bug for real_dev octeontx2-af: fix error code in is_valid_offset() wifi: mac80211: fix use-after-free in chanctx code bonding: guard ns_targets by CONFIG_IPV6 tcp: tcp_rtx_synack() can be called from process context ...
2022-06-02binder: fix sender_euid type in uapi headerCarlos Llamas
The {pid,uid}_t fields of struct binder_transaction were recently replaced to use kernel types in commit 169adc2b6b3c ("android/binder.h: add linux/android/binder(fs).h to UAPI compile-test coverage"). However, using __kernel_uid_t here breaks backwards compatibility in architectures using 16-bits for this type, since glibc and some others still expect a 32-bit uid_t. Instead, let's use __kernel_uid32_t which avoids this compatibility problem. Fixes: 169adc2b6b3c ("android/binder.h: add linux/android/binder(fs).h to UAPI compile-test coverage") Reported-by: Christopher Ferris <cferris@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-01socket: Don't use u8 type in uapi socket.hTobias Klauser
Use plain 255 instead, which also avoids introducing an additional header dependency on <linux/types.h> Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/r/20220531094345.13801-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01delayacct: track delays from write-protect copyYang Yang
Delay accounting does not track the delay of write-protect copy. When tasks trigger many write-protect copies (including COW and unsharing of anonymous pages[1]), they may spend a significant amount of time waiting for them. Tracking the delay of tasks in write-protect copy could help users evaluate the impact of using KSM, fork() or GUP. Also update tools/accounting/getdelays.c: / # ./getdelays -dl -p 231 print delayacct stats ON listen forever PID 231 CPU count real total virtual total delay total delay average 6247 1859000000 2154070021 1674255063 0.268ms IO count delay total delay average 0 0 0ms SWAP count delay total delay average 0 0 0ms RECLAIM count delay total delay average 0 0 0ms THRASHING count delay total delay average 0 0 0ms COMPACT count delay total delay average 3 72758 0ms WPCOPY count delay total delay average 3635 271567604 0ms [1] commit 31cc5bc4af70("mm: support GUP-triggered unsharing of anonymous pages") Link: https://lkml.kernel.org/r/20220409014342.2505532-1-yang.yang29@zte.com.cn Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by: wangyong <wang.yong12@zte.com.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-01Merge tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio updates from Alex Williamson: - Improvements to mlx5 vfio-pci variant driver, including support for parallel migration per PF (Yishai Hadas) - Remove redundant iommu_present() check (Robin Murphy) - Ongoing refactoring to consolidate the VFIO driver facing API to use vfio_device (Jason Gunthorpe) - Use drvdata to store vfio_device among all vfio-pci and variant drivers (Jason Gunthorpe) - Remove redundant code now that IOMMU core manages group DMA ownership (Jason Gunthorpe) - Remove vfio_group from external API handling struct file ownership (Jason Gunthorpe) - Correct typo in uapi comments (Thomas Huth) - Fix coccicheck detected deadlock (Wan Jiabing) - Use rwsem to remove races and simplify code around container and kvm association to groups (Jason Gunthorpe) - Harden access to devices in low power states and use runtime PM to enable d3cold support for unused devices (Abhishek Sahu) - Fix dma_owner handling of fake IOMMU groups (Jason Gunthorpe) - Set driver_managed_dma on vfio-pci variant drivers (Jason Gunthorpe) - Pass KVM pointer directly rather than via notifier (Matthew Rosato) * tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio: (38 commits) vfio: remove VFIO_GROUP_NOTIFY_SET_KVM vfio/pci: Add driver_managed_dma to the new vfio_pci drivers vfio: Do not manipulate iommu dma_owner for fake iommu groups vfio/pci: Move the unused device into low power state with runtime PM vfio/pci: Virtualize PME related registers bits and initialize to zero vfio/pci: Change the PF power state to D0 before enabling VFs vfio/pci: Invalidate mmaps and block the access in D3hot power state vfio: Change struct vfio_group::container_users to a non-atomic int vfio: Simplify the life cycle of the group FD vfio: Fully lock struct vfio_group::container vfio: Split up vfio_group_get_device_fd() vfio: Change struct vfio_group::opened from an atomic to bool vfio: Add missing locking for struct vfio_group::kvm kvm/vfio: Fix potential deadlock problem in vfio 
include/uapi/linux/vfio.h: Fix trivial typo - _IORW should be _IOWR instead vfio/pci: Use the struct file as the handle not the vfio_group kvm/vfio: Remove vfio_group from kvm vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm() vfio: Change vfio_external_check_extension() to vfio_file_enforced_coherent() vfio: Remove vfio_external_group_match_file() ...
2022-06-01KVM: s390: Add KVM_CAP_S390_PROTECTED_DUMPJanosch Frank
The capability indicates dump support for protected VMs. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-9-frankja@linux.ibm.com Message-Id: <20220517163629.3443-9-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: Add CPU dump functionalityJanosch Frank
The previous patch introduced the per-VM dump functions; now let's focus on dumping the VCPU state via the newly introduced KVM_S390_PV_CPU_COMMAND ioctl, which mirrors the VM UV ioctl and can be extended with new commands later. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-8-frankja@linux.ibm.com Message-Id: <20220517163629.3443-8-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: Add configuration dump functionalityJanosch Frank
Sometimes dumping inside of a VM fails, is unavailable or doesn't yield the required data. For these occasions we dump the VM from the outside, writing memory and cpu data to a file. Up to now PV guests only supported dumping from the inside of the guest through dumpers like KDUMP. A PV guest can be dumped from the hypervisor but the data will be stale and / or encrypted. To get the actual state of the PV VM we need the help of the Ultravisor who safeguards the VM state. New UV calls have been added to initialize the dump, dump storage state data, dump cpu data and complete the dump process. We expose these calls in this patch via a new UV ioctl command. The sensitive parts of the dump data are encrypted, the dump key is derived from the Customer Communication Key (CCK). This ensures that only the owner of the VM who has the CCK can decrypt the dump data. The memory is dumped / read via a normal export call and a re-import after the dump initialization is not needed (no re-encryption with a dump key). Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-7-frankja@linux.ibm.com Message-Id: <20220517163629.3443-7-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: pv: Add query dump informationJanosch Frank
The dump API requires userspace to provide buffers into which we will store data. The dump information added in this patch tells userspace how big those buffers need to be. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-6-frankja@linux.ibm.com Message-Id: <20220517163629.3443-6-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-06-01KVM: s390: pv: Add query interfaceJanosch Frank
Some of the query information is already available via sysfs, but having an IOCTL makes the information easier to retrieve. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20220517163629.3443-4-frankja@linux.ibm.com Message-Id: <20220517163629.3443-4-frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-05-31vhost-vdpa: introduce uAPI to set group ASIDGautam Dawar
Follows the vDPA support for associating ASID to a specific virtqueue group. This patch adds a uAPI to support setting them from userspace. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-15-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vhost-vdpa: uAPI to get virtqueue group idGautam Dawar
Follows the support for virtqueue groups in vDPA. This patch introduces a uAPI to get the virtqueue group ID for a specific virtqueue in vhost-vdpa. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-14-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vhost-vdpa: introduce uAPI to get the number of address spacesGautam Dawar
This patch introduces the uAPI for getting the number of address spaces supported by this vDPA device. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-13-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>