summaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2024-08-15Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD"Griffin Kroah-Hartman
This reverts commit bab2f5e8fd5d2f759db26b78d9db57412888f187. Joel reported that this commit breaks userspace and stops sensors in SDM845 from working. Also breaks other qcom SoC devices running postmarketOS. Cc: stable <stable@kernel.org> Cc: Ekansh Gupta <quic_ekangupt@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reported-by: Joel Selvaraj <joelselvaraj.oss@gmail.com> Link: https://lore.kernel.org/r/9a9f5646-a554-4b65-8122-d212bb665c81@umsystem.edu Signed-off-by: Griffin Kroah-Hartman <griffin@kroah.com> Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Fixes: bab2f5e8fd5d ("misc: fastrpc: Restrict untrusted app to attach to privileged PD") Link: https://lore.kernel.org/r/20240815094920.8242-1-griffin@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14UAPI: net/sched: Use __struct_group() in flex struct tc_u32_selGustavo A. R. Silva
Use the `__struct_group()` helper to create a new tagged `struct tc_u32_sel_hdr`. This structure groups together all the members of the flexible `struct tc_u32_sel` except the flexible array. As a result, the array is effectively separated from the rest of the members without modifying the memory layout of the flexible structure. This new tagged struct will be used to fix problematic declarations of middle-flex-arrays in composite structs[1]. [1] https://git.kernel.org/linus/d88cabfd9abc Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/e59fe833564ddc5b2cc83056a4c504be887d6193.1723586870.git.gustavoars@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "s390: - Fix failure to start guests with kvm.use_gisa=0 - Panic if (un)share fails to maintain security. ARM: - Use kvfree() for the kvmalloc'd nested MMUs array - Set of fixes to address warnings in W=1 builds - Make KVM depend on assembler support for ARMv8.4 - Fix for vgic-debug interface for VMs without LPIs - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest - Minor code / comment cleanups for configuring PAuth traps - Take kvm->arch.config_lock to prevent destruction / initialization race for a vCPU's CPUIF which may lead to a UAF x86: - Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX) - Fix smatch issues - Small cleanups - Make x2APIC ID 100% readonly - Fix typo in uapi constant Generic: - Use synchronize_srcu_expedited() on irqfd shutdown" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIG KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX) KVM: eventfd: Use synchronize_srcu_expedited() on shutdown KVM: selftests: Add a testcase to verify x2APIC is fully readonly KVM: x86: Make x2APIC ID 100% readonly KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) KVM: x86: hyper-v: Remove unused inline function kvm_hv_free_pa_page() KVM: SVM: Fix an error code in sev_gmem_post_populate() KVM: SVM: Fix uninitialized variable bug KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list KVM: arm64: Tidying up PAuth code in KVM KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain s390/uv: Panic for set and remove shared access UVC errors KVM: s390: fix validity interception issue when gisa is switched off docs: KVM: Fix register ID of SPSR_FIQ KVM: arm64: vgic: fix unexpected unlock sparse warnings KVM: arm64: fix kdoc warnings in W=1 builds KVM: arm64: fix override-init warnings in W=1 builds ...
2024-08-14KVM: SEV: uapi: fix typo in SEV_RET_INVALID_CONFIGAmit Shah
"INVALID" is misspelt in "SEV_RET_INAVLID_CONFIG". Since this is part of the UAPI, keep the current definition and add a new one with the fix. Fix-suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Amit Shah <amit.shah@amd.com> Message-ID: <20240814083113.21622-1-amit@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14Merge tag 'vfs-6.11-rc4.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "VFS: - Fix the name of file lease slab cache. When file leases were split out of file locks the name of the file lock slab cache was used for the file leases slab cache as well. - Fix a type in take_fd() helper. - Fix infinite directory iteration for stable offsets in tmpfs. - When the icache is pruned all reclaimable inodes are marked with I_FREEING and other processes that try to lookup such inodes will block. But some filesystems like ext4 can trigger lookups in their inode evict callback causing deadlocks. Ext4 does such lookups if the ea_inode feature is used whereby a separate inode may be used to store xattrs. Introduce I_LRU_ISOLATING which pins the inode while its pages are reclaimed. This avoids inode deletion during inode_lru_isolate() avoiding the deadlock and evict is made to wait until I_LRU_ISOLATING is done. netfs: - Fault in smaller chunks for non-large folio mappings for filesystems that haven't been converted to large folios yet. - Fix the CONFIG_NETFS_DEBUG config option. The config option was renamed a short while ago and that introduced two minor issues. First, it depended on CONFIG_NETFS whereas it wants to depend on CONFIG_NETFS_SUPPORT. The former doesn't exist, while the latter does. Second, the documentation for the config option wasn't fixed up. - Revert the removal of the PG_private_2 writeback flag as ceph is using it and fix how that flag is handled in netfs. - Fix DIO reads on 9p. A program watching a file on a 9p mount wouldn't see any changes in the size of the file being exported by the server if the file was changed directly in the source filesystem. Fix this by attempting to read the full size specified when a DIO read is requested. - Fix a NULL pointer dereference bug due to a data race where a cachefiles cookies was retired even though it was still in use. Check the cookie's n_accesses counter before discarding it. nsfs: - Fix ioctl declaration for NS_GET_MNTNS_ID from _IO() to _IOR() as the kernel is writing to userspace. pidfs: - Prevent the creation of pidfds for kthreads until we have a use-case for it and we know the semantics we want. It also confuses userspace why they can get pidfds for kthreads. squashfs: - Fix an unitialized value bug reported by KMSAN caused by a corrupted symbolic link size read from disk. Check that the symbolic link size is not larger than expected" * tag 'vfs-6.11-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: Squashfs: sanity check symbolic link size 9p: Fix DIO read through netfs vfs: Don't evict inode under the inode lru traversing context netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags netfs, ceph: Revert "netfs: Remove deprecated use of PG_private_2 as a second writeback flag" file: fix typo in take_fd() comment pidfd: prevent creation of pidfds for kthreads netfs: clean up after renaming FSCACHE_DEBUG config libfs: fix infinite directory reads for offset dir nsfs: fix ioctl declaration fs/netfs/fscache_cookie: add missing "n_accesses" check filelock: fix name of file_lease slab cache netfs: Fault in smaller chunks for non-large folio mappings
2024-08-14media: rkisp1: Add support for the companding blockPaul Elder
Add support to the rkisp1 driver for the companding block that exists on the i.MX8MP version of the ISP. This requires usage of the new extensible parameters format, and showcases how the format allows for extensions without breaking backward compatibility. Signed-off-by: Paul Elder <paul.elder@ideasonboard.com> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: Paul Elder <paul.elder@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-13usb: gadget: f_fs: add capability for dfu functional descriptorDavid Sands
Add the ability for the USB FunctionFS (FFS) gadget driver to be able to create Device Firmware Upgrade (DFU) functional descriptors. [1] This patch allows implementation of DFU in userspace using the FFS gadget. The DFU protocol uses the control pipe (ep0) for all messaging so only the addition of the DFU functional descriptor is needed in the kernel driver. The DFU functional descriptor is written to the ep0 file along with any other descriptors during FFS setup. DFU requires an interface descriptor followed by the DFU functional descriptor. This patch includes documentation of the added descriptor for DFU and conversion of some existing documentation to kernel-doc format so that it can be included in the generated docs. An implementation of DFU 1.1 that implements just the runtime descriptor using the FunctionFS gadget (with rebooting into u-boot for DFU mode) has been tested on an i.MX8 Nano. An implementation of DFU 1.1 that implements both runtime and DFU mode using the FunctionFS gadget has been tested on Xilinx Zynq UltraScale+. Note that for the best performance of firmware update file transfers, the userspace program should respond as quick as possible to the setup packets. [1] https://www.usb.org/sites/default/files/DFU_1.1.pdf Signed-off-by: David Sands <david.sands@biamp.com> Co-developed-by: Chris Wulff <crwulff@gmail.com> Signed-off-by: Chris Wulff <crwulff@gmail.com> Link: https://lore.kernel.org/r/20240811000004.1395888-2-crwulff@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-12net: nexthop: Increase weight to u16Petr Machata
In CLOS networks, as link failures occur at various points in the network, ECMP weights of the involved nodes are adjusted to compensate. With high fan-out of the involved nodes, and overall high number of nodes, a (non-)ECMP weight ratio that we would like to configure does not fit into 8 bits. Instead of, say, 255:254, we might like to configure something like 1000:999. For these deployments, the 8-bit weight may not be enough. To that end, in this patch increase the next hop weight from u8 to u16. Increasing the width of an integral type can be tricky, because while the code still compiles, the types may not check out anymore, and numerical errors come up. To prevent this, the conversion was done in two steps. First the type was changed from u8 to a single-member structure, which invalidated all uses of the field. This allowed going through them one by one and audit for type correctness. Then the structure was replaced with a vanilla u16 again. This should ensure that no place was missed. The UAPI for configuring nexthop group members is that an attribute NHA_GROUP carries an array of struct nexthop_grp entries: struct nexthop_grp { __u32 id; /* nexthop id - must exist */ __u8 weight; /* weight of this nexthop */ __u8 resvd1; __u16 resvd2; }; The field resvd1 is currently validated and required to be zero. We can lift this requirement and carry high-order bits of the weight in the reserved field: struct nexthop_grp { __u32 id; /* nexthop id - must exist */ __u8 weight; /* weight of this nexthop */ __u8 weight_high; __u16 resvd2; }; Keeping the fields split this way was chosen in case an existing userspace makes assumptions about the width of the weight field, and to sidestep any endianness issues. The weight field is currently encoded as the weight value minus one, because weight of 0 is invalid. This same trick is impossible for the new weight_high field, because zero must mean actual zero. With this in place: - Old userspace is guaranteed to carry weight_high of 0, therefore configuring 8-bit weights as appropriate. When dumping nexthops with 16-bit weight, it would only show the lower 8 bits. But configuring such nexthops implies existence of userspace aware of the extension in the first place. - New userspace talking to an old kernel will work as long as it only attempts to configure 8-bit weights, where the high-order bits are zero. Old kernel will bounce attempts at configuring >8-bit weights. Renaming reserved fields as they are allocated for some purpose is commonly done in Linux. Whoever touches a reserved field is doing so at their own risk. nexthop_grp::resvd1 in particular is currently used by at least strace, however they carry an own copy of UAPI headers, and the conversion should be trivial. A helper is provided for decoding the weight out of the two fields. Forcing a conversion seems preferable to bending backwards and introducing anonymous unions or whatever. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/483e2fcf4beb0d9135d62e7d27b46fa2685479d4.1723036486.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-12net: nexthop: Add flag to assert that NHGRP reserved fields are zeroPetr Machata
There are many unpatched kernel versions out there that do not initialize the reserved fields of struct nexthop_grp. The issue with that is that if those fields were to be used for some end (i.e. stop being reserved), old kernels would still keep sending random data through the field, and a new userspace could not rely on the value. In this patch, use the existing NHA_OP_FLAGS, which is currently inbound only, to carry flags back to the userspace. Add a flag to indicate that the reserved fields in struct nexthop_grp are zeroed before dumping. This is reliant on the actual fix from commit 6d745cd0e972 ("net: nexthop: Initialize all fields in dumped nexthops"). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/21037748d4f9d8ff486151f4c09083bcf12d5df8.1723036486.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-12nsfs: fix ioctl declarationChristian Brauner
The kernel is writing an object of type __u64, so the ioctl has to be defined to _IOR(NSIO, 0x5, __u64) instead of _IO(NSIO, 0x5). Reported-by: Dmitry V. Levin <ldv@strace.io> Link: https://lore.kernel.org/r/20240730164554.GA18486@altlinux.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-12Merge 6.11-rc3 into char-misc-nextGreg Kroah-Hartman
We need the char/misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-12ethtool: rss: support skipping contexts during dumpJakub Kicinski
Applications may want to deal with dynamic RSS contexts only. So dumping context 0 will be counter-productive for them. Support starting the dump from a given context ID. Alternative would be to implement a dump flag to skip just context 0, not sure which is better... Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-12Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Get drm-misc-next to the state of v6.11-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-08-12media: uapi: videodev2: Add V4L2_META_FMT_RK_ISP1_EXT_PARAMSJacopo Mondi
The rkisp1 driver stores ISP configuration parameters in the fixed rkisp1_params_cfg structure. As the members of the structure are part of the userspace API, the structure layout is immutable and cannot be extended further. Introducing new parameters or modifying the existing ones would change the buffer layout and cause breakages in existing applications. The allow for future extensions to the ISP parameters, introduce a new extensible parameters format, with a new format 4CC. Document usage of the new format in the rkisp1 admin guide. Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: Daniel Scally <dan.scally@ideasonboard.com> Reviewed-by: Paul Elder <paul.elder@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-12media: uapi: rkisp1-config: Add extensible params formatJacopo Mondi
Add to the rkisp1-config.h header data types and documentation of the extensible parameters format. Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Paul Elder <paul.elder@ideasonboard.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
2024-08-12drm: Add missing documentation for struct drm_plane_size_hintMohammed Anees
This patch takes care of the following warnings during documentation compiling: ./include/uapi/drm/drm_mode.h:869: warning: Function parameter or struct member 'width' not described in 'drm_plane_size_hint' ./include/uapi/drm/drm_mode.h:869: warning: Function parameter or struct member 'height' not described in 'drm_plane_size_hint' Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240811101653.170223-1-pvmohammedanees2003@gmail.com
2024-08-11RDMA/mlx5: Introduce GET_DATA_DIRECT_SYSFS_PATH ioctlYishai Hadas
Introduce the 'GET_DATA_DIRECT_SYSFS_PATH' ioctl to return the sysfs path of the affiliated 'data direct' device for a given device. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/403745463e0ef52adbef681ff09aa6a29a756352.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-11RDMA/mlx5: Add support for DMABUF MR registrations with Data-directYishai Hadas
Add support for DMABUF MR registrations with Data-direct device. Upon userspace calling to register a DMABUF MR with the data direct bit set, the below algorithm will be followed. 1) Obtain a pinned DMABUF umem from the IB core using the user input parameters (FD, offset, length) and the DMA PF device. The DMA PF device is needed to allow the IOMMU to enable the DMA PF to access the user buffer over PCI. 2) Create a KSM MKEY by setting its entries according to the user buffer VA to IOVA mapping, with the MKEY being the data direct device-crossed MKEY. This KSM MKEY is umrable and will be used as part of the MR cache. The PD for creating it is the internal device 'data direct' kernel one. 3) Create a crossing MKEY that points to the KSM MKEY using the crossing access mode. 4) Manage the KSM MKEY by adding it to a list of 'data direct' MKEYs managed on the mlx5_ib device. 5) Return the crossing MKEY to the user, created with its supplied PD. Upon DMA PF unbind flow, the driver will revoke the KSM entries. The final deregistration will occur under the hood once the application deregisters its MKEY. Notes: - This version supports only the PINNED UMEM mode, so there is no dependency on ODP. - The IOVA supplied by the application must be system page aligned due to HW translations of KSM. - The crossing MKEY will not be umrable or part of the MR cache, as we cannot change its crossed (i.e. KSM) MKEY over UMR. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/1f99d8020ed540d9702b9e2252a145a439609ba6.1722512548.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-09Merge branch 'topic/control-lookup-rwlock' into for-nextTakashi Iwai
Pull control lookup optimization changes. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-09Merge patch series "nsfs: iterate through mount namespaces"Christian Brauner
Christian Brauner <brauner@kernel.org> says: Recently, we added the ability to list mounts in other mount namespaces and the ability to retrieve namespace file descriptors without having to go through procfs by deriving them from pidfds. This extends nsfs in two ways: (1) Add the ability to retrieve information about a mount namespace via NS_MNT_GET_INFO. This will return the mount namespace id and the number of mounts currently in the mount namespace. The number of mounts can be used to size the buffer that needs to be used for listmount() and is in general useful without having to actually iterate through all the mounts. The structure is extensible. (2) Add the ability to iterate through all mount namespaces over which the caller holds privilege returning the file descriptor for the next or previous mount namespace. To retrieve a mount namespace the caller must be privileged wrt to it's owning user namespace. This means that PID 1 on the host can list all mounts in all mount namespaces or that a container can list all mounts of its nested containers. Optionally pass a structure for NS_MNT_GET_INFO with NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount namespace in one go. (1) and (2) can be implemented for other namespace types easily. Together with recent api additions this means one can iterate through all mounts in all mount namespaces without ever touching procfs. Here's a sample program list_all_mounts_everywhere.c: // SPDX-License-Identifier: GPL-2.0-or-later #define _GNU_SOURCE #include <asm/unistd.h> #include <assert.h> #include <errno.h> #include <fcntl.h> #include <getopt.h> #include <linux/stat.h> #include <sched.h> #include <stddef.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/ioctl.h> #include <sys/param.h> #include <sys/pidfd.h> #include <sys/stat.h> #include <sys/statfs.h> #define die_errno(format, ...) \ do { \ fprintf(stderr, "%m | %s: %d: %s: " format "\n", __FILE__, \ __LINE__, __func__, ##__VA_ARGS__); \ exit(EXIT_FAILURE); \ } while (0) /* Get the id for a mount namespace */ #define NS_GET_MNTNS_ID _IO(0xb7, 0x5) /* Get next mount namespace. */ struct mnt_ns_info { __u32 size; __u32 nr_mounts; __u64 mnt_ns_id; }; #define MNT_NS_INFO_SIZE_VER0 16 /* size of first published struct */ /* Get information about namespace. */ #define NS_MNT_GET_INFO _IOR(0xb7, 10, struct mnt_ns_info) /* Get next namespace. */ #define NS_MNT_GET_NEXT _IOR(0xb7, 11, struct mnt_ns_info) /* Get previous namespace. */ #define NS_MNT_GET_PREV _IOR(0xb7, 12, struct mnt_ns_info) #define PIDFD_GET_MNT_NAMESPACE _IO(0xFF, 3) #define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */ #define __NR_listmount 458 #define __NR_statmount 457 /* * @mask bits for statmount(2) */ #define STATMOUNT_SB_BASIC 0x00000001U /* Want/got sb_... */ #define STATMOUNT_MNT_BASIC 0x00000002U /* Want/got mnt_... */ #define STATMOUNT_PROPAGATE_FROM 0x00000004U /* Want/got propagate_from */ #define STATMOUNT_MNT_ROOT 0x00000008U /* Want/got mnt_root */ #define STATMOUNT_MNT_POINT 0x00000010U /* Want/got mnt_point */ #define STATMOUNT_FS_TYPE 0x00000020U /* Want/got fs_type */ #define STATMOUNT_MNT_NS_ID 0x00000040U /* Want/got mnt_ns_id */ #define STATMOUNT_MNT_OPTS 0x00000080U /* Want/got mnt_opts */ struct statmount { __u32 size; /* Total size, including strings */ __u32 mnt_opts; __u64 mask; /* What results were written */ __u32 sb_dev_major; /* Device ID */ __u32 sb_dev_minor; __u64 sb_magic; /* ..._SUPER_MAGIC */ __u32 sb_flags; /* SB_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */ __u32 fs_type; /* [str] Filesystem type */ __u64 mnt_id; /* Unique ID of mount */ __u64 mnt_parent_id; /* Unique ID of parent (for root == mnt_id) */ __u32 mnt_id_old; /* Reused IDs used in proc/.../mountinfo */ __u32 mnt_parent_id_old; __u64 mnt_attr; /* MOUNT_ATTR_... */ __u64 mnt_propagation; /* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */ __u64 mnt_peer_group; /* ID of shared peer group */ __u64 mnt_master; /* Mount receives propagation from this ID */ __u64 propagate_from; /* Propagation from in current namespace */ __u32 mnt_root; /* [str] Root of mount relative to root of fs */ __u32 mnt_point; /* [str] Mountpoint relative to current root */ __u64 mnt_ns_id; __u64 __spare2[49]; char str[]; /* Variable size part containing strings */ }; struct mnt_id_req { __u32 size; __u32 spare; __u64 mnt_id; __u64 param; __u64 mnt_ns_id; }; #define MNT_ID_REQ_SIZE_VER1 32 /* sizeof second published struct */ #define LSMT_ROOT 0xffffffffffffffff /* root mount */ static int __statmount(__u64 mnt_id, __u64 mnt_ns_id, __u64 mask, struct statmount *stmnt, size_t bufsize, unsigned int flags) { struct mnt_id_req req = { .size = MNT_ID_REQ_SIZE_VER1, .mnt_id = mnt_id, .param = mask, .mnt_ns_id = mnt_ns_id, }; return syscall(__NR_statmount, &req, stmnt, bufsize, flags); } static struct statmount *sys_statmount(__u64 mnt_id, __u64 mnt_ns_id, __u64 mask, unsigned int flags) { size_t bufsize = 1 << 15; struct statmount *stmnt = NULL, *tmp = NULL; int ret; for (;;) { tmp = realloc(stmnt, bufsize); if (!tmp) goto out; stmnt = tmp; ret = __statmount(mnt_id, mnt_ns_id, mask, stmnt, bufsize, flags); if (!ret) return stmnt; if (errno != EOVERFLOW) goto out; bufsize <<= 1; if (bufsize >= UINT_MAX / 2) goto out; } out: free(stmnt); printf("statmount failed"); return NULL; } static ssize_t sys_listmount(__u64 mnt_id, __u64 last_mnt_id, __u64 mnt_ns_id, __u64 list[], size_t num, unsigned int flags) { struct mnt_id_req req = { .size = MNT_ID_REQ_SIZE_VER1, .mnt_id = mnt_id, .param = last_mnt_id, .mnt_ns_id = mnt_ns_id, }; return syscall(__NR_listmount, &req, list, num, flags); } int main(int argc, char *argv[]) { #define LISTMNT_BUFFER 10 __u64 list[LISTMNT_BUFFER], last_mnt_id = 0; int ret, pidfd, fd_mntns; struct mnt_ns_info info = {}; pidfd = pidfd_open(getpid(), 0); if (pidfd < 0) die_errno("pidfd_open failed"); fd_mntns = ioctl(pidfd, PIDFD_GET_MNT_NAMESPACE, 0); if (fd_mntns < 0) die_errno("ioctl(PIDFD_GET_MNT_NAMESPACE) failed"); ret = ioctl(fd_mntns, NS_MNT_GET_INFO, &info); if (ret < 0) die_errno("ioctl(NS_GET_MNTNS_ID) failed"); printf("Listing %u mounts for mount namespace %d:%llu\n", info.nr_mounts, fd_mntns, info.mnt_ns_id); for (;;) { ssize_t nr_mounts; next: nr_mounts = sys_listmount(LSMT_ROOT, last_mnt_id, info.mnt_ns_id, list, LISTMNT_BUFFER, 0); if (nr_mounts <= 0) { printf("Finished listing mounts for mount namespace %d:%llu\n\n", fd_mntns, info.mnt_ns_id); ret = ioctl(fd_mntns, NS_MNT_GET_NEXT, 0); if (ret < 0) die_errno("ioctl(NS_MNT_GET_NEXT) failed"); close(ret); ret = ioctl(fd_mntns, NS_MNT_GET_NEXT, &info); if (ret < 0) { if (errno == ENOENT) { printf("Finished listing all mount namespaces\n"); exit(0); } die_errno("ioctl(NS_MNT_GET_NEXT) failed"); } close(fd_mntns); fd_mntns = ret; last_mnt_id = 0; printf("Listing %u mounts for mount namespace %d:%llu\n", info.nr_mounts, fd_mntns, info.mnt_ns_id); goto next; } for (size_t cur = 0; cur < nr_mounts; cur++) { struct statmount *stmnt; last_mnt_id = list[cur]; stmnt = sys_statmount(last_mnt_id, info.mnt_ns_id, STATMOUNT_SB_BASIC | STATMOUNT_MNT_BASIC | STATMOUNT_MNT_ROOT | STATMOUNT_MNT_POINT | STATMOUNT_MNT_NS_ID | STATMOUNT_MNT_OPTS | STATMOUNT_FS_TYPE, 0); if (!stmnt) { printf("Failed to statmount(%llu) in mount namespace(%llu)\n", last_mnt_id, info.mnt_ns_id); continue; } printf("mnt_id(%u/%llu) | mnt_parent_id(%u/%llu): %s @ %s ==> %s with options: %s\n", stmnt->mnt_id_old, stmnt->mnt_id, stmnt->mnt_parent_id_old, stmnt->mnt_parent_id, stmnt->str + stmnt->fs_type, stmnt->str + stmnt->mnt_root, stmnt->str + stmnt->mnt_point, stmnt->str + stmnt->mnt_opts); free(stmnt); } } exit(0); } * patches from https://lore.kernel.org/r/20240719-work-mount-namespace-v1-0-834113cab0d2@kernel.org: nsfs: iterate through mount namespaces file: add fput() cleanup helper fs: add put_mnt_ns() cleanup helper fs: allow mount namespace fd Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-09nsfs: iterate through mount namespacesChristian Brauner
It is already possible to list mounts in other mount namespaces and to retrieve namespace file descriptors without having to go through procfs by deriving them from pidfds. Augment these abilities by adding the ability to retrieve information about a mount namespace via NS_MNT_GET_INFO. This will return the mount namespace id and the number of mounts currently in the mount namespace. The number of mounts can be used to size the buffer that needs to be used for listmount() and is in general useful without having to actually iterate through all the mounts. The structure is extensible. And add the ability to iterate through all mount namespaces over which the caller holds privilege returning the file descriptor for the next or previous mount namespace. To retrieve a mount namespace the caller must be privileged wrt to it's owning user namespace. This means that PID 1 on the host can list all mounts in all mount namespaces or that a container can list all mounts of its nested containers. Optionally pass a structure for NS_MNT_GET_INFO with NS_MNT_GET_{PREV,NEXT} to retrieve information about the mount namespace in one go. Both ioctls can be implemented for other namespace types easily. Together with recent api additions this means one can iterate through all mounts in all mount namespaces without ever touching procfs. Link: https://lore.kernel.org/r/20240719-work-mount-namespace-v1-5-834113cab0d2@kernel.org Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-09Merge tag 'drm-misc-next-2024-08-09' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: - remove Power Saving Policy property Core Changes: - update connector documentation CI: - add tests for mediatek, meson, rockchip Driver Changes: amdgpu: - revert support for Power Saving Policy property bridge: - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR mgag200: - transparently support BMC outputs omapdrm: - use common helper for_each_endpoint_of_node() panel: - panel-edp: fix name for HKC MB116AN01 vkms: - clean up endianess warnings Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240809071241.GA222501@localhost.localdomain
2024-08-08bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flagsAlan Maguire
Currently the only opportunity to set sock ops flags dictating which callbacks fire for a socket is from within a TCP-BPF sockops program. This is problematic if the connection is already set up as there is no further chance to specify callbacks for that socket. Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt() to allow users to specify callbacks later, either via an iterator over sockets or via a socket-specific program triggered by a setsockopt() on the socket. Previous discussion on this here [1]. [1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/ Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20240808150558.1035626-2-alan.maguire@oracle.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-08Merge tag 'drm-misc-next-2024-08-01' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK intterupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box
2024-08-08media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flagsHans Verkuil
The cec_msg_set_reply_to() helper function never zeroed the struct cec_msg flags field, this can cause unexpected behavior if flags was uninitialized to begin with. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media") Cc: <stable@vger.kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2024-08-06ALSA: ump: Handle MIDI 1.0 Function Block in MIDI 2.0 protocolTakashi Iwai
The UMP v1.1 spec says in the section 6.2.1: "If a UMP Endpoint declares MIDI 2.0 Protocol but a Function Block represents a MIDI 1.0 connection, then may optionally be used for messages to/from that Function Block." It implies that the driver can (and should) keep MIDI 1.0 CVM exceptionally for those FBs even if UMP Endpoint is running in MIDI 2.0 protocol, and the current driver lacks of it. This patch extends the sequencer port info to indicate a MIDI 1.0 port, and tries to send/receive MIDI 1.0 CVM as is when this port is the source or sink. The sequencer port flag is set by the driver at parsing FBs and GTBs although application can set it to its own user-space clients, too. Link: https://patch.msgid.link/20240806070024.14301-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-05Merge tag 'drm-xe-next-2024-07-30' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next drm-xe-next for 6.12 UAPI Changes: - Rename xe perf layer as xe observation layer, but was also made available via fixes to previous verison (Ashutosh) - Use write-back caching mode for system memory on DGFX, but was also mad available via fixes to previous version (Thomas) - Expose SIMD16 EU mask in topology query for userspace to know the type of EU, as available in PVC, Lunar Lake and Battlemage (Lucas) - Return ENOBUFS instead of ENOMEM in vm_bind if failure is tied to an array of binds (Matthew Brost) Driver Changes: - Log cleanup moving messages to debug priority (Michal Wajdeczko) - Add timeout to fences to adhere to dma_buf rules (Matthew Brost) - Rename old engine nomenclature to exec_queue (Matthew Brost) - Convert multiple bind ops to 1 job (Matthew Brost) - Add error injection for vm bind to help testing error path (Matthew Brost) - Fix error handling in page table to propagate correctly to userspace (Matthew Brost) - Re-organize and cleanup SR-IOV related registers (Michal Wajdeczko) - Make the device write barrier compatible with VF (Michal Wajdeczko) - New display workarounds for Battlemage (Matthew Auld) - New media workarounds for Lunar Lake and Battlemage (Ngai-Mint Kwan) - New graphics workarounds for Lunar Lake (Bommu Krishnaiah) - Tracepoint updates (Matthew Brost, Nirmoy Das) - Cleanup the header generation for OOB workarounds (Lucas De Marchi) - Fix leaking HDCP-related object (Nirmoy Das) - Serialize L2 flushes to avoid races (Tejas Upadhyay) - Log pid and comm on job timeout (José Roberto de Souza) - Simplify boilerplate code for live kunit (Michal Wajdeczko) - Improve kunit skips for live kunit (Michal Wajdeczko) - Fix xe_sync cleanup when handling xe_exec ioctl (Ashutosh Dixit) - Limit fair VF LMEM provisioning (Michal Wajdeczko) - New workaround to fence mmio writes in Lunar Lake (Tejas Upadhyay) - Warn on writes inaccessible register in VF (Michal Wajdeczko) - Fix register lookup in VF (Michal Wajdeczko) - Add GSC support for Battlemage (Alexander Usyskin) - Fix wedging only the GT in which timeout occurred (Matthew Brost) - Block device suspend when wedging (Matthew Brost) - Handle compression and migration changes for Battlemage (Akshata Jahagirdar) - Limit access of stolen memory for Lunar Lake (Uma Shankar) - Fail invalid addresses during user fence creation (Matthew Brost) - Refcount xe_file to safely and accurately store fdinfo stats (Umesh Nerlige Ramappa) - Cleanup and fix PM reference for TLB invalidation code (Matthew Brost) - Fix PM reference handling when communicating with GuC (Matthew Brost) - Add new BO flag for 2 MiB alignement and use in VF (Michal Wajdeczko) - Simplify MMIO setup for multi-tile platforms (Lucas De Marchi) - Add check for uninitialized access to OOB workarounds (Lucas De Marchi) - New GSC and HuC firmware blobs for Lunar Lake and Battlemage (Daniele Ceraolo Spurio) - Unify mmio wait logic (Gustavo Sousa) - Fix off-by-one when processing RTP rules (Lucas De Marchi) - Future-proof migrate logic with compressed PAT flag (Matt Roper) - Add WA kunit tests for Battlemage (Lucas De Marchi) - Test active tracking for workaorunds with kunit (Lucas De Marchi) - Add kunit tests for RTP with no actions (Lucas De Marchi) - Unify parse of OR rules in RTP (Lucas De Marchi) - Add performance tuning for Battlemage (Sai Teja Pottumuttu) - Make bit masks unsigned (Geert Uytterhoeven) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/k7xuktfav4zmtxxjr77glu2hszypvzgmzghoumh757nqfnk7kn@ccfi4ts3ytbk
2024-08-05media: cec: core: add new CEC_MSG_FL_REPLY_VENDOR_ID flagHans Verkuil
If this flag is set, then the reply is expected to consist of the CEC_MSG_VENDOR_COMMAND_WITH_ID opcode followed by the Vendor ID (as used in bytes 1-4 of the message), followed by the struct cec_msg reply field. Note that this assumes that the byte after the Vendor ID is a vendor-specific opcode. This flag makes it easier to wait for replies to vendor commands, using the same CEC framework support for waiting for regular replies. Support for this flag is indicated by setting the new CEC_CAP_REPLY_VENDOR_ID capability. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2024-08-02Revert "drm: Introduce 'power saving policy' drm property"Hamza Mahfooz
This reverts commit 76299a557f36d624ca32500173ad7856e1ad93c0. It was merged without meeting userspace requirements. Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240802145946.48073-1-hamza.mahfooz@amd.com
2024-08-02uretprobe: change syscall number, againArnd Bergmann
Despite multiple attempts to get the syscall number assignment right for the newly added uretprobe syscall, we ended up with a bit of a mess: - The number is defined as 467 based on the assumption that the xattrat family of syscalls would use 463 through 466, but those did not make it into 6.11. - The include/uapi/asm-generic/unistd.h file still lists the number 463, but the new scripts/syscall.tbl that was supposed to have the same data lists 467 instead as the number for arc, arm64, csky, hexagon, loongarch, nios2, openrisc and riscv. None of these architectures actually provide a uretprobe syscall. - All the other architectures (powerpc, arm, mips, ...) don't list this syscall at all. There are two ways to make it consistent again: either list it with the same syscall number on all architectures, or only list it on x86 but not in scripts/syscall.tbl and asm-generic/unistd.h. Based on the most recent discussion, it seems like we won't need it anywhere else, so just remove the inconsistent assignment and instead move the x86 number to the next available one in the architecture specific range, which is 335. Fixes: 5c28424e9a34 ("syscalls: Fix to add sys_uretprobe to syscall.tbl") Fixes: 190fec72df4a ("uprobe: Wire up uretprobe system call") Fixes: 63ded110979b ("uprobe: Change uretprobe syscall scope and number") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-31binder: frozen notificationYu-Ting Tseng
Frozen processes present a significant challenge in binder transactions. When a process is frozen, it cannot, by design, accept and/or respond to binder transactions. As a result, the sender needs to adjust its behavior, such as postponing transactions until the peer process unfreezes. However, there is currently no way to subscribe to these state change events, making it impossible to implement frozen-aware behaviors efficiently. Introduce a binder API for subscribing to frozen state change events. This allows programs to react to changes in peer process state, mitigating issues related to binder transactions sent to frozen processes. Implementation details: For a given binder_ref, the state of frozen notification can be one of the followings: 1. Userspace doesn't want a notification. binder_ref->freeze is null. 2. Userspace wants a notification but none is in flight. list_empty(&binder_ref->freeze->work.entry) = true 3. A notification is in flight and waiting to be read by userspace. binder_ref_freeze.sent is false. 4. A notification was read by userspace and kernel is waiting for an ack. binder_ref_freeze.sent is true. When a notification is in flight, new state change events are coalesced into the existing binder_ref_freeze struct. If userspace hasn't picked up the notification yet, the driver simply rewrites the state. Otherwise, the notification is flagged as requiring a resend, which will be performed once userspace acks the original notification that's inflight. See https://r.android.com/3070045 for how userspace is going to use this feature. Signed-off-by: Yu-Ting Tseng <yutingtseng@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20240709070047.4055369-4-yutingtseng@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-30drm/xe/oa/uapi: Make bit masks unsignedGeert Uytterhoeven
When building with gcc-5: In function ‘decode_oa_format.isra.26’, inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6: ././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant [...] ./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’ __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \ ^ drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’ u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt); ^ Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240729092634.2227611-1-geert+renesas@glider.be Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-30Merge tag 'v6.11-rc1' into for-6.12Tejun Heo
Linux 6.11-rc1
2024-07-29x86/elf: Add a new FPU buffer layout info to x86 core filesVignesh Balasubramanian
Add a new .note section containing type, size, offset and flags of every xfeature that is present. This information will be used by debuggers to understand the XSAVE layout of the machine where the core file has been dumped, and to read XSAVE registers, especially during cross-platform debugging. The XSAVE layouts of modern AMD and Intel CPUs differ, especially since Memory Protection Keys and the AVX-512 features have been inculcated into the AMD CPUs. Since AMD never adopted (and hence never left room in the XSAVE layout for) the Intel MPX feature, tools like GDB had assumed a fixed XSAVE layout matching that of Intel (based on the XCR0 mask). Hence, core dumps from AMD CPUs didn't match the known size for the XCR0 mask. This resulted in GDB and other tools not being able to access the values of the AVX-512 and PKRU registers on AMD CPUs. To solve this, an interim solution has been accepted into GDB, and is already a part of GDB 14, see https://sourceware.org/pipermail/gdb-patches/2023-March/198081.html. But it depends on heuristics based on the total XSAVE register set size and the XCR0 mask to infer the layouts of the various register blocks for core dumps, and hence, is not a foolproof mechanism to determine the layout of the XSAVE area. Therefore, add a new core dump note in order to allow GDB/LLDB and other relevant tools to determine the layout of the XSAVE area of the machine where the corefile was dumped. The new core dump note (which is being proposed as a per-process .note section), NT_X86_XSAVE_LAYOUT (0x205) contains an array of structures. Each structure describes an individual extended feature containing offset, size and flags in this format: struct x86_xfeat_component { u32 type; u32 size; u32 offset; u32 flags; }; and in an independent manner, allowing for future extensions without depending on hw arch specifics like CPUID etc. [ bp: Massage commit message, zap trailing whitespace. ] Co-developed-by: Jini Susan George <jinisusan.george@amd.com> Signed-off-by: Jini Susan George <jinisusan.george@amd.com> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Vignesh Balasubramanian <vigbalas@amd.com> Link: https://lore.kernel.org/r/20240725161017.112111-2-vigbalas@amd.com
2024-07-29Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to get a late RC of v6.10 before moving into v6.11. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-29spi: Enable controllers to extend the SPI protocol with MOSI idle configurationMarcelo Schmitt
The behavior of an SPI controller data output line (SDO or MOSI or COPI (Controller Output Peripheral Input) for disambiguation) is usually not specified when the controller is not clocking out data on SCLK edges. However, there do exist SPI peripherals that require specific MOSI line state when data is not being clocked out of the controller. Conventional SPI controllers may set the MOSI line on SCLK edges then bring it low when no data is going out or leave the line the state of the last transfer bit. More elaborated controllers are capable to set the MOSI idle state according to different configurable levels and thus are more suitable for interfacing with demanding peripherals. Add SPI mode bits to allow peripherals to request explicit MOSI idle state when needed. When supporting a particular MOSI idle configuration, the data output line state is expected to remain at the configured level when the controller is not clocking out data. When a device that needs a specific MOSI idle state is identified, its driver should request the MOSI idle configuration by setting the proper SPI mode bit. Acked-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Tested-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Marcelo Schmitt <marcelo.schmitt@analog.com> Link: https://patch.msgid.link/9802160b5e5baed7f83ee43ac819cb757a19be55.1720810545.git.marcelo.schmitt@analog.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-07-28nbd: add support for rotational devicesWouter Verhelst
The NBD protocol defines the flag NBD_FLAG_ROTATIONAL to flag that the export in use should be treated as a rotational device. Add support for that flag to the kernel driver. Signed-off-by: Wouter Verhelst <w@uter.be> Reviewed-by: Eric Blake <eblake@redhat.com> Link: https://lore.kernel.org/r/20240725164536.1275851-1-w@uter.be Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-25drm/amdkfd: allow users to target recommended SDMA enginesJonathan Kim
Certain GPUs have better copy performance over xGMI on specific SDMA engines depending on the source and destination GPU. Allow users to create SDMA queues on these recommended engines. Close to 2x overall performance has been observed with this optimization. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-25Merge tag 'net-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...
2024-07-25Merge tag 'uml-for-linus-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Support for preemption - i386 Rust support - Huge cleanup by Benjamin Berg - UBSAN support - Removal of dead code * tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits) um: vector: always reset vp->opened um: vector: remove vp->lock um: register power-off handler um: line: always fill *error_out in setup_one_line() um: remove pcap driver from documentation um: Enable preemption in UML um: refactor TLB update handling um: simplify and consolidate TLB updates um: remove force_flush_all from fork_handler um: Do not flush MM in flush_thread um: Delay flushing syscalls until the thread is restarted um: remove copy_context_skas0 um: remove LDT support um: compress memory related stub syscalls while adding them um: Rework syscall handling um: Add generic stub_syscall6 function um: Create signal stack memory assignment in stub_data um: Remove stub-data.h include from common-offsets.h um: time-travel: fix signal blocking race/hang um: time-travel: remove time_exit() ...
2024-07-25Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-07-25 We've added 14 non-merge commits during the last 8 day(s) which contain a total of 19 files changed, 177 insertions(+), 70 deletions(-). The main changes are: 1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and BPF sockhash. Also add test coverage for this case, from Michal Luczaj. 2) Fix a segmentation issue when downgrading gso_size in the BPF helper bpf_skb_adjust_room(), from Fred Li. 3) Fix a compiler warning in resolve_btfids due to a missing type cast, from Liwei Song. 4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte boundary in the fexit_sleep BPF selftest, from Puranjay Mohan. 5) Fix a xsk regression to require a flag when actuating tx_metadata_len, from Stanislav Fomichev. 6) Fix function prototype BTF dumping in libbpf for prototypes that have no input arguments, from Andrii Nakryiko. 7) Fix stacktrace symbol resolution in perf script for BPF programs containing subprograms, from Hou Tao. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids bpf, events: Use prog to emit ksymbol event for main program selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected() af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash bpftool: Fix typo in usage help libbpf: Fix no-args func prototype BTF dumping syntax MAINTAINERS: Update powerpc BPF JIT maintainers MAINTAINERS: Update email address of Naveen selftests/bpf: fexit_sleep: Fix stack allocation for arm64 ==================== Link: https://patch.msgid.link/20240725114312.32197-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_lenStanislav Fomichev
Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len") can break existing use cases which don't zero-initialize xdp_umem_reg padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we interpret the padding as tx_metadata_len only when being explicitly asked. Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len") Reported-by: Julian Schindel <mail@arctic-alpaca.de> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20240713015253.121248-2-sdf@fomichev.me
2024-07-24drm/virtio: Add DRM capset definitionDmitry Osipenko
Define DRM native context capset in the VirtIO-GPU protocol header. Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240714205502.3409718-1-dmitry.osipenko@collabora.com
2024-07-24Merge tag 'random-6.11-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "This adds getrandom() support to the vDSO. First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which lets the kernel zero out pages anytime under memory pressure, which enables allocating memory that never gets swapped to disk but also doesn't count as being mlocked. Then, the vDSO implementation of getrandom() is introduced in a generic manner and hooked into random.c. Next, this is implemented on x86. (Also, though it's not ready for this pull, somebody has begun an arm64 implementation already) Finally, two vDSO selftests are added. There are also two housekeeping cleanup commits" * tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: MAINTAINERS: add random.h headers to RNG subsection random: note that RNDGETPOOL was removed in 2.6.9-rc2 selftests/vDSO: add tests for vgetrandom x86: vdso: Wire up getrandom() vDSO implementation random: introduce generic vDSO getrandom() implementation mm: add MAP_DROPPABLE for designating always lazily freeable mappings
2024-07-21Merge tag 'mm-stable-2024-07-21-14-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ...
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-07-20Merge tag 'landlock-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This simplifies code and improves documentation" * tag 'landlock-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Various documentation improvements landlock: Clarify documentation for struct landlock_ruleset_attr landlock: Use bit-fields for storing handled layer access masks
2024-07-19Merge tag 'char-misc-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc and other driver updates from Greg KH: "Here is the "big" set of char/misc and other driver subsystem changes for 6.11-rc1. Nothing major in here, just loads of new drivers and updates. Included in here are: - IIO api updates and new drivers added - wait_interruptable_timeout() api cleanups for some drivers - MODULE_DESCRIPTION() additions for loads of drivers - parport out-of-bounds fix - interconnect driver updates and additions - mhi driver updates and additions - w1 driver fixes - binder speedups and fixes - eeprom driver updates - coresight driver updates - counter driver update - new misc driver additions - other minor api updates All of these, EXCEPT for the final Kconfig build fix for 32bit systems, have been in linux-next for a while with no reported issues. The Kconfig fixup went in 29 hours ago, so might have missed the latest linux-next, but was acked by everyone involved" * tag 'char-misc-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (330 commits) misc: Kconfig: exclude mrvl-cn10k-dpi compilation for 32-bit systems misc: delete Makefile.rej binder: fix hang of unregistered readers misc: Kconfig: add a new dependency for MARVELL_CN10K_DPI virtio: add missing MODULE_DESCRIPTION() macro agp: uninorth: add missing MODULE_DESCRIPTION() macro spmi: add missing MODULE_DESCRIPTION() macros dev/parport: fix the array out-of-bounds risk samples: configfs: add missing MODULE_DESCRIPTION() macro misc: mrvl-cn10k-dpi: add Octeon CN10K DPI administrative driver misc: keba: Fix missing AUXILIARY_BUS dependency slimbus: Fix struct and documentation alignment in stream.c MAINTAINERS: CC dri-devel list on Qualcomm FastRPC patches misc: fastrpc: use coherent pool for untranslated Compute Banks misc: fastrpc: support complete DMA pool access to the DSP misc: fastrpc: add missing MODULE_DESCRIPTION() macro misc: fastrpc: Add missing dev_err newlines misc: fastrpc: Use memdup_user() nvmem: core: Implement force_ro sysfs attribute nvmem: Use sysfs_emit() for type attribute ...
2024-07-19Merge tag 'sound-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "Lots of changes in this cycle, but mostly for cleanups and refactoring. Significant amount of changes are about DT schema conversions for ASoC at this time while we see other usual suspects, too. Some highlights below: Core: - Re-introduction of PCM sync ID support API - MIDI2 time-base extension in ALSA sequencer API ASoC: - Syncing of features between simple-audio-card and the two audio-graph cards - Support for specifying the order of operations for components within cards to allow quirking for unusual systems - Lots of DT schema conversions - Continued SOF/Intel updates for topology, SoundWire, IPC3/4 - New support for Asahi Kasei AK4619, Cirrus Logic CS530x, Everest Semiconductors ES8311, NXP i.MX95 and LPC32xx, Qualcomm LPASS v2.5 and WCD937x, Realtek RT1318 and RT1320 and Texas Instruments PCM5242 HD-audio: - More quirks, Intel PantherLake support, senarytech codec support - Refactoring of Cirrus codec component-binding Others: - ALSA control kselftest improvements, and fixes for input value checks in various drivers" * tag 'sound-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (349 commits) kselftest/alsa: Log the PCM ID in pcm-test kselftest/alsa: Use card name rather than number in test names ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360 ALSA: hda/tas2781: Add new quirk for Lenovo Hera2 Laptop ALSA: seq: ump: Skip useless ports for static blocks ALSA: pcm_dmaengine: Don't synchronize DMA channel when DMA is paused ALSA: usb: Use BIT() for bit values ALSA: usb: Fix UBSAN warning in parse_audio_unit() ALSA: hda/realtek: Enable headset mic on Positivo SU C1400 ASoC: tas2781: Add new Kontrol to set tas2563 digital Volume ASoC: codecs: wcd937x: Remove separate handling for vdd-buck supply ASoC: codecs: wcd937x: Remove the string compare in MIC BIAS widget settings ASoC: codecs: wcd937x-sdw: Fix Unbalanced pm_runtime_enable ASoC: dt-bindings: cirrus,cs42xx8: Convert to dtschema ASoC: cs530x: Remove bclk from private structure ASoC: cs530x: Calculate proper bclk rate using TDM ASoC: dt-bindings: cirrus,cs4270: Convert to dtschema firmware: cs_dsp: Rename fw_ver to wmfw_ver firmware: cs_dsp: Clarify wmfw format version log message firmware: cs_dsp: Make wmfw and bin filename arguments const char * ...
2024-07-19random: note that RNDGETPOOL was removed in 2.6.9-rc2Jason A. Donenfeld
RNDGETPOOL was thankfully removed twenty years ago, but it's stuck around in headers. Probably removing it from uapi headers isn't great in case there are some weird users out there, but we should at least mark this as having been removed, to save future readers the same goose chase I just went on. Link: https://lore.kernel.org/all/E1By1St-0001TS-Qj@thunk.org/ Link: https://lore.kernel.org/all/Pine.LNX.4.58.0409130937050.4094@ppc970.osdl.org/ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>