summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2021-02-11bpf: Be less specific about socket cookies guaranteesFlorent Revest
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one" socket cookies are not guaranteed to be non-decreasing. The bpf_get_socket_cookie helper descriptions are currently specifying that cookies are non-decreasing but we don't want users to rely on that. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org
2021-02-11bpf: Count the number of times recursion was preventedAlexei Starovoitov
Add per-program counter for number of times recursion prevention mechanism was triggered and expose it via show_fdinfo and bpf_prog_info. Teach bpftool to print it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com
2021-02-11dm: add support for passing through inline crypto supportSatya Tangirala
Update the device-mapper core to support exposing the inline crypto support of the underlying device(s) through the device-mapper device. This works by creating a "passthrough keyslot manager" for the dm device, which declares support for encryption settings which all underlying devices support. When a supported setting is used, the bio cloning code handles cloning the crypto context to the bios for all the underlying devices. When an unsupported setting is used, the blk-crypto fallback is used as usual. Crypto support on each underlying device is ignored unless the corresponding dm target opts into exposing it. This is needed because for inline crypto to semantically operate on the original bio, the data must not be transformed by the dm target. Thus, targets like dm-linear can expose crypto support of the underlying device, but targets like dm-crypt can't. (dm-crypt could use inline crypto itself, though.) A DM device's table can only be changed if the "new" inline encryption capabilities are a (*not* necessarily strict) superset of the "old" inline encryption capabilities. Attempts to make changes to the table that result in some inline encryption capability becoming no longer supported will be rejected. For the sake of clarity, key eviction from underlying devices will be handled in a future patch. Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-10net/sched: cls_flower: Reject invalid ct_state flags ruleswenxu
Reject the unsupported and invalid ct_state flags of cls flower rules. Fixes: e0ace68af2ac ("net/sched: cls_flower: Add matching on conntrack info") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
2021-02-10KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWRRavi Bangoria
Introduce KVM_CAP_PPC_DAWR1 which can be used by QEMU to query whether KVM supports 2nd DAWR or not. The capability is by default disabled even when the underlying CPU supports 2nd DAWR. QEMU needs to check and enable it manually to use the feature. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2021-02-09uapi: map_to_7segment: Update example in documentationGeert Uytterhoeven
The device_attribute .show() and .store() methods gained an extra parameter in v2.6.13, but the example in the documentation for the 7-segment header file was never updated. Add the missing parameters. While at it, get rid of the (misspelled) deprecated symbolic permissions, and switch to DEVICE_ATTR_RW(), which was introduced in v3.11 Fixes: 54b6f35c99974e99 ("[PATCH] Driver core: change device_attribute callbacks") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20210207130543.2128980-1-geert@linux-m68k.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce irqfdShuo Liu
irqfd is a mechanism to inject a specific interrupt to a User VM using a decoupled eventfd mechanism. Vhost is a kernel-level virtio server which uses eventfd for interrupt injection. To support vhost on ACRN, irqfd is introduced in HSM. HSM provides ioctls to associate a virtual Message Signaled Interrupt (MSI) with an eventfd. The corresponding virtual MSI will be injected into a User VM once the eventfd got signal. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-17-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce ioeventfdShuo Liu
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd signal when written to by a User VM. ACRN userspace can register any arbitrary I/O address with a corresponding eventfd and then pass the eventfd to a specific end-point of interest for handling. Vhost is a kernel-level virtio server which uses eventfd for signalling. To support vhost on ACRN, ioeventfd is introduced in HSM. A new I/O client dedicated to ioeventfd is associated with a User VM during VM creation. HSM provides ioctls to associate an I/O region with a eventfd. The I/O client signals a eventfd once its corresponding I/O region is matched with an I/O request. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-16-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interfaces to query C-states and P-states allowed by ↵Shuo Liu
hypervisor The C-states and P-states data are used to support CPU power management. The hypervisor controls C-states and P-states for a User VM. ACRN userspace need to query the data from the hypervisor to build ACPI tables for a User VM. HSM provides ioctls for ACRN userspace to query C-states and P-states data obtained from the hypervisor. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-14-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interrupt injection interfacesShuo Liu
ACRN userspace need to inject virtual interrupts into a User VM in devices emulation. HSM needs provide interfaces to do so. Introduce following interrupt injection interfaces: ioctl ACRN_IOCTL_SET_IRQLINE: Pass data from userspace to the hypervisor, and inform the hypervisor to inject a virtual IOAPIC GSI interrupt to a User VM. ioctl ACRN_IOCTL_INJECT_MSI: Pass data struct acrn_msi_entry from userspace to the hypervisor, and inform the hypervisor to inject a virtual MSI to a User VM. ioctl ACRN_IOCTL_VM_INTR_MONITOR: Set a 4-Kbyte aligned shared page for statistics information of interrupts of a User VM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-13-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce interfaces for PCI device passthroughShuo Liu
PCI device passthrough enables an OS in a virtual machine to directly access a PCI device in the host. It promises almost the native performance, which is required in performance-critical scenarios of ACRN. HSM provides the following ioctls: - Assign - ACRN_IOCTL_ASSIGN_PCIDEV Pass data struct acrn_pcidev from userspace to the hypervisor, and inform the hypervisor to assign a PCI device to a User VM. - De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV Pass data struct acrn_pcidev from userspace to the hypervisor, and inform the hypervisor to de-assign a PCI device from a User VM. - Set a interrupt of a passthrough device - ACRN_IOCTL_SET_PTDEV_INTR Pass data struct acrn_ptdev_irq from userspace to the hypervisor, and inform the hypervisor to map a INTx interrupt of passthrough device of User VM. - Reset passthrough device interrupt - ACRN_IOCTL_RESET_PTDEV_INTR Pass data struct acrn_ptdev_irq from userspace to the hypervisor, and inform the hypervisor to unmap a INTx interrupt of passthrough device of User VM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-12-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce PCI configuration space PIO accesses combinerShuo Liu
A User VM can access its virtual PCI configuration spaces via port IO approach, which has two following steps: 1) writes address into port 0xCF8 2) put/get data in/from port 0xCFC To distribute a complete PCI configuration space access one time, HSM need to combine such two accesses together. Combine two paired PIO I/O requests into one PCI I/O request and continue the I/O request distribution. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-11-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce I/O request managementShuo Liu
An I/O request of a User VM, which is constructed by the hypervisor, is distributed by the ACRN Hypervisor Service Module to an I/O client corresponding to the address range of the I/O request. For each User VM, there is a shared 4-KByte memory region used for I/O requests communication between the hypervisor and Service VM. An I/O request is a 256-byte structure buffer, which is 'struct acrn_io_request', that is filled by an I/O handler of the hypervisor when a trapped I/O access happens in a User VM. ACRN userspace in the Service VM first allocates a 4-KByte page and passes the GPA (Guest Physical Address) of the buffer to the hypervisor. The buffer is used as an array of 16 I/O request slots with each I/O request slot being 256 bytes. This array is indexed by vCPU ID. An I/O client, which is 'struct acrn_ioreq_client', is responsible for handling User VM I/O requests whose accessed GPA falls in a certain range. Multiple I/O clients can be associated with each User VM. There is a special client associated with each User VM, called the default client, that handles all I/O requests that do not fit into the range of any other I/O clients. The ACRN userspace acts as the default client for each User VM. The state transitions of a ACRN I/O request are as follows. FREE -> PENDING -> PROCESSING -> COMPLETE -> FREE -> ... FREE: this I/O request slot is empty PENDING: a valid I/O request is pending in this slot PROCESSING: the I/O request is being processed COMPLETE: the I/O request has been processed An I/O request in COMPLETE or FREE state is owned by the hypervisor. HSM and ACRN userspace are in charge of processing the others. The processing flow of I/O requests are listed as following: a) The I/O handler of the hypervisor will fill an I/O request with PENDING state when a trapped I/O access happens in a User VM. b) The hypervisor makes an upcall, which is a notification interrupt, to the Service VM. c) The upcall handler schedules a worker to dispatch I/O requests. d) The worker looks for the PENDING I/O requests, assigns them to different registered clients based on the address of the I/O accesses, updates their state to PROCESSING, and notifies the corresponding client to handle. e) The notified client handles the assigned I/O requests. f) The HSM updates I/O requests states to COMPLETE and notifies the hypervisor of the completion via hypercalls. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-10-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce EPT mapping managementShuo Liu
The HSM provides hypervisor services to the ACRN userspace. While launching a User VM, ACRN userspace needs to allocate memory and request the ACRN Hypervisor to set up the EPT mapping for the VM. A mapping cache is introduced for accelerating the translation between the Service VM kernel virtual address and User VM physical address. >From the perspective of the hypervisor, the types of GPA of User VM can be listed as following: 1) RAM region, which is used by User VM as system ram. 2) MMIO region, which is recognized by User VM as MMIO. MMIO region is used to be utilized for devices emulation. Generally, User VM RAM regions mapping is set up before VM started and is released in the User VM destruction. MMIO regions mapping may be set and unset dynamically during User VM running. To achieve this, ioctls ACRN_IOCTL_SET_MEMSEG and ACRN_IOCTL_UNSET_MEMSEG are introduced in HSM. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-9-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce an ioctl to set vCPU registers stateShuo Liu
A virtual CPU of User VM has different context due to the different registers state. ACRN userspace needs to set the virtual CPU registers state (e.g. giving a initial registers state to a virtual BSP of a User VM). HSM provides an ioctl ACRN_IOCTL_SET_VCPU_REGS to do the virtual CPU registers state setting. The ioctl passes the registers state from ACRN userspace to the hypervisor directly. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-8-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-09virt: acrn: Introduce VM management interfacesShuo Liu
The VM management interfaces expose several VM operations to ACRN userspace via ioctls. For example, creating VM, starting VM, destroying VM and so on. The ACRN Hypervisor needs to exchange data with the ACRN userspace during the VM operations. HSM provides VM operation ioctls to the ACRN userspace and communicates with the ACRN Hypervisor for VM operations via hypercalls. HSM maintains a list of User VM. Each User VM will be bound to an existing file descriptor of /dev/acrn_hsm. The User VM will be destroyed when the file descriptor is closed. Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yu Wang <yu1.wang@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210207031040.49576-7-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-08rtnetlink: Add RTM_F_OFFLOAD_FAILED flagAmit Cohen
The flag indicates to user space that route offload failed. Previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, but if the offload fails there is no indication to user-space. The flag will be used in subsequent patches by netdevsim and mlxsw to indicate to user space that route offload failed, so that users will have better visibility into the offload process. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-08Merge tag 'batadv-next-pullrequest-20210208' of ↵Jakub Kicinski
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This feature/cleanup patchset is an updated version of the pull request of Feb 2nd (batadv-next-pullrequest-20210202) and includes the following patches: - Bump version strings, by Simon Wunderlich (added commit log) - Drop publication years from copyright info, by Sven Eckelmann (replaced the previous patch which updated copyright years, as per our discussion) - Avoid sizeof on flexible structure, by Sven Eckelmann (unchanged) - Fix names for kernel-doc blocks, by Sven Eckelmann (unchanged) * tag 'batadv-next-pullrequest-20210208' of git://git.open-mesh.org/linux-merge: batman-adv: Fix names for kernel-doc blocks batman-adv: Avoid sizeof on flexible structure batman-adv: Drop publication years from copyright info batman-adv: Start new development cycle ==================== Link: https://lore.kernel.org/r/20210208165938.13262-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-08gfs2: Add trusted xattr supportAndreas Gruenbacher
Add support for an additional filesystem version (sb_fs_format = 1802). When a filesystem with the new version is mounted, the filesystem supports "trusted.*" xattrs. In addition, version 1802 filesystems implement a form of forward compatibility for xattrs: when xattrs with an unknown prefix (ea_type) are found on a version 1802 filesystem, those attributes are not shown by listxattr, and they are not accessible by getxattr, setxattr, or removexattr. This mechanism might turn out to be what we need in the future, but if not, we can always bump the filesystem version and break compatibility instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Price <anprice@redhat.com>
2021-02-08Merge 5.11-rc7 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-07fs-verity: support reading signature with ioctlEric Biggers
Add support for FS_VERITY_METADATA_TYPE_SIGNATURE to FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to retrieve the built-in signature (if present) of a verity file for serving to a client which implements fs-verity compatible verification. See the patch which introduced FS_IOC_READ_VERITY_METADATA for more details. The ability for userspace to read the built-in signatures is also useful because it allows a system that is using the in-kernel signature verification to migrate to userspace signature verification. This has been tested using a new xfstest which calls this ioctl via a new subcommand for the 'fsverity' program from fsverity-utils. Link: https://lore.kernel.org/r/20210115181819.34732-7-ebiggers@kernel.org Reviewed-by: Victor Hsieh <victorhsieh@google.com> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-02-07fs-verity: support reading descriptor with ioctlEric Biggers
Add support for FS_VERITY_METADATA_TYPE_DESCRIPTOR to FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to retrieve the fs-verity descriptor of a file for serving to a client which implements fs-verity compatible verification. See the patch which introduced FS_IOC_READ_VERITY_METADATA for more details. "fs-verity descriptor" here means only the part that userspace cares about because it is hashed to produce the file digest. It doesn't include the signature which ext4 and f2fs append to the fsverity_descriptor struct when storing it on-disk, since that way of storing the signature is an implementation detail. The next patch adds a separate metadata_type value for retrieving the signature separately. This has been tested using a new xfstest which calls this ioctl via a new subcommand for the 'fsverity' program from fsverity-utils. Link: https://lore.kernel.org/r/20210115181819.34732-6-ebiggers@kernel.org Reviewed-by: Victor Hsieh <victorhsieh@google.com> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-02-07fs-verity: support reading Merkle tree with ioctlEric Biggers
Add support for FS_VERITY_METADATA_TYPE_MERKLE_TREE to FS_IOC_READ_VERITY_METADATA. This allows a userspace server program to retrieve the Merkle tree of a verity file for serving to a client which implements fs-verity compatible verification. See the patch which introduced FS_IOC_READ_VERITY_METADATA for more details. This has been tested using a new xfstest which calls this ioctl via a new subcommand for the 'fsverity' program from fsverity-utils. Link: https://lore.kernel.org/r/20210115181819.34732-5-ebiggers@kernel.org Reviewed-by: Victor Hsieh <victorhsieh@google.com> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-02-07fs-verity: add FS_IOC_READ_VERITY_METADATA ioctlEric Biggers
Add an ioctl FS_IOC_READ_VERITY_METADATA which will allow reading verity metadata from a file that has fs-verity enabled, including: - The Merkle tree - The fsverity_descriptor (not including the signature if present) - The built-in signature, if present This ioctl has similar semantics to pread(). It is passed the type of metadata to read (one of the above three), and a buffer, offset, and size. It returns the number of bytes read or an error. Separate patches will add support for each of the above metadata types. This patch just adds the ioctl itself. This ioctl doesn't make any assumption about where the metadata is stored on-disk. It does assume the metadata is in a stable format, but that's basically already the case: - The Merkle tree and fsverity_descriptor are defined by how fs-verity file digests are computed; see the "File digest computation" section of Documentation/filesystems/fsverity.rst. Technically, the way in which the levels of the tree are ordered relative to each other wasn't previously specified, but it's logical to put the root level first. - The built-in signature is the value passed to FS_IOC_ENABLE_VERITY. This ioctl is useful because it allows writing a server program that takes a verity file and serves it to a client program, such that the client can do its own fs-verity compatible verification of the file. This only makes sense if the client doesn't trust the server and if the server needs to provide the storage for the client. More concretely, there is interest in using this ability in Android to export APK files (which are protected by fs-verity) to "protected VMs". This would use Protected KVM (https://lwn.net/Articles/836693), which provides an isolated execution environment without having to trust the traditional "host". A "guest" VM can boot from a signed image and perform specific tasks in a minimum trusted environment using files that have fs-verity enabled on the host, without trusting the host or requiring that the guest has its own trusted storage. Technically, it would be possible to duplicate the metadata and store it in separate files for serving. However, that would be less efficient and would require extra care in userspace to maintain file consistency. In addition to the above, the ability to read the built-in signatures is useful because it allows a system that is using the in-kernel signature verification to migrate to userspace signature verification. Link: https://lore.kernel.org/r/20210115181819.34732-4-ebiggers@kernel.org Reviewed-by: Victor Hsieh <victorhsieh@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-02-07Merge tag 'core_urgent_for_v5.11_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull syscall entry fixes from Borislav Petkov: - For syscall user dispatch, separate prctl operation from syscall redirection range specification before the API has been made official in 5.11. - Ensure tasks using the generic syscall code do trap after returning from a syscall when single-stepping is requested. * tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Use different define for selector variable in SUD entry: Ensure trap after single-step on system call return
2021-02-06batman-adv: Drop publication years from copyright infoSven Eckelmann
The batman-adv source code was using the year of publication (to net-next) as "last" year for the copyright statement. The whole source code mentioned in the MAINTAINERS "BATMAN ADVANCED" section was handled as a single entity regarding the publishing year. This avoided having outdated (in sense of year information - not copyright holder) publishing information inside several files. But since the simple "update copyright year" commit (without other changes) in the file was not well received in the upstream kernel, the option to not have a copyright year (for initial and last publication) in the files are chosen instead. More detailed information about the years can still be retrieved from the SCM system. Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2021-02-06entry: Use different define for selector variable in SUDGabriel Krisman Bertazi
Michael Kerrisk suggested that, from an API perspective, it is a bad idea to share the PR_SYS_DISPATCH_ defines between the prctl operation and the selector variable. Therefore, define two new constants to be used by SUD's selector variable and update the corresponding documentation and test cases. While this changes the API syscall user dispatch has never been part of a Linux release, it will show up for the first time in 5.11. Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
2021-02-04Revert "GTP: add support for flow based tunneling API"Jonas Bonn
This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a. This patch was committed without maintainer approval and despite a number of unaddressed concerns from review. There are several issues that impede the acceptance of this patch and that make a reversion of this particular instance of these changes the best way forward: i) the patch contains several logically separate changes that would be better served as smaller patches (for review purposes) ii) functionality like the handling of end markers has been introduced without further explanation iii) symmetry between the handling of GTPv0 and GTPv1 has been unnecessarily broken iv) the patchset produces 'broken' packets when extension headers are included v) there are no available userspace tools to allow for testing this functionality vi) there is an unaddressed Coverity report against the patch concering memory leakage vii) most importantly, the patch contains a large amount of superfluous churn that impedes other ongoing work with this driver This patch will be reworked into a series that aligns with other ongoing work and facilitates review. Signed-off-by: Jonas Bonn <jonas@norrbonn.se> Acked-by: Harald Welte <laforge@gnumonks.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04KVM: x86: declare Xen HVM shared info capability and add test caseDavid Woodhouse
Instead of adding a plethora of new KVM_CAP_XEN_FOO capabilities, just add bits to the return value of KVM_CAP_XEN_HVM. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: Add event channel interrupt vector upcallDavid Woodhouse
It turns out that we can't handle event channels *entirely* in userspace by delivering them as ExtINT, because KVM is a bit picky about when it accepts ExtINT interrupts from a legacy PIC. The in-kernel local APIC has to have LVT0 configured in APIC_MODE_EXTINT and unmasked, which isn't necessarily the case for Xen guests especially on secondary CPUs. To cope with this, add kvm_xen_get_interrupt() which checks the evtchn_pending_upcall field in the Xen vcpu_info, and delivers the Xen upcall vector (configured by KVM_XEN_ATTR_TYPE_UPCALL_VECTOR) if it's set regardless of LAPIC LVT0 configuration. This gives us the minimum support we need for completely userspace-based implementation of event channels. This does mean that vcpu_enter_guest() needs to check for the evtchn_pending_upcall flag being set, because it can't rely on someone having set KVM_REQ_EVENT unless we were to add some way for userspace to do so manually. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: register vcpu time info regionJoao Martins
Allow the Xen emulated guest the ability to register secondary vcpu time information. On Xen guests this is used in order to be mapped to userspace and hence allow vdso gettimeofday to work. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: register vcpu infoJoao Martins
The vcpu info supersedes the per vcpu area of the shared info page and the guest vcpus will use this instead. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: Add KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTRDavid Woodhouse
This will be used for per-vCPU setup such as runstate and vcpu_info. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: register shared_info pageJoao Martins
Add KVM_XEN_ATTR_TYPE_SHARED_INFO to allow hypervisor to know where the guest's shared info page is. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: latch long_mode when hypercall page is set upDavid Woodhouse
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: add KVM_XEN_HVM_SET_ATTR/KVM_XEN_HVM_GET_ATTRJoao Martins
This will be used to set up shared info pages etc. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: x86/xen: intercept xen hypercalls if enabledJoao Martins
Add a new exit reason for emulator to handle Xen hypercalls. Since this means KVM owns the ABI, dispense with the facility for the VMM to provide its own copy of the hypercall pages; just fill them in directly using VMCALL/VMMCALL as we do for the Hyper-V hypercall page. This behaviour is enabled by a new INTERCEPT_HCALL flag in the KVM_XEN_HVM_CONFIG ioctl structure, and advertised by the same flag being returned from the KVM_CAP_XEN_HVM check. Rename xen_hvm_config() to kvm_xen_write_hypercall_page() and move it to the nascent xen.c while we're at it, and add a test case. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2021-02-04KVM: VMX: Enable bus lock VM exitChenyi Qiang
Virtual Machine can exploit bus locks to degrade the performance of system. Bus lock can be caused by split locked access to writeback(WB) memory or by using locks on uncacheable(UC) memory. The bus lock is typically >1000 cycles slower than an atomic operation within a cache line. It also disrupts performance on other cores (which must wait for the bus lock to be released before their memory operations can complete). To address the threat, bus lock VM exit is introduced to notify the VMM when a bus lock was acquired, allowing it to enforce throttling or other policy based mitigations. A VMM can enable VM exit due to bus locks by setting a new "Bus Lock Detection" VM-execution control(bit 30 of Secondary Processor-based VM execution controls). If delivery of this VM exit was preempted by a higher priority VM exit (e.g. EPT misconfiguration, EPT violation, APIC access VM exit, APIC write VM exit, exception bitmap exiting), bit 26 of exit reason in vmcs field is set to 1. In current implementation, the KVM exposes this capability through KVM_CAP_X86_BUS_LOCK_EXIT. The user can get the supported mode bitmap (i.e. off and exit) and enable it explicitly (disabled by default). If bus locks in guest are detected by KVM, exit to user space even when current exit reason is handled by KVM internally. Set a new field KVM_RUN_BUS_LOCK in vcpu->run->flags to inform the user space that there is a bus lock detected in guest. Document for Bus Lock VM exit is now available at the latest "Intel Architecture Instruction Set Extensions Programming Reference". Document Link: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20201106090315.18606-4-chenyi.qiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04KVM/SVM: add support for SEV attestation commandBrijesh Singh
The SEV FW version >= 0.23 added a new command that can be used to query the attestation report containing the SHA-256 digest of the guest memory encrypted through the KVM_SEV_LAUNCH_UPDATE_{DATA, VMSA} commands and sign the report with the Platform Endorsement Key (PEK). See the SEV FW API spec section 6.8 for more details. Note there already exist a command (KVM_SEV_LAUNCH_MEASURE) that can be used to get the SHA-256 digest. The main difference between the KVM_SEV_LAUNCH_MEASURE and KVM_SEV_ATTESTATION_REPORT is that the latter can be called while the guest is running and the measurement value is signed with PEK. Cc: James Bottomley <jejb@linux.ibm.com> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Cc: David Rientjes <rientjes@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: John Allen <john.allen@amd.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: linux-crypto@vger.kernel.org Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: James Bottomley <jejb@linux.ibm.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Message-Id: <20210104151749.30248-1-brijesh.singh@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-03ethtool: Extend link modes settings uAPI with lanesDanielle Ratson
Currently, when auto negotiation is on, the user can advertise all the linkmodes which correspond to a specific speed, but does not have a similar selector for the number of lanes. This is significant when a specific speed can be achieved using different number of lanes. For example, 2x50 or 4x25. Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct ethtool_link_settings' with lanes field in order to implement a new lanes-selector that will enable the user to advertise a specific number of lanes as well. When auto negotiation is off, lanes parameter can be forced only if the driver supports it. Add a capability bit in 'struct ethtool_ops' that allows ethtool know if the driver can handle the lanes parameter when auto negotiation is off, so if it does not, an error message will be returned when trying to set lanes. Example: $ ethtool -s swp1 lanes 4 $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseSR/Full 50000baseCR2/Full 100000baseSR4/Full 100000baseCR4/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Link detected: no Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-03Merge tag 'tee-housekeeping-for-v5.12' of ↵Arnd Bergmann
git://git.linaro.org:/people/jens.wiklander/linux-tee into arm/drivers TEE subsystem housekeeping - Fixes some comment typos in header files - Updates to use flexible-array member instead of zero-length array - Syncs internal OP-TEE headers with the ones from OP-TEE OS * tag 'tee-housekeeping-for-v5.12' of git://git.linaro.org:/people/jens.wiklander/linux-tee: optee: sync OP-TEE headers tee: optee: fix 'physical' typos drivers: optee: use flexible-array member instead of zero-length array tee: fix some comment typos in header files Link: https://lore.kernel.org/r/20210203120742.GA3624453@jade Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02tee: fix some comment typos in header filesElvira Khabirova
struct tee_param: revc -> recv. TEE_IOC_SUPPL_SEND: typo introduced by copy-pasting, replace invalid description with description from the according argument struct. Signed-off-by: Elvira Khabirova <e.khabirova@omprussia.ru> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
2021-02-01vfio: interfaces to update vaddrSteve Sistare
Define interfaces that allow the underlying memory object of an iova range to be mapped to a new host virtual address in the host process: - VFIO_DMA_UNMAP_FLAG_VADDR for VFIO_IOMMU_UNMAP_DMA - VFIO_DMA_MAP_FLAG_VADDR flag for VFIO_IOMMU_MAP_DMA - VFIO_UPDATE_VADDR extension for VFIO_CHECK_EXTENSION Unmap vaddr invalidates the host virtual address in an iova range, and blocks vfio translation of host virtual addresses. DMA to already-mapped pages continues. Map vaddr updates the base VA and resumes translation. See comments in uapi/linux/vfio.h for more details. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-02-01vfio: option to unmap allSteve Sistare
For the UNMAP_DMA ioctl, delete all mappings if VFIO_DMA_UNMAP_FLAG_ALL is set. Define the corresponding VFIO_UNMAP_ALL extension. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-02-01Merge tag 'media/v5.11-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "The rockship rkisp1 driver will be promoted from staging in 5.11. While not too late, do a few uAPI changes which are needed to better support its functionalities" * tag 'media/v5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: rockchip: rkisp1: extend uapi array sizes media: rockchip: rkisp1: carry ip version information media: rockchip: rkisp1: reduce number of histogram grid elements in uapi media: rkisp1: stats: mask the hist_bins values media: rkisp1: stats: remove a wrong cast to u8 media: rkisp1: uapi: change hist_bins array type from __u16 to __u32
2021-02-01io_uring: Add skip option for __io_sqe_files_updatenoah
This patch adds support for skipping a file descriptor when using IORING_REGISTER_FILES_UPDATE. __io_sqe_files_update will skip fds set to IORING_REGISTER_FILES_SKIP. IORING_REGISTER_FILES_SKIP is inturn added as a #define in io_uring.h Signed-off-by: noah <goldstein.w.n@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-01io_uring: rename file related variables to rsrcBijan Mottahedeh
This is a prep rename patch for subsequent patches to generalize file registration. [io_uring_rsrc_update:: rename fds -> data] Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> [leave io_uring_files_update as struct] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-01Merge branch 'work.namei' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into for-5.12/io_uring Merge RESOLVE_CACHED bits from Al, as the io_uring changes will build on top of that. * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED fs: add support for LOOKUP_CACHED saner calling conventions for unlazy_child() fs: make unlazy_walk() error handling consistent fs/namei.c: Remove unlikely of status being -ECHILD in lookup_fast() do_tmpfile(): don't mess with finish_open()