summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2020-03-30Merge tag 'usb-5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / PHY updates from Greg KH: "Here are the big set of USB and PHY driver patches for 5.7-rc1. Nothing huge here, some new PHY drivers, loads of USB gadget fixes and updates, xhci updates, usb-serial driver updates and new device ids, and other minor things. Full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'usb-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (239 commits) USB: cdc-acm: restore capability check order usb: cdns3: make signed 1 bit bitfields unsigned usb: gadget: fsl: remove unused variable 'driver_desc' usb: gadget: f_fs: Fix use after free issue as part of queue failure usb: typec: Correct the documentation for typec_cable_put() USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback USB: serial: option: add Wistron Neweb D19Q1 USB: serial: option: add BroadMobi BM806U USB: serial: option: add support for ASKEY WWHC050 usb: core: Add ACPI support for USB interface devices driver core: platform: Reimplement devm_platform_ioremap_resource usb: dwc2: convert to devm_platform_get_and_ioremap_resource usb: host: hisilicon: convert to devm_platform_get_and_ioremap_resource usb: host: xhci-plat: convert to devm_platform_get_and_ioremap_resource drivers: provide devm_platform_get_and_ioremap_resource() phy: qcom-qusb2: Add new overriding tuning parameters in QUSB2 V2 PHY phy: qcom-qusb2: Add support for overriding tuning parameters in QUSB2 V2 PHY dt-bindings: phy: qcom-qusb2: Add support for overriding Phy tuning parameters phy: qcom-qusb2: Add generic QUSB2 V2 PHY support dt-bindings: phy: qcom,qusb2: Add compatibles for QUSB2 V2 phy and SC7180 ...
2020-03-30bpf: Add socket assign supportJoe Stringer
Add support for TPROXY via a new bpf helper, bpf_sk_assign(). This helper requires the BPF program to discover the socket via a call to bpf_sk*_lookup_*(), then pass this socket to the new helper. The helper takes its own reference to the socket in addition to any existing reference that may or may not currently be obtained for the duration of BPF processing. For the destination socket to receive the traffic, the traffic must be routed towards that socket via local route. The simplest example route is below, but in practice you may want to route traffic more narrowly (eg by CIDR): $ ip route add local default dev lo This patch avoids trying to introduce an extra bit into the skb->sk, as that would require more invasive changes to all code interacting with the socket to ensure that the bit is handled correctly, such as all error-handling cases along the path from the helper in BPF through to the orphan path in the input. Instead, we opt to use the destructor variable to switch on the prefetch of the socket. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
2020-03-30Merge tag 'media/v5.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - New sensor driver: imx219 - Support for some new pixelformats - Support for Sun8i SoC - Added more codecs to meson vdec driver - Prepare for removing the legacy usbvision driver by moving it to staging. This driver has issues and use legacy core APIs. If nobody steps up to address those, it is time for its retirement. - Several cleanups and improvements on drivers, with the addition of new supported boards * tag 'media/v5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (236 commits) media: venus: firmware: Ignore secure call error on first resume media: mtk-vpu: load vpu firmware from the new location media: i2c: video-i2c: fix build errors due to 'imply hwmon' media: MAINTAINERS: add myself to co-maintain Hantro G1/G2 for i.MX8MQ media: hantro: add initial i.MX8MQ support media: dt-bindings: Document i.MX8MQ VPU bindings media: vivid: fix incorrect PA assignment to HDMI outputs media: hantro: Add linux-rockchip mailing list to MAINTAINERS media: cedrus: h264: Fix 4K decoding on H6 media: siano: Use scnprintf() for avoiding potential buffer overflow media: rc: Use scnprintf() for avoiding potential buffer overflow media: allegro: create new struct for channel parameters media: allegro: move mail definitions to separate file media: allegro: pass buffers through firmware media: allegro: verify source and destination buffer in VCU response media: allegro: handle dependency of bitrate and bitrate_peak media: allegro: read bitrate mode directly from control media: allegro: make QP configurable media: allegro: make frame rate configurable media: allegro: skip filler data if possible ...
2020-03-30Merge tag 'seccomp-v5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "A couple of seccomp updates. They're both mostly bug fixes that I wanted to have sit in linux-next for a while: - allow TSYNC and USER_NOTIF together (Tycho Andersen) - add missing compat_ioctl for notify (Sven Schnelle)" * tag 'seccomp-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Add missing compat_ioctl for notify seccomp: allow TSYNC and USER_NOTIF together
2020-03-30Merge tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: "Here are the io_uring changes for this merge window. Light on new features this time around (just splice + buffer selection), lots of cleanups, fixes, and improvements to existing support. In particular, this contains: - Cleanup fixed file update handling for stack fallback (Hillf) - Re-work of how pollable async IO is handled, we no longer require thread offload to handle that. Instead we rely using poll to drive this, with task_work execution. - In conjunction with the above, allow expendable buffer selection, so that poll+recv (for example) no longer has to be a split operation. - Make sure we honor RLIMIT_FSIZE for buffered writes - Add support for splice (Pavel) - Linked work inheritance fixes and optimizations (Pavel) - Async work fixes and cleanups (Pavel) - Improve io-wq locking (Pavel) - Hashed link write improvements (Pavel) - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)" * tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits) io_uring: cleanup io_alloc_async_ctx() io_uring: fix missing 'return' in comment io-wq: handle hashed writes in chains io-uring: drop 'free_pfile' in struct io_file_put io-uring: drop completion when removing file io_uring: Fix ->data corruption on re-enqueue io-wq: close cancel gap for hashed linked work io_uring: make spdxcheck.py happy io_uring: honor original task RLIMIT_FSIZE io-wq: hash dependent work io-wq: split hashing and enqueueing io-wq: don't resched if there is no work io-wq: remove duplicated cancel code io_uring: fix truncated async read/readv and write/writev retry io_uring: dual license io_uring.h uapi header io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled io_uring: Fix unused function warnings io_uring: add end-of-bits marker and build time verify it io_uring: provide means of removing buffers io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG ...
2020-03-30Merge tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: - floppy driver cleanup series from Willy - NVMe updates and fixes (Various) - null_blk trace improvements (Chaitanya) - bcache fixes (Coly) - md fixes (via Song) - loop block size change optimizations (Martijn) - scnprintf() use (Takashi) * tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block: (81 commits) null_blk: add trace in null_blk_zoned.c null_blk: add tracepoint helpers for zoned mode block: add a zone condition debug helper nvme: cleanup namespace identifier reporting in nvme_init_ns_head nvme: rename __nvme_find_ns_head to nvme_find_ns_head nvme: refactor nvme_identify_ns_descs error handling nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl nvme: Fix controller creation races with teardown flow nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl nvme: Fix ctrl use-after-free during sysfs deletion nvme-pci: Re-order nvme_pci_free_ctrl nvme: Remove unused return code from nvme_delete_ctrl_sync nvme: Use nvme_state_terminal helper nvme: release ida resources nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO nvmet-tcp: optimize tcp stack TX when data digest is used nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow nvme-multipath: do not reset on unknown status nvmet-rdma: allocate RW ctxs according to mdts ...
2020-03-30devlink: Add auto dump flag to health reporterEran Ben Elisha
On low memory system, run time dumps can consume too much memory. Add administrator ability to disable auto dumps per reporter as part of the error flow handle routine. This attribute is not relevant while executing DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET. By default, auto dump is activated for any reporter that has a dump method, as part of the reporter registration to devlink. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30net: sched: expose HW stats types per action used by driversJiri Pirko
It may be up to the driver (in case ANY HW stats is passed) to select which type of HW stats he is going to use. Add an infrastructure to expose this information to user. $ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop $ tc -s filter show dev enp3s0np1 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 192.168.1.1 in_hw in_hw_count 2 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 10 sec used 10 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats immediate <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide timestamping information with TSINFO_GET requestMichal Kubecek
Implement TSINFO_GET request to get timestamping information for a network device. This is traditionally available via ETHTOOL_GET_TS_INFO ioctl request. Move part of ethtool_get_ts_info() into common.c so that ioctl and netlink code use the same logic to get timestamping information from the device. v3: use "TSINFO" rather than "TIMESTAMP", suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add timestamping related string setsMichal Kubecek
Add three string sets related to timestamping information: ETH_SS_SOF_TIMESTAMPING: SOF_TIMESTAMPING_* flags ETH_SS_TS_TX_TYPES: timestamping Tx types ETH_SS_TS_RX_FILTERS: timestamping Rx filters These will be used for TIMESTAMP_GET request. v2: avoid compiler warning ("enumeration value not handled in switch") in net_hwtstamp_validate() v3: omit dash in Tx type names ("one-step-*" -> "onestep-*"), suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add EEE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_EEE_NTF notification whenever EEE settings of a network device are modified using ETHTOOL_MSG_EEE_SET netlink message or ETHTOOL_SEEE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set EEE settings with EEE_SET requestMichal Kubecek
Implement EEE_SET netlink request to set EEE settings of a network device. These are traditionally set with ETHTOOL_SEEE ioctl request. The netlink interface allows setting the EEE status for all link modes supported by kernel but only first 32 link modes can be set at the moment as only those are supported by the ethtool_ops callback. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide EEE settings with EEE_GET requestMichal Kubecek
Implement EEE_GET request to get EEE settings of a network device. These are traditionally available via ETHTOOL_GEEE ioctl request. The netlink interface allows reporting EEE status for all link modes supported by kernel but only first 32 link modes are provided at the moment as only those are reported by the ethtool_ops callback and drivers. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add PAUSE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_PAUSE_NTF notification whenever pause parameters of a network device are modified using ETHTOOL_MSG_PAUSE_SET netlink message or ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set pause parameters with PAUSE_SET requestMichal Kubecek
Implement PAUSE_SET netlink request to set pause parameters of a network device. Thease are traditionally set with ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide pause parameters with PAUSE_GET requestMichal Kubecek
Implement PAUSE_GET request to get pause parameters of a network device. These are traditionally available via ETHTOOL_GPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add COALESCE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_COALESCE_NTF notification whenever coalescing parameters of a network device are modified using ETHTOOL_MSG_COALESCE_SET netlink message or ETHTOOL_SCOALESCE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set coalescing parameters with COALESCE_SET requestMichal Kubecek
Implement COALESCE_SET netlink request to set coalescing parameters of a network device. These are traditionally set with ETHTOOL_SCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Like the ioctl implementation, the generic ethtool code checks if only supported parameters are modified; if not, first offending attribute is reported using extack. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide coalescing parameters with COALESCE_GET requestMichal Kubecek
Implement COALESCE_GET request to get coalescing parameters of a network device. These are traditionally available via ETHTOOL_GCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Omit attributes with zero values unless they are declared as supported (i.e. the corresponding bit in ethtool_ops::supported_coalesce_params is set). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: ipv6: add rpl sr tunnelAlexander Aring
This patch adds functionality to configure routes for RPL source routing functionality. There is no IPIP functionality yet implemented which can be added later when the cases when to use IPv6 encapuslation comes more clear. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: ipv6: add support for rpl sr exthdrAlexander Aring
This patch adds rpl source routing receive handling. Everything works only if sysconf "rpl_seg_enabled" and source routing is enabled. Mostly the same behaviour as IPv6 segmentation routing. To handle compression and uncompression a rpl.c file is created which contains the necessary functionality. The receive handling will also care about IPv6 encapsulated so far it's specified as possible nexthdr in RFC 6554. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29include: uapi: linux: add rpl sr header definitionAlexander Aring
This patch adds a uapi header for rpl struct definition. The segments data can be accessed over rpl_segaddr or rpl_segdata macros. In case of compri and compre is zero the segment data is not compressed and can be accessed by rpl_segaddr. In the other case the compressed data can be accessed by rpl_segdata and interpreted as byte array. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: add netlink-based PMPaolo Abeni
Expose a new netlink family to userspace to control the PM, setting: - list of local addresses to be signalled. - list of local addresses used to created subflows. - maximum number of add_addr option to react When the msk is fully established, the PM netlink attempts to announce the 'signal' list via the ADD_ADDR option. Since we currently lack the ADD_ADDR echo (and related event) only the first addr is sent. After exhausting the 'announce' list, the PM tries to create subflow for each addr in 'local' list, waiting for each connection to be completed before attempting the next one. Idea is to add an additional PM hook for ADD_ADDR echo, to allow the PM netlink announcing multiple addresses, in sequence. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: allow dumping subflow context to userspaceDavide Caratti
add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'. v2 -> v3: - uapi: use bit macros appropriate for userspace Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: macsec: add support for specifying offload upon link creationMark Starovoytov
This patch adds new netlink attribute to allow a user to (optionally) specify the desired offload mode immediately upon MACSec link creation. Separate iproute patch will be required to support this from user space. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30bpf: Introduce BPF_PROG_TYPE_LSMKP Singh
Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Thomas Garnier <thgarnie@google.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-03-29um: Implement time-travel=extJohannes Berg
This implements synchronized time-travel mode which - using a special application on a unix socket - lets multiple machines take part in a time-travelling simulation together. The protocol for the unix domain socket is defined in the new file include/uapi/linux/um_timetravel.h. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-03-28xdp: Support specifying expected existing program when attaching XDPToke Høiland-Jørgensen
While it is currently possible for userspace to specify that an existing XDP program should not be replaced when attaching to an interface, there is no mechanism to safely replace a specific XDP program with another. This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_FD, which can be set along with IFLA_XDP_FD. If set, the kernel will check that the program currently loaded on the interface matches the expected one, and fail the operation if it does not. This corresponds to a 'cmpxchg' memory operation. Setting the new attribute with a negative value means that no program is expected to be attached, which corresponds to setting the UPDATE_IF_NOEXIST flag. A new companion flag, XDP_FLAGS_REPLACE, is also added to explicitly request checking of the EXPECTED_FD attribute. This is needed for userspace to discover whether the kernel supports the new attribute. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/158515700640.92963.3551295145441017022.stgit@toke.dk
2020-03-27bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor idDaniel Borkmann
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(), recvmsg() and bind-related hooks in order to retrieve the cgroup v2 context which can then be used as part of the key for BPF map lookups, for example. Given these hooks operate in process context 'current' is always valid and pointing to the app that is performing mentioned syscalls if it's subject to a v2 cgroup. Also with same motivation of commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper") enable retrieval of ancestor from current so the cgroup id can be used for policy lookups which can then forbid connect() / bind(), for example. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-03-27bpf: Add netns cookie and enable it for bpf cgroup hooksDaniel Borkmann
In Cilium we're mainly using BPF cgroup hooks today in order to implement kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*), ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic between Cilium managed nodes. While this works in its current shape and avoids packet-level NAT for inter Cilium managed node traffic, there is one major limitation we're facing today, that is, lack of netns awareness. In Kubernetes, the concept of Pods (which hold one or multiple containers) has been built around network namespaces, so while we can use the global scope of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing NodePort ports on loopback addresses), we also have the need to differentiate between initial network namespaces and non-initial one. For example, ExternalIP services mandate that non-local service IPs are not to be translated from the host (initial) network namespace as one example. Right now, we have an ugly work-around in place where non-local service IPs for ExternalIP services are not xlated from connect() and friends BPF hooks but instead via less efficient packet-level NAT on the veth tc ingress hook for Pod traffic. On top of determining whether we're in initial or non-initial network namespace we also have a need for a socket-cookie like mechanism for network namespaces scope. Socket cookies have the nice property that they can be combined as part of the key structure e.g. for BPF LRU maps without having to worry that the cookie could be recycled. We are planning to use this for our sessionAffinity implementation for services. Therefore, add a new bpf_get_netns_cookie() helper which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would provide the cookie for the initial network namespace while passing the context instead of NULL would provide the cookie from the application's network namespace. We're using a hole, so no size increase; the assignment happens only once. Therefore this allows for a comparison on initial namespace as well as regular cookie usage as we have today with socket cookies. We could later on enable this helper for other program types as well as we would see need. (*) Both externalTrafficPolicy={Local|Cluster} types [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-03-27Merge tag 'v5.6-rc7' into develLinus Walleij
Linux 5.6-rc7
2020-03-27netfilter: flowtable: add counter supportPablo Neira Ayuso
Add a new flag to turn on flowtable counters which are stored in the conntrack entry. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-27netfilter: nf_tables: add enum nft_flowtable_flags to uapiPablo Neira Ayuso
Expose the NFT_FLOWTABLE_HW_OFFLOAD flag through uapi. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-27perf/core: Add PERF_SAMPLE_CGROUP featureNamhyung Kim
The PERF_SAMPLE_CGROUP bit is to save (perf_event) cgroup information in the sample. It will add a 64-bit id to identify current cgroup and it's the file handle in the cgroup file system. Userspace should use this information with PERF_RECORD_CGROUP event to match which cgroup it belongs. I put it before PERF_SAMPLE_AUX for simplicity since it just needs a 64-bit word. But if we want bigger samples, I can work on that direction too. Committer testing: $ pahole perf_sample_data | grep -w cgroup -B5 -A5 /* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */ struct perf_regs regs_intr; /* 312 16 */ /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */ u64 stack_user_size; /* 328 8 */ u64 phys_addr; /* 336 8 */ u64 cgroup; /* 344 8 */ /* size: 384, cachelines: 6, members: 22 */ /* padding: 32 */ }; $ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lore.kernel.org/lkml/20200325124536.2800725-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-27perf/core: Add PERF_RECORD_CGROUP eventNamhyung Kim
To support cgroup tracking, add CGROUP event to save a link between cgroup path and id number. This is needed since cgroups can go away when userspace tries to read the cgroup info (from the id) later. The attr.cgroup bit was also added to enable cgroup tracking from userspace. This event will be generated when a new cgroup becomes active. Userspace might need to synthesize those events for existing cgroups. Committer testing: From the resulting kernel, using /sys/kernel/btf/vmlinux: $ pahole perf_event_attr | grep -w cgroup -B5 -A1 __u64 write_backward:1; /* 40:27 8 */ __u64 namespaces:1; /* 40:28 8 */ __u64 ksymbol:1; /* 40:29 8 */ __u64 bpf_event:1; /* 40:30 8 */ __u64 aux_output:1; /* 40:31 8 */ __u64 cgroup:1; /* 40:32 8 */ __u64 __reserved_1:31; /* 40:33 8 */ $ Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> [staticize perf_event_cgroup function] Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lore.kernel.org/lkml/20200325124536.2800725-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-27Merge branches 'iommu/fixes', 'arm/qcom', 'arm/omap', 'arm/smmu', 'x86/amd', ↵Joerg Roedel
'x86/vt-d', 'virtio' and 'core' into next
2020-03-27iommu/virtio: Fix sparse warningJean-Philippe Brucker
We copied the virtio_iommu_config from the virtio-iommu specification, which declares the fields using little-endian annotations (for example le32). Unfortunately this causes sparse to warn about comparison between little- and cpu-endian, because of the typecheck() in virtio_cread(): drivers/iommu/virtio-iommu.c:1024:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1024:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1024:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1036:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1036:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1036:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1040:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1040:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1040:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1044:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1044:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1044:9: sparse: unsigned int * drivers/iommu/virtio-iommu.c:1048:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1048:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1048:9: sparse: unsigned int * drivers/iommu/virtio-iommu.c:1052:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1052:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1052:9: sparse: unsigned int * Although virtio_cread() does convert virtio-endian (in our case little-endian) to cpu-endian, the typecheck() needs the two arguments to have the same endianness. Do as UAPI headers of other virtio devices do, and remove the endian annotation from the device config. Even though we change the UAPI this shouldn't cause any regression since QEMU, the existing implementation of virtio-iommu that uses this header, already removes the annotations when importing headers. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20200326093558.2641019-2-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-03-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix to generate proper timestamps on key autorepeat events that were broken recently - a fix for Synaptics driver to only activate reduced reporting mode when explicitly requested - a new keycode for "selective screenshot" function - other assorted fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fix stale timestamp on key autorepeat events Input: move the new KEY_SELECTIVE_SCREENSHOT keycode Input: avoid BIT() macro usage in the serio.h UAPI header Input: synaptics-rmi4 - set reduced reporting mode only when requested Input: synaptics - enable RMI on HP Envy 13-ad105ng Input: allocate keycode for "Selective Screenshot" key Input: tm2-touchkey - add support for Coreriver TC360 variant dt-bindings: input: add Coreriver TC360 binding dt-bindings: vendor-prefixes: Add Coreriver vendor prefix Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger()
2020-03-26net: macsec: add support for offloading to the MACAntoine Tenart
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC, allowing a user to select a MAC as a provider for MACsec offloading operations. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26taprio: do not use BIT() in TCA_TAPRIO_ATTR_FLAG_* definitionsEugene Syromiatnikov
BIT() macro definition is internal to the Linux kernel and is not to be used in UAPI headers; replace its usage with the _BITUL() macro that is already used elsewhere in the header. Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26rtc: make definitions in include/uapi/linux/rtc.h actually useful for user spaceEugene Syromiatnikov
BIT() macro is not defined in UAPI headers; there is, however, similarly defined _BITUL() macro present in include/uapi/linux/const.h; use it instead and include <linux/const.h> and <linux/ioctl.h> in order to make the definitions provided in the header useful. Fixes: 3431ca4837bf ("rtc: define RTC_VL_READ values") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Link: https://lore.kernel.org/r/20200324041209.GA30727@asgard.redhat.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2020-03-26Input: move the new KEY_SELECTIVE_SCREENSHOT keycodeDmitry Torokhov
We should try to keep keycodes sequential unless there is a reason to leave a gap in numbering, so let's move it from 0x280 to 0x27a while we still can. Fixes: 3b059da9835c ("Input: allocate keycode for Selective Screenshot key") Acked-by: Rajat Jain <rajatja@google.com> Link: https://lore.kernel.org/r/20200326182711.GA259753@dtor-ws Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-03-26coresight: do not use the BIT() macro in the UAPI headerEugene Syromiatnikov
The BIT() macro definition is not available for the UAPI headers (moreover, it can be defined differently in the user space); replace its usage with the _BITUL() macro that is defined in <linux/const.h>. Fixes: 237483aa5cf4 ("coresight: stm: adding driver for CoreSight STM component") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20200324042213.GA10452@asgard.redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-26KVM: PPC: Book3S HV: Add a capability for enabling secure guestsPaul Mackerras
At present, on Power systems with Protected Execution Facility hardware and an ultravisor, a KVM guest can transition to being a secure guest at will. Userspace (QEMU) has no way of knowing whether a host system is capable of running secure guests. This will present a problem in future when the ultravisor is capable of migrating secure guests from one host to another, because virtualization management software will have no way to ensure that secure guests only run in domains where all of the hosts can support secure guests. This adds a VM capability which has two functions: (a) userspace can query it to find out whether the host can support secure guests, and (b) userspace can enable it for a guest, which allows that guest to become a secure guest. If userspace does not enable it, KVM will return an error when the ultravisor does the hypercall that indicates that the guest is starting to transition to a secure guest. The ultravisor will then abort the transition and the guest will terminate. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Ram Pai <linuxram@us.ibm.com>
2020-03-25fanotify: report name info for FAN_DIR_MODIFY eventAmir Goldstein
Report event FAN_DIR_MODIFY with name in a variable length record similar to how fid's are reported. With name info reporting implemented, setting FAN_DIR_MODIFY in mark mask is now allowed. When events are reported with name, the reported fid identifies the directory and the name follows the fid. The info record type for this event info is FAN_EVENT_INFO_TYPE_DFID_NAME. For now, all reported events have at most one info record which is either FAN_EVENT_INFO_TYPE_FID or FAN_EVENT_INFO_TYPE_DFID_NAME (for FAN_DIR_MODIFY). Later on, events "on child" will report both records. There are several ways that an application can use this information: 1. When watching a single directory, the name is always relative to the watched directory, so application need to fstatat(2) the name relative to the watched directory. 2. When watching a set of directories, the application could keep a map of dirfd for all watched directories and hash the map by fid obtained with name_to_handle_at(2). When getting a name event, the fid in the event info could be used to lookup the base dirfd in the map and then call fstatat(2) with that dirfd. 3. When watching a filesystem (FAN_MARK_FILESYSTEM) or a large set of directories, the application could use open_by_handle_at(2) with the fid in event info to obtain dirfd for the directory where event happened and call fstatat(2) with this dirfd. The last option scales better for a large number of watched directories. The first two options may be available in the future also for non privileged fanotify watchers, because open_by_handle_at(2) requires the CAP_DAC_READ_SEARCH capability. Link: https://lore.kernel.org/r/20200319151022.31456-15-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-03-25fanotify: send FAN_DIR_MODIFY event flavor with dir inode and nameAmir Goldstein
Dirent events are going to be supported in two flavors: 1. Directory fid info + mask that includes the specific event types (e.g. FAN_CREATE) and an optional FAN_ONDIR flag. 2. Directory fid info + name + mask that includes only FAN_DIR_MODIFY. To request the second event flavor, user needs to set the event type FAN_DIR_MODIFY in the mark mask. The first flavor is supported since kernel v5.1 for groups initialized with flag FAN_REPORT_FID. It is intended to be used for watching directories in "batch mode" - the watcher is notified when directory is changed and re-scans the directory content in response. This event flavor is stored more compactly in the event queue, so it is optimal for workloads with frequent directory changes. The second event flavor is intended to be used for watching large directories, where the cost of re-scan of the directory on every change is considered too high. The watcher getting the event with the directory fid and entry name is expected to call fstatat(2) to query the content of the entry after the change. Legacy inotify events are reported with name and event mask (e.g. "foo", FAN_CREATE | FAN_ONDIR). That can lead users to the conclusion that there is *currently* an entry "foo" that is a sub-directory, when in fact "foo" may be negative or non-dir by the time user gets the event. To make it clear that the current state of the named entry is unknown, when reporting an event with name info, fanotify obfuscates the specific event types (e.g. create,delete,rename) and uses a common event type - FAN_DIR_MODIFY to describe the change. This should make it harder for users to make wrong assumptions and write buggy filesystem monitors. At this point, name info reporting is not yet implemented, so trying to set FAN_DIR_MODIFY in mark mask will return -EINVAL. Link: https://lore.kernel.org/r/20200319151022.31456-12-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-03-25gpio: uapi: Improve phrasing around arrays representing empty stringsJonathan Neuschäfer
Character arrays can be considered empty strings (if they are immediately terminated), but they cannot be NULL. Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-03-24Input: avoid BIT() macro usage in the serio.h UAPI headerEugene Syromiatnikov
The commit 19ba1eb15a2a ("Input: psmouse - add a custom serio protocol to send extra information") introduced usage of the BIT() macro for SERIO_* flags; this macro is not provided in UAPI headers. Replace if with similarly defined _BITUL() macro defined in <linux/const.h>. Fixes: 19ba1eb15a2a ("Input: psmouse - add a custom serio protocol to send extra information") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Cc: <stable@vger.kernel.org> # v5.0+ Link: https://lore.kernel.org/r/20200324041341.GA32335@asgard.redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-03-24vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first userAlex Williamson
The VFIO_DEVICE_FEATURE ioctl is meant to be a general purpose, device agnostic ioctl for setting, retrieving, and probing device features. This implementation provides a 16-bit field for specifying a feature index, where the data porition of the ioctl is determined by the semantics for the given feature. Additional flag bits indicate the direction and nature of the operation; SET indicates user data is provided into the device feature, GET indicates the device feature is written out into user data. The PROBE flag augments determining whether the given feature is supported, and if provided, whether the given operation on the feature is supported. The first user of this ioctl is for setting the vfio-pci VF token, where the user provides a shared secret key (UUID) on a SR-IOV PF device, which users must provide when opening associated VF devices. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>