summaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2020-03-30bpf: Implement bpf_prog replacement for an active bpf_cgroup_linkAndrii Nakryiko
Add new operation (LINK_UPDATE), which allows to replace active bpf_prog from under given bpf_link. Currently this is only supported for bpf_cgroup_link, but will be extended to other kinds of bpf_links in follow-up patches. For bpf_cgroup_link, implemented functionality matches existing semantics for direct bpf_prog attachment (including BPF_F_REPLACE flag). User can either unconditionally set new bpf_prog regardless of which bpf_prog is currently active under given bpf_link, or, optionally, can specify expected active bpf_prog. If active bpf_prog doesn't match expected one, no changes are performed, old bpf_link stays intact and attached, operation returns a failure. cgroup_bpf_replace() operation is resolving race between auto-detachment and bpf_prog update in the same fashion as it's done for bpf_link detachment, except in this case update has no way of succeeding because of target cgroup marked as dying. So in this case error is returned. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-3-andriin@fb.com
2020-03-30bpf: Implement bpf_link-based cgroup BPF program attachmentAndrii Nakryiko
Implement new sub-command to attach cgroup BPF programs and return FD-based bpf_link back on success. bpf_link, once attached to cgroup, cannot be replaced, except by owner having its FD. Cgroup bpf_link supports only BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI attachments can be freely intermixed. To prevent bpf_cgroup_link from keeping cgroup alive past the point when no BPF program can be executed, implement auto-detachment of link. When cgroup_bpf_release() is called, all attached bpf_links are forced to release cgroup refcounts, but they leave bpf_link otherwise active and allocated, as well as still owning underlying bpf_prog. This is because user-space might still have FDs open and active, so bpf_link as a user-referenced object can't be freed yet. Once last active FD is closed, bpf_link will be freed and underlying bpf_prog refcount will be dropped. But cgroup refcount won't be touched, because cgroup is released already. The inherent race between bpf_cgroup_link release (from closing last FD) and cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So the only additional check required is when bpf_cgroup_link attempts to detach itself from cgroup. At that time we need to check whether there is still cgroup associated with that link. And if not, exit with success, because bpf_cgroup_link was already successfully detached. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com
2020-03-30Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main changes in this cycle were: Kernel side changes: - A couple of x86/cpu cleanups and changes were grandfathered in due to patch dependencies. These clean up the set of CPU model/family matching macros with a consistent namespace and C99 initializer style. - A bunch of updates to various low level PMU drivers: * AMD Family 19h L3 uncore PMU * Intel Tiger Lake uncore support * misc fixes to LBR TOS sampling - optprobe fixes - perf/cgroup: optimize cgroup event sched-in processing - misc cleanups and fixes Tooling side changes are to: - perf {annotate,expr,record,report,stat,test} - perl scripting - libapi, libperf and libtraceevent - vendor events on Intel and S390, ARM cs-etm - Intel PT updates - Documentation changes and updates to core facilities - misc cleanups, fixes and other enhancements" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits) cpufreq/intel_pstate: Fix wrong macro conversion x86/cpu: Cleanup the now unused CPU match macros hwrng: via_rng: Convert to new X86 CPU match macros crypto: Convert to new CPU match macros ASoC: Intel: Convert to new X86 CPU match macros powercap/intel_rapl: Convert to new X86 CPU match macros PCI: intel-mid: Convert to new X86 CPU match macros mmc: sdhci-acpi: Convert to new X86 CPU match macros intel_idle: Convert to new X86 CPU match macros extcon: axp288: Convert to new X86 CPU match macros thermal: Convert to new X86 CPU match macros hwmon: Convert to new X86 CPU match macros platform/x86: Convert to new CPU match macros EDAC: Convert to new X86 CPU match macros cpufreq: Convert to new X86 CPU match macros ACPI: Convert to new X86 CPU match macros x86/platform: Convert to new CPU match macros x86/kernel: Convert to new CPU match macros x86/kvm: Convert to new CPU match macros x86/perf/events: Convert to new CPU match macros ...
2020-03-30Merge tag 'usb-5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / PHY updates from Greg KH: "Here are the big set of USB and PHY driver patches for 5.7-rc1. Nothing huge here, some new PHY drivers, loads of USB gadget fixes and updates, xhci updates, usb-serial driver updates and new device ids, and other minor things. Full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'usb-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (239 commits) USB: cdc-acm: restore capability check order usb: cdns3: make signed 1 bit bitfields unsigned usb: gadget: fsl: remove unused variable 'driver_desc' usb: gadget: f_fs: Fix use after free issue as part of queue failure usb: typec: Correct the documentation for typec_cable_put() USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback USB: serial: option: add Wistron Neweb D19Q1 USB: serial: option: add BroadMobi BM806U USB: serial: option: add support for ASKEY WWHC050 usb: core: Add ACPI support for USB interface devices driver core: platform: Reimplement devm_platform_ioremap_resource usb: dwc2: convert to devm_platform_get_and_ioremap_resource usb: host: hisilicon: convert to devm_platform_get_and_ioremap_resource usb: host: xhci-plat: convert to devm_platform_get_and_ioremap_resource drivers: provide devm_platform_get_and_ioremap_resource() phy: qcom-qusb2: Add new overriding tuning parameters in QUSB2 V2 PHY phy: qcom-qusb2: Add support for overriding tuning parameters in QUSB2 V2 PHY dt-bindings: phy: qcom-qusb2: Add support for overriding Phy tuning parameters phy: qcom-qusb2: Add generic QUSB2 V2 PHY support dt-bindings: phy: qcom,qusb2: Add compatibles for QUSB2 V2 phy and SC7180 ...
2020-03-30bpf: Add socket assign supportJoe Stringer
Add support for TPROXY via a new bpf helper, bpf_sk_assign(). This helper requires the BPF program to discover the socket via a call to bpf_sk*_lookup_*(), then pass this socket to the new helper. The helper takes its own reference to the socket in addition to any existing reference that may or may not currently be obtained for the duration of BPF processing. For the destination socket to receive the traffic, the traffic must be routed towards that socket via local route. The simplest example route is below, but in practice you may want to route traffic more narrowly (eg by CIDR): $ ip route add local default dev lo This patch avoids trying to introduce an extra bit into the skb->sk, as that would require more invasive changes to all code interacting with the socket to ensure that the bit is handled correctly, such as all error-handling cases along the path from the helper in BPF through to the orphan path in the input. Instead, we opt to use the destructor variable to switch on the prefetch of the socket. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz
2020-03-30Merge tag 'media/v5.7-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - New sensor driver: imx219 - Support for some new pixelformats - Support for Sun8i SoC - Added more codecs to meson vdec driver - Prepare for removing the legacy usbvision driver by moving it to staging. This driver has issues and use legacy core APIs. If nobody steps up to address those, it is time for its retirement. - Several cleanups and improvements on drivers, with the addition of new supported boards * tag 'media/v5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (236 commits) media: venus: firmware: Ignore secure call error on first resume media: mtk-vpu: load vpu firmware from the new location media: i2c: video-i2c: fix build errors due to 'imply hwmon' media: MAINTAINERS: add myself to co-maintain Hantro G1/G2 for i.MX8MQ media: hantro: add initial i.MX8MQ support media: dt-bindings: Document i.MX8MQ VPU bindings media: vivid: fix incorrect PA assignment to HDMI outputs media: hantro: Add linux-rockchip mailing list to MAINTAINERS media: cedrus: h264: Fix 4K decoding on H6 media: siano: Use scnprintf() for avoiding potential buffer overflow media: rc: Use scnprintf() for avoiding potential buffer overflow media: allegro: create new struct for channel parameters media: allegro: move mail definitions to separate file media: allegro: pass buffers through firmware media: allegro: verify source and destination buffer in VCU response media: allegro: handle dependency of bitrate and bitrate_peak media: allegro: read bitrate mode directly from control media: allegro: make QP configurable media: allegro: make frame rate configurable media: allegro: skip filler data if possible ...
2020-03-30Merge tag 'seccomp-v5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "A couple of seccomp updates. They're both mostly bug fixes that I wanted to have sit in linux-next for a while: - allow TSYNC and USER_NOTIF together (Tycho Andersen) - add missing compat_ioctl for notify (Sven Schnelle)" * tag 'seccomp-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Add missing compat_ioctl for notify seccomp: allow TSYNC and USER_NOTIF together
2020-03-30Merge tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: "Here are the io_uring changes for this merge window. Light on new features this time around (just splice + buffer selection), lots of cleanups, fixes, and improvements to existing support. In particular, this contains: - Cleanup fixed file update handling for stack fallback (Hillf) - Re-work of how pollable async IO is handled, we no longer require thread offload to handle that. Instead we rely using poll to drive this, with task_work execution. - In conjunction with the above, allow expendable buffer selection, so that poll+recv (for example) no longer has to be a split operation. - Make sure we honor RLIMIT_FSIZE for buffered writes - Add support for splice (Pavel) - Linked work inheritance fixes and optimizations (Pavel) - Async work fixes and cleanups (Pavel) - Improve io-wq locking (Pavel) - Hashed link write improvements (Pavel) - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)" * tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits) io_uring: cleanup io_alloc_async_ctx() io_uring: fix missing 'return' in comment io-wq: handle hashed writes in chains io-uring: drop 'free_pfile' in struct io_file_put io-uring: drop completion when removing file io_uring: Fix ->data corruption on re-enqueue io-wq: close cancel gap for hashed linked work io_uring: make spdxcheck.py happy io_uring: honor original task RLIMIT_FSIZE io-wq: hash dependent work io-wq: split hashing and enqueueing io-wq: don't resched if there is no work io-wq: remove duplicated cancel code io_uring: fix truncated async read/readv and write/writev retry io_uring: dual license io_uring.h uapi header io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled io_uring: Fix unused function warnings io_uring: add end-of-bits marker and build time verify it io_uring: provide means of removing buffers io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG ...
2020-03-30Merge tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: - floppy driver cleanup series from Willy - NVMe updates and fixes (Various) - null_blk trace improvements (Chaitanya) - bcache fixes (Coly) - md fixes (via Song) - loop block size change optimizations (Martijn) - scnprintf() use (Takashi) * tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block: (81 commits) null_blk: add trace in null_blk_zoned.c null_blk: add tracepoint helpers for zoned mode block: add a zone condition debug helper nvme: cleanup namespace identifier reporting in nvme_init_ns_head nvme: rename __nvme_find_ns_head to nvme_find_ns_head nvme: refactor nvme_identify_ns_descs error handling nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl nvme: Fix controller creation races with teardown flow nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl nvme: Fix ctrl use-after-free during sysfs deletion nvme-pci: Re-order nvme_pci_free_ctrl nvme: Remove unused return code from nvme_delete_ctrl_sync nvme: Use nvme_state_terminal helper nvme: release ida resources nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO nvmet-tcp: optimize tcp stack TX when data digest is used nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow nvme-multipath: do not reset on unknown status nvmet-rdma: allocate RW ctxs according to mdts ...
2020-03-30devlink: Add auto dump flag to health reporterEran Ben Elisha
On low memory system, run time dumps can consume too much memory. Add administrator ability to disable auto dumps per reporter as part of the error flow handle routine. This attribute is not relevant while executing DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET. By default, auto dump is activated for any reporter that has a dump method, as part of the reporter registration to devlink. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30net: sched: expose HW stats types per action used by driversJiri Pirko
It may be up to the driver (in case ANY HW stats is passed) to select which type of HW stats he is going to use. Add an infrastructure to expose this information to user. $ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop $ tc -s filter show dev enp3s0np1 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 192.168.1.1 in_hw in_hw_count 2 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 10 sec used 10 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats immediate <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide timestamping information with TSINFO_GET requestMichal Kubecek
Implement TSINFO_GET request to get timestamping information for a network device. This is traditionally available via ETHTOOL_GET_TS_INFO ioctl request. Move part of ethtool_get_ts_info() into common.c so that ioctl and netlink code use the same logic to get timestamping information from the device. v3: use "TSINFO" rather than "TIMESTAMP", suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add timestamping related string setsMichal Kubecek
Add three string sets related to timestamping information: ETH_SS_SOF_TIMESTAMPING: SOF_TIMESTAMPING_* flags ETH_SS_TS_TX_TYPES: timestamping Tx types ETH_SS_TS_RX_FILTERS: timestamping Rx filters These will be used for TIMESTAMP_GET request. v2: avoid compiler warning ("enumeration value not handled in switch") in net_hwtstamp_validate() v3: omit dash in Tx type names ("one-step-*" -> "onestep-*"), suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add EEE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_EEE_NTF notification whenever EEE settings of a network device are modified using ETHTOOL_MSG_EEE_SET netlink message or ETHTOOL_SEEE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set EEE settings with EEE_SET requestMichal Kubecek
Implement EEE_SET netlink request to set EEE settings of a network device. These are traditionally set with ETHTOOL_SEEE ioctl request. The netlink interface allows setting the EEE status for all link modes supported by kernel but only first 32 link modes can be set at the moment as only those are supported by the ethtool_ops callback. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide EEE settings with EEE_GET requestMichal Kubecek
Implement EEE_GET request to get EEE settings of a network device. These are traditionally available via ETHTOOL_GEEE ioctl request. The netlink interface allows reporting EEE status for all link modes supported by kernel but only first 32 link modes are provided at the moment as only those are reported by the ethtool_ops callback and drivers. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add PAUSE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_PAUSE_NTF notification whenever pause parameters of a network device are modified using ETHTOOL_MSG_PAUSE_SET netlink message or ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set pause parameters with PAUSE_SET requestMichal Kubecek
Implement PAUSE_SET netlink request to set pause parameters of a network device. Thease are traditionally set with ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide pause parameters with PAUSE_GET requestMichal Kubecek
Implement PAUSE_GET request to get pause parameters of a network device. These are traditionally available via ETHTOOL_GPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: add COALESCE_NTF notificationMichal Kubecek
Send ETHTOOL_MSG_COALESCE_NTF notification whenever coalescing parameters of a network device are modified using ETHTOOL_MSG_COALESCE_SET netlink message or ETHTOOL_SCOALESCE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: set coalescing parameters with COALESCE_SET requestMichal Kubecek
Implement COALESCE_SET netlink request to set coalescing parameters of a network device. These are traditionally set with ETHTOOL_SCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Like the ioctl implementation, the generic ethtool code checks if only supported parameters are modified; if not, first offending attribute is reported using extack. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29ethtool: provide coalescing parameters with COALESCE_GET requestMichal Kubecek
Implement COALESCE_GET request to get coalescing parameters of a network device. These are traditionally available via ETHTOOL_GCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Omit attributes with zero values unless they are declared as supported (i.e. the corresponding bit in ethtool_ops::supported_coalesce_params is set). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: ipv6: add rpl sr tunnelAlexander Aring
This patch adds functionality to configure routes for RPL source routing functionality. There is no IPIP functionality yet implemented which can be added later when the cases when to use IPv6 encapuslation comes more clear. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: ipv6: add support for rpl sr exthdrAlexander Aring
This patch adds rpl source routing receive handling. Everything works only if sysconf "rpl_seg_enabled" and source routing is enabled. Mostly the same behaviour as IPv6 segmentation routing. To handle compression and uncompression a rpl.c file is created which contains the necessary functionality. The receive handling will also care about IPv6 encapsulated so far it's specified as possible nexthdr in RFC 6554. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29include: uapi: linux: add rpl sr header definitionAlexander Aring
This patch adds a uapi header for rpl struct definition. The segments data can be accessed over rpl_segaddr or rpl_segdata macros. In case of compri and compre is zero the segment data is not compressed and can be accessed by rpl_segaddr. In the other case the compressed data can be accessed by rpl_segdata and interpreted as byte array. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: add netlink-based PMPaolo Abeni
Expose a new netlink family to userspace to control the PM, setting: - list of local addresses to be signalled. - list of local addresses used to created subflows. - maximum number of add_addr option to react When the msk is fully established, the PM netlink attempts to announce the 'signal' list via the ADD_ADDR option. Since we currently lack the ADD_ADDR echo (and related event) only the first addr is sent. After exhausting the 'announce' list, the PM tries to create subflow for each addr in 'local' list, waiting for each connection to be completed before attempting the next one. Idea is to add an additional PM hook for ADD_ADDR echo, to allow the PM netlink announcing multiple addresses, in sequence. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: allow dumping subflow context to userspaceDavide Caratti
add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'. v2 -> v3: - uapi: use bit macros appropriate for userspace Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: macsec: add support for specifying offload upon link creationMark Starovoytov
This patch adds new netlink attribute to allow a user to (optionally) specify the desired offload mode immediately upon MACSec link creation. Separate iproute patch will be required to support this from user space. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30bpf: Introduce BPF_PROG_TYPE_LSMKP Singh
Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Thomas Garnier <thgarnie@google.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-03-29um: Implement time-travel=extJohannes Berg
This implements synchronized time-travel mode which - using a special application on a unix socket - lets multiple machines take part in a time-travelling simulation together. The protocol for the unix domain socket is defined in the new file include/uapi/linux/um_timetravel.h. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-03-28xdp: Support specifying expected existing program when attaching XDPToke Høiland-Jørgensen
While it is currently possible for userspace to specify that an existing XDP program should not be replaced when attaching to an interface, there is no mechanism to safely replace a specific XDP program with another. This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_FD, which can be set along with IFLA_XDP_FD. If set, the kernel will check that the program currently loaded on the interface matches the expected one, and fail the operation if it does not. This corresponds to a 'cmpxchg' memory operation. Setting the new attribute with a negative value means that no program is expected to be attached, which corresponds to setting the UPDATE_IF_NOEXIST flag. A new companion flag, XDP_FLAGS_REPLACE, is also added to explicitly request checking of the EXPECTED_FD attribute. This is needed for userspace to discover whether the kernel supports the new attribute. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/158515700640.92963.3551295145441017022.stgit@toke.dk
2020-03-27bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor idDaniel Borkmann
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(), recvmsg() and bind-related hooks in order to retrieve the cgroup v2 context which can then be used as part of the key for BPF map lookups, for example. Given these hooks operate in process context 'current' is always valid and pointing to the app that is performing mentioned syscalls if it's subject to a v2 cgroup. Also with same motivation of commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper") enable retrieval of ancestor from current so the cgroup id can be used for policy lookups which can then forbid connect() / bind(), for example. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-03-27bpf: Add netns cookie and enable it for bpf cgroup hooksDaniel Borkmann
In Cilium we're mainly using BPF cgroup hooks today in order to implement kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*), ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic between Cilium managed nodes. While this works in its current shape and avoids packet-level NAT for inter Cilium managed node traffic, there is one major limitation we're facing today, that is, lack of netns awareness. In Kubernetes, the concept of Pods (which hold one or multiple containers) has been built around network namespaces, so while we can use the global scope of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing NodePort ports on loopback addresses), we also have the need to differentiate between initial network namespaces and non-initial one. For example, ExternalIP services mandate that non-local service IPs are not to be translated from the host (initial) network namespace as one example. Right now, we have an ugly work-around in place where non-local service IPs for ExternalIP services are not xlated from connect() and friends BPF hooks but instead via less efficient packet-level NAT on the veth tc ingress hook for Pod traffic. On top of determining whether we're in initial or non-initial network namespace we also have a need for a socket-cookie like mechanism for network namespaces scope. Socket cookies have the nice property that they can be combined as part of the key structure e.g. for BPF LRU maps without having to worry that the cookie could be recycled. We are planning to use this for our sessionAffinity implementation for services. Therefore, add a new bpf_get_netns_cookie() helper which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would provide the cookie for the initial network namespace while passing the context instead of NULL would provide the cookie from the application's network namespace. We're using a hole, so no size increase; the assignment happens only once. Therefore this allows for a comparison on initial namespace as well as regular cookie usage as we have today with socket cookies. We could later on enable this helper for other program types as well as we would see need. (*) Both externalTrafficPolicy={Local|Cluster} types [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
2020-03-27Merge tag 'v5.6-rc7' into develLinus Walleij
Linux 5.6-rc7
2020-03-27netfilter: flowtable: add counter supportPablo Neira Ayuso
Add a new flag to turn on flowtable counters which are stored in the conntrack entry. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-27netfilter: nf_tables: add enum nft_flowtable_flags to uapiPablo Neira Ayuso
Expose the NFT_FLOWTABLE_HW_OFFLOAD flag through uapi. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-27Merge branch 'asoc-5.7' into asoc-nextMark Brown
2020-03-27Merge branch 'mlx5_tx_steering' into rdma.git for-nextJason Gunthorpe
Leon Romanovsky says: ==================== Those two patches from Michael extends mlx5_core and mlx5_ib flow steering to support RDMA TX in similar way to already supported RDMA RX. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Due to dependencies * branch 'mlx5_tx_steering': RDMA/mlx5: Add support for RDMA TX flow table net/mlx5: Add support for RDMA TX steering
2020-03-27RDMA/mlx5: Add support for RDMA TX flow tableMichael Guralnik
Enable user application to add rules for RDMA TX steering table. Rules in this steering table will allow to steer transmitted RDMA traffic. Link: https://lore.kernel.org/r/20200324061425.1570190-3-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27IB/mlx5: Move to fully dynamic UAR mode once user space supports itYishai Hadas
Move to fully dynamic UAR mode once user space supports it. In this case we prevent any legacy mode of UARs on the allocated context and prevent redundant allocation of the static ones. Link: https://lore.kernel.org/r/20200324060143.1569116-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27IB/mlx5: Extend QP creation to get uar page index from user spaceYishai Hadas
Extend QP creation to get uar page index from user space, this mode can be used with the UAR dynamic mode APIs to allocate/destroy a UAR object. As part of enabling this option blocked the weird/un-supported cross channel option which uses index 0 hard-coded. This QP flag wasn't exposed to user space as part of any formal upstream release, the dynamic option can allow having valid UAR page index instead. Link: https://lore.kernel.org/r/20200324060143.1569116-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27IB/mlx5: Extend CQ creation to get uar page index from user spaceYishai Hadas
Extend CQ creation to get uar page index from user space, this mode can be used with the UAR dynamic mode APIs to allocate/destroy a UAR object. Link: https://lore.kernel.org/r/20200324060143.1569116-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27IB/mlx5: Expose UAR object and its alloc/destroy commandsYishai Hadas
Expose UAR object and its alloc/destroy commands to be used over the ioctl interface by user space applications. This API supports both BF & NC modes and enables a dynamic allocation of UARs once really needed. As the number of driver objects were limited by the core ones when the merged tree is prepared, had to decrease the number of core objects to enable the new UAR object usage. Link: https://lore.kernel.org/r/20200324060143.1569116-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-27perf/core: Add PERF_SAMPLE_CGROUP featureNamhyung Kim
The PERF_SAMPLE_CGROUP bit is to save (perf_event) cgroup information in the sample. It will add a 64-bit id to identify current cgroup and it's the file handle in the cgroup file system. Userspace should use this information with PERF_RECORD_CGROUP event to match which cgroup it belongs. I put it before PERF_SAMPLE_AUX for simplicity since it just needs a 64-bit word. But if we want bigger samples, I can work on that direction too. Committer testing: $ pahole perf_sample_data | grep -w cgroup -B5 -A5 /* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */ struct perf_regs regs_intr; /* 312 16 */ /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */ u64 stack_user_size; /* 328 8 */ u64 phys_addr; /* 336 8 */ u64 cgroup; /* 344 8 */ /* size: 384, cachelines: 6, members: 22 */ /* padding: 32 */ }; $ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lore.kernel.org/lkml/20200325124536.2800725-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-27perf/core: Add PERF_RECORD_CGROUP eventNamhyung Kim
To support cgroup tracking, add CGROUP event to save a link between cgroup path and id number. This is needed since cgroups can go away when userspace tries to read the cgroup info (from the id) later. The attr.cgroup bit was also added to enable cgroup tracking from userspace. This event will be generated when a new cgroup becomes active. Userspace might need to synthesize those events for existing cgroups. Committer testing: From the resulting kernel, using /sys/kernel/btf/vmlinux: $ pahole perf_event_attr | grep -w cgroup -B5 -A1 __u64 write_backward:1; /* 40:27 8 */ __u64 namespaces:1; /* 40:28 8 */ __u64 ksymbol:1; /* 40:29 8 */ __u64 bpf_event:1; /* 40:30 8 */ __u64 aux_output:1; /* 40:31 8 */ __u64 cgroup:1; /* 40:32 8 */ __u64 __reserved_1:31; /* 40:33 8 */ $ Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> [staticize perf_event_cgroup function] Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lore.kernel.org/lkml/20200325124536.2800725-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-27drm/i915/perf: add new open param to configure polling of OA bufferLionel Landwerlin
This new parameter let's the application choose how often the OA buffer should be checked on the CPU side for data availability. Longer polling period tend to reduce CPU overhead if the application does not care about somewhat real time data collection. v2: Allow disabling polling completely with 0 value (Lionel) v3: Version the new parameter (Joonas) v4: Rebase (Umesh) v5: Make poll delay value of 0 invalid (Umesh) v6: - Describe poll_oa_period (Ashutosh) - Fix comment for new poll parameter (Lionel) - Drop open_flags in read_properties_unlocked (Lionel) - Rename uapi parameter (Ashutosh) v7: Reword the comment in uapi (Ashutosh) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200324185457.14635-4-umesh.nerlige.ramappa@intel.com
2020-03-27Merge branches 'iommu/fixes', 'arm/qcom', 'arm/omap', 'arm/smmu', 'x86/amd', ↵Joerg Roedel
'x86/vt-d', 'virtio' and 'core' into next
2020-03-27iommu/virtio: Fix sparse warningJean-Philippe Brucker
We copied the virtio_iommu_config from the virtio-iommu specification, which declares the fields using little-endian annotations (for example le32). Unfortunately this causes sparse to warn about comparison between little- and cpu-endian, because of the typecheck() in virtio_cread(): drivers/iommu/virtio-iommu.c:1024:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1024:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1024:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1036:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1036:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1036:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1040:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1040:9: sparse: restricted __le64 * drivers/iommu/virtio-iommu.c:1040:9: sparse: unsigned long long * drivers/iommu/virtio-iommu.c:1044:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1044:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1044:9: sparse: unsigned int * drivers/iommu/virtio-iommu.c:1048:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1048:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1048:9: sparse: unsigned int * drivers/iommu/virtio-iommu.c:1052:9: sparse: sparse: incompatible types in comparison expression (different base types): drivers/iommu/virtio-iommu.c:1052:9: sparse: restricted __le32 * drivers/iommu/virtio-iommu.c:1052:9: sparse: unsigned int * Although virtio_cread() does convert virtio-endian (in our case little-endian) to cpu-endian, the typecheck() needs the two arguments to have the same endianness. Do as UAPI headers of other virtio devices do, and remove the endian annotation from the device config. Even though we change the UAPI this shouldn't cause any regression since QEMU, the existing implementation of virtio-iommu that uses this header, already removes the annotations when importing headers. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20200326093558.2641019-2-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-03-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix to generate proper timestamps on key autorepeat events that were broken recently - a fix for Synaptics driver to only activate reduced reporting mode when explicitly requested - a new keycode for "selective screenshot" function - other assorted fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fix stale timestamp on key autorepeat events Input: move the new KEY_SELECTIVE_SCREENSHOT keycode Input: avoid BIT() macro usage in the serio.h UAPI header Input: synaptics-rmi4 - set reduced reporting mode only when requested Input: synaptics - enable RMI on HP Envy 13-ad105ng Input: allocate keycode for "Selective Screenshot" key Input: tm2-touchkey - add support for Coreriver TC360 variant dt-bindings: input: add Coreriver TC360 binding dt-bindings: vendor-prefixes: Add Coreriver vendor prefix Input: raydium_i2c_ts - fix error codes in raydium_i2c_boot_trigger()