From e7096c131e5161fa3b8e52a650d7719d2857adfd Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Mon, 9 Dec 2019 00:27:34 +0100 Subject: net: WireGuard secure network tunnel WireGuard is a layer 3 secure networking tunnel made specifically for the kernel, that aims to be much simpler and easier to audit than IPsec. Extensive documentation and description of the protocol and considerations, along with formal proofs of the cryptography, are available at: * https://www.wireguard.com/ * https://www.wireguard.com/papers/wireguard.pdf This commit implements WireGuard as a simple network device driver, accessible in the usual RTNL way used by virtual network drivers. It makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of networking subsystem APIs. It has a somewhat novel multicore queueing system designed for maximum throughput and minimal latency of encryption operations, but it is implemented modestly using workqueues and NAPI. Configuration is done via generic Netlink, and following a review from the Netlink maintainer a year ago, several high profile userspace tools have already implemented the API. This commit also comes with several different tests, both in-kernel tests and out-of-kernel tests based on network namespaces, taking profit of the fact that sockets used by WireGuard intentionally stay in the namespace the WireGuard interface was originally created, exactly like the semantics of userspace tun devices. See wireguard.com/netns/ for pictures and examples. The source code is fairly short, but rather than combining everything into a single file, WireGuard is developed as cleanly separable files, making auditing and comprehension easier. Things are laid out as follows: * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the cryptographic aspects of the protocol, and are mostly data-only in nature, taking in buffers of bytes and spitting out buffers of bytes. They also handle reference counting for their various shared pieces of data, like keys and key lists. * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for ratelimiting certain types of cryptographic operations in accordance with particular WireGuard semantics. * allowedips.[ch], peerlookup.[ch]: The main lookup structures of WireGuard, the former being trie-like with particular semantics, an integral part of the design of the protocol, and the latter just being nice helper functions around the various hashtables we use. * device.[ch]: Implementation of functions for the netdevice and for rtnl, responsible for maintaining the life of a given interface and wiring it up to the rest of WireGuard. * peer.[ch]: Each interface has a list of peers, with helper functions available here for creation, destruction, and reference counting. * socket.[ch]: Implementation of functions related to udp_socket and the general set of kernel socket APIs, for sending and receiving ciphertext UDP packets, and taking care of WireGuard-specific sticky socket routing semantics for the automatic roaming. * netlink.[ch]: Userspace API entry point for configuring WireGuard peers and devices. The API has been implemented by several userspace tools and network management utility, and the WireGuard project distributes the basic wg(8) tool. * queueing.[ch]: Shared function on the rx and tx path for handling the various queues used in the multicore algorithms. * send.c: Handles encrypting outgoing packets in parallel on multiple cores, before sending them in order on a single core, via workqueues and ring buffers. Also handles sending handshake and cookie messages as part of the protocol, in parallel. * receive.c: Handles decrypting incoming packets in parallel on multiple cores, before passing them off in order to be ingested via the rest of the networking subsystem with GRO via the typical NAPI poll function. Also handles receiving handshake and cookie messages as part of the protocol, in parallel. * timers.[ch]: Uses the timer wheel to implement protocol particular event timeouts, and gives a set of very simple event-driven entry point functions for callers. * main.c, version.h: Initialization and deinitialization of the module. * selftest/*.h: Runtime unit tests for some of the most security sensitive functions. * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing script using network namespaces. This commit aims to be as self-contained as possible, implementing WireGuard as a standalone module not needing much special handling or coordination from the network subsystem. I expect for future optimizations to the network stack to positively improve WireGuard, and vice-versa, but for the time being, this exists as intentionally standalone. We introduce a menu option for CONFIG_WIREGUARD, as well as providing a verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG. Signed-off-by: Jason A. Donenfeld Cc: David Miller Cc: Greg KH Cc: Linus Torvalds Cc: Herbert Xu Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller --- include/uapi/linux/wireguard.h | 196 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 196 insertions(+) create mode 100644 include/uapi/linux/wireguard.h (limited to 'include/uapi') diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h new file mode 100644 index 000000000000..dd8a47c4ad11 --- /dev/null +++ b/include/uapi/linux/wireguard.h @@ -0,0 +1,196 @@ +/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * Documentation + * ============= + * + * The below enums and macros are for interfacing with WireGuard, using generic + * netlink, with family WG_GENL_NAME and version WG_GENL_VERSION. It defines two + * methods: get and set. Note that while they share many common attributes, + * these two functions actually accept a slightly different set of inputs and + * outputs. + * + * WG_CMD_GET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST | NLM_F_DUMP. The command should contain + * one but not both of: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * + * The kernel will then return several messages (NLM_F_MULTI) containing the + * following tree of nested items: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_LISTEN_PORT: NLA_U16 + * WGDEVICE_A_FWMARK: NLA_U32 + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_PRESHARED_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_ENDPOINT: NLA_MIN_LEN(struct sockaddr), struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16 + * WGPEER_A_LAST_HANDSHAKE_TIME: NLA_EXACT_LEN, struct __kernel_timespec + * WGPEER_A_RX_BYTES: NLA_U64 + * WGPEER_A_TX_BYTES: NLA_U64 + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: NLA_MIN_LEN(struct in_addr), struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32 + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that all of the allowed IPs of a single peer will not + * fit within a single netlink message. In that case, the same peer will + * be written in the following message, except it will only contain + * WGPEER_A_PUBLIC_KEY and WGPEER_A_ALLOWEDIPS. This may occur several + * times in a row for the same peer. It is then up to the receiver to + * coalesce adjacent peers. Likewise, it is possible that all peers will + * not fit within a single message. So, subsequent peers will be sent + * in following messages, except those will only contain WGDEVICE_A_IFNAME + * and WGDEVICE_A_PEERS. It is then up to the receiver to coalesce these + * messages to form the complete list of peers. + * + * Since this is an NLA_F_DUMP command, the final message will always be + * NLMSG_DONE, even if an error occurs. However, this NLMSG_DONE message + * contains an integer error code. It is either zero or a negative error + * code corresponding to the errno. + * + * WG_CMD_SET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST. The command should contain the + * following tree of nested items, containing one but not both of + * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current + * peers should be removed prior to adding the list below. + * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove + * WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly + * WGDEVICE_A_FWMARK: NLA_U32, 0 to disable + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN + * WGPEER_A_FLAGS: NLA_U32, 0 and/or WGPEER_F_REMOVE_ME if the + * specified peer should not exist at the end of the + * operation, rather than added/updated and/or + * WGPEER_F_REPLACE_ALLOWEDIPS if all current allowed + * IPs of this peer should be removed prior to adding + * the list below and/or WGPEER_F_UPDATE_ONLY if the + * peer should only be set if it already exists. + * WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN, all zeros to remove + * WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16, 0 to disable + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32, should not be set or used at + * all by most users of this API, as the + * most recent protocol will be used when + * this is unset. Otherwise, must be set + * to 1. + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that the amount of configuration data exceeds that of + * the maximum message length accepted by the kernel. In that case, several + * messages should be sent one after another, with each successive one + * filling in information not contained in the prior. Note that if + * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably + * should not be specified in fragments that come after, so that the list + * of peers is only cleared the first time but appened after. Likewise for + * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message + * of a peer, it likely should not be specified in subsequent fragments. + * + * If an error occurs, NLMSG_ERROR will reply containing an errno. + */ + +#ifndef _WG_UAPI_WIREGUARD_H +#define _WG_UAPI_WIREGUARD_H + +#define WG_GENL_NAME "wireguard" +#define WG_GENL_VERSION 1 + +#define WG_KEY_LEN 32 + +enum wg_cmd { + WG_CMD_GET_DEVICE, + WG_CMD_SET_DEVICE, + __WG_CMD_MAX +}; +#define WG_CMD_MAX (__WG_CMD_MAX - 1) + +enum wgdevice_flag { + WGDEVICE_F_REPLACE_PEERS = 1U << 0, + __WGDEVICE_F_ALL = WGDEVICE_F_REPLACE_PEERS +}; +enum wgdevice_attribute { + WGDEVICE_A_UNSPEC, + WGDEVICE_A_IFINDEX, + WGDEVICE_A_IFNAME, + WGDEVICE_A_PRIVATE_KEY, + WGDEVICE_A_PUBLIC_KEY, + WGDEVICE_A_FLAGS, + WGDEVICE_A_LISTEN_PORT, + WGDEVICE_A_FWMARK, + WGDEVICE_A_PEERS, + __WGDEVICE_A_LAST +}; +#define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1) + +enum wgpeer_flag { + WGPEER_F_REMOVE_ME = 1U << 0, + WGPEER_F_REPLACE_ALLOWEDIPS = 1U << 1, + WGPEER_F_UPDATE_ONLY = 1U << 2, + __WGPEER_F_ALL = WGPEER_F_REMOVE_ME | WGPEER_F_REPLACE_ALLOWEDIPS | + WGPEER_F_UPDATE_ONLY +}; +enum wgpeer_attribute { + WGPEER_A_UNSPEC, + WGPEER_A_PUBLIC_KEY, + WGPEER_A_PRESHARED_KEY, + WGPEER_A_FLAGS, + WGPEER_A_ENDPOINT, + WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL, + WGPEER_A_LAST_HANDSHAKE_TIME, + WGPEER_A_RX_BYTES, + WGPEER_A_TX_BYTES, + WGPEER_A_ALLOWEDIPS, + WGPEER_A_PROTOCOL_VERSION, + __WGPEER_A_LAST +}; +#define WGPEER_A_MAX (__WGPEER_A_LAST - 1) + +enum wgallowedip_attribute { + WGALLOWEDIP_A_UNSPEC, + WGALLOWEDIP_A_FAMILY, + WGALLOWEDIP_A_IPADDR, + WGALLOWEDIP_A_CIDR_MASK, + __WGALLOWEDIP_A_LAST +}; +#define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1) + +#endif /* _WG_UAPI_WIREGUARD_H */ -- cgit From e27cca96cd68fa2c6814c90f9a1cfd36bb68c593 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Mon, 25 Nov 2019 14:49:02 +0100 Subject: xfrm: add espintcp (RFC 8229) TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented as a TCP ULP, overriding in particular the sendmsg and recvmsg operations. A Stream Parser is used to extract messages out of the TCP stream using the first 2 bytes as length marker. Received IKE messages are put on "ike_queue", waiting to be dequeued by the custom recvmsg implementation. Received ESP messages are sent to XFRM, like with UDP encapsulation. Some of this code is taken from the original submission by Herbert Xu. Currently, only IPv4 is supported, like for UDP encapsulation. Co-developed-by: Herbert Xu Signed-off-by: Herbert Xu Signed-off-by: Sabrina Dubroca Acked-by: David S. Miller Signed-off-by: Steffen Klassert --- include/uapi/linux/udp.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h index 30baccb6c9c4..4828794efcf8 100644 --- a/include/uapi/linux/udp.h +++ b/include/uapi/linux/udp.h @@ -42,5 +42,6 @@ struct udphdr { #define UDP_ENCAP_GTP0 4 /* GSM TS 09.60 */ #define UDP_ENCAP_GTP1U 5 /* 3GPP TS 29.060 */ #define UDP_ENCAP_RXRPC 6 +#define TCP_ENCAP_ESPINTCP 7 /* Yikes, this is really xfrm encap types. */ #endif /* _UAPI_LINUX_UDP_H */ -- cgit From bae141f54be83b06652c1d47e50e4e75ed4e9c7e Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 6 Dec 2019 22:49:34 +0100 Subject: bpf: Emit audit messages upon successful prog load and unload Allow for audit messages to be emitted upon BPF program load and unload for having a timeline of events. The load itself is in syscall context, so additional info about the process initiating the BPF prog creation can be logged and later directly correlated to the unload event. The only info really needed from BPF side is the globally unique prog ID where then audit user space tooling can query / dump all info needed about the specific BPF program right upon load event and enrich the record, thus these changes needed here can be kept small and non-intrusive to the core. Raw example output: # auditctl -D # auditctl -a always,exit -F arch=x86_64 -S bpf # ausearch --start recent -m 1334 ... ---- time->Wed Nov 27 16:04:13 2019 type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf" type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321 \ success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477 \ pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001 \ egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf" \ exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf" \ subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD ---- time->Wed Nov 27 16:04:13 2019 type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD ... Signed-off-by: Daniel Borkmann Co-developed-by: Jiri Olsa Signed-off-by: Jiri Olsa Acked-by: Paul Moore Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org --- include/uapi/linux/audit.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h index 3ad935527177..a534d71e689a 100644 --- a/include/uapi/linux/audit.h +++ b/include/uapi/linux/audit.h @@ -116,6 +116,7 @@ #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */ #define AUDIT_TIME_INJOFFSET 1332 /* Timekeeping offset injected */ #define AUDIT_TIME_ADJNTPVAL 1333 /* NTP value adjustment */ +#define AUDIT_BPF 1334 /* BPF subsystem */ #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */ #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */ -- cgit From ef343b35d46667668a099655fca4a5b2e43a5dfe Mon Sep 17 00:00:00 2001 From: Stefano Garzarella Date: Tue, 10 Dec 2019 11:43:03 +0100 Subject: vsock: add VMADDR_CID_LOCAL definition The VMADDR_CID_RESERVED (1) was used by VMCI, but now it is not used anymore, so we can reuse it for local communication (loopback) adding the new well-know CID: VMADDR_CID_LOCAL. Cc: Jorgen Hansen Reviewed-by: Stefan Hajnoczi Reviewed-by: Jorgen Hansen Signed-off-by: Stefano Garzarella Signed-off-by: David S. Miller --- include/uapi/linux/vm_sockets.h | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h index 68d57c5e99bc..fd0ed7221645 100644 --- a/include/uapi/linux/vm_sockets.h +++ b/include/uapi/linux/vm_sockets.h @@ -99,11 +99,13 @@ #define VMADDR_CID_HYPERVISOR 0 -/* This CID is specific to VMCI and can be considered reserved (even VMCI - * doesn't use it anymore, it's a legacy value from an older release). +/* Use this as the destination CID in an address when referring to the + * local communication (loopback). + * (This was VMADDR_CID_RESERVED, but even VMCI doesn't use it anymore, + * it was a legacy value from an older release). */ -#define VMADDR_CID_RESERVED 1 +#define VMADDR_CID_LOCAL 1 /* Use this as the destination CID in an address when referring to the host * (any process other than the hypervisor). VMCI relies on it being 2, but -- cgit From f74877a5457d34d604dba6dbbb13c4c05bac8b93 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Wed, 11 Dec 2019 10:58:14 +0100 Subject: rtnetlink: provide permanent hardware address in RTM_NEWLINK Permanent hardware address of a network device was traditionally provided via ethtool ioctl interface but as Jiri Pirko pointed out in a review of ethtool netlink interface, rtnetlink is much more suitable for it so let's add it to the RTM_NEWLINK message. Add IFLA_PERM_ADDRESS attribute to RTM_NEWLINK messages unless the permanent address is all zeros (i.e. device driver did not fill it). As permanent address is not modifiable, reject userspace requests containing IFLA_PERM_ADDRESS attribute. Note: we already provide permanent hardware address for bond slaves; unfortunately we cannot drop that attribute for backward compatibility reasons. v5 -> v6: only add the attribute if permanent address is not zero Signed-off-by: Michal Kubecek Acked-by: Jiri Pirko Acked-by: Stephen Hemminger Signed-off-by: David S. Miller --- include/uapi/linux/if_link.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 8aec8769d944..1d69f637c5d6 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -169,6 +169,7 @@ enum { IFLA_MAX_MTU, IFLA_PROP_LIST, IFLA_ALT_IFNAME, /* Alternative ifname */ + IFLA_PERM_ADDRESS, __IFLA_MAX }; -- cgit From 428c122f5f6b54f40bd51c47495104b534b5a57c Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Wed, 11 Dec 2019 10:58:34 +0100 Subject: ethtool: provide link mode names as a string set Unlike e.g. netdev features, the ethtool ioctl interface requires link mode table to be in sync between kernel and userspace for userspace to be able to display and set all link modes supported by kernel. The way arbitrary length bitsets are implemented in netlink interface, this will be no longer needed. To allow userspace to access all link modes running kernel supports, add table of ethernet link mode names and make it available as a string set to userspace GET_STRSET requests. Add build time check to make sure names are defined for all modes declared in enum ethtool_link_mode_bit_indices. Once the string set is available, make it also accessible via ioctl. Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Reviewed-by: Jiri Pirko Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index d4591792f0b4..f44155840b07 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -593,6 +593,7 @@ struct ethtool_pauseparam { * @ETH_SS_RSS_HASH_FUNCS: RSS hush function names * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS * @ETH_SS_PHY_TUNABLES: PHY tunable names + * @ETH_SS_LINK_MODES: link mode names */ enum ethtool_stringset { ETH_SS_TEST = 0, @@ -604,6 +605,7 @@ enum ethtool_stringset { ETH_SS_TUNABLES, ETH_SS_PHY_STATS, ETH_SS_PHY_TUNABLES, + ETH_SS_LINK_MODES, }; /** -- cgit From 826f66b30c2e3d4df533e4224cb8c734afa0a17c Mon Sep 17 00:00:00 2001 From: Andy Roulin Date: Wed, 11 Dec 2019 14:30:58 -0800 Subject: bonding: move 802.3ad port state flags to uapi The bond slave actor/partner operating state is exported as bitfield to userspace, which lacks a way to interpret it, e.g., iproute2 only prints the state as a number: ad_actor_oper_port_state 15 For userspace to interpret the bitfield, the bitfield definitions should be part of the uapi. The bitfield itself is defined in the 802.3ad standard. This commit moves the 802.3ad bitfield definitions to uapi. Related iproute2 patches, soon to be posted upstream, use the new uapi headers to pretty-print bond slave state, e.g., with ip -d link show ad_actor_oper_port_state_str Signed-off-by: Andy Roulin Acked-by: Roopa Prabhu Acked-by: Jay Vosburgh Signed-off-by: Jakub Kicinski --- include/uapi/linux/if_bonding.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bonding.h b/include/uapi/linux/if_bonding.h index 790585f0e61b..6829213a54c5 100644 --- a/include/uapi/linux/if_bonding.h +++ b/include/uapi/linux/if_bonding.h @@ -95,6 +95,16 @@ #define BOND_XMIT_POLICY_ENCAP23 3 /* encapsulated layer 2+3 */ #define BOND_XMIT_POLICY_ENCAP34 4 /* encapsulated layer 3+4 */ +/* 802.3ad port state definitions (43.4.2.2 in the 802.3ad standard) */ +#define AD_STATE_LACP_ACTIVITY 0x1 +#define AD_STATE_LACP_TIMEOUT 0x2 +#define AD_STATE_AGGREGATION 0x4 +#define AD_STATE_SYNCHRONIZATION 0x8 +#define AD_STATE_COLLECTING 0x10 +#define AD_STATE_DISTRIBUTING 0x20 +#define AD_STATE_DEFAULTED 0x40 +#define AD_STATE_EXPIRED 0x80 + typedef struct ifbond { __s32 bond_mode; __s32 num_slaves; -- cgit From de1799667b002331609e8ee6b924ccfd51968d99 Mon Sep 17 00:00:00 2001 From: Vivien Didelot Date: Wed, 11 Dec 2019 20:07:10 -0500 Subject: net: bridge: add STP xstats This adds rx_bpdu, tx_bpdu, rx_tcn, tx_tcn, transition_blk, transition_fwd xstats counters to the bridge ports copied over via netlink, providing useful information for STP. Signed-off-by: Vivien Didelot Acked-by: Nikolay Aleksandrov Signed-off-by: Jakub Kicinski --- include/uapi/linux/if_bridge.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index 1b3c2b643a02..4a58e3d7de46 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -156,6 +156,15 @@ struct bridge_vlan_xstats { __u32 pad2; }; +struct bridge_stp_xstats { + __u64 transition_blk; + __u64 transition_fwd; + __u64 rx_bpdu; + __u64 tx_bpdu; + __u64 rx_tcn; + __u64 tx_tcn; +}; + /* Bridge multicast database attributes * [MDBA_MDB] = { * [MDBA_MDB_ENTRY] = { @@ -262,6 +271,7 @@ enum { BRIDGE_XSTATS_VLAN, BRIDGE_XSTATS_MCAST, BRIDGE_XSTATS_PAD, + BRIDGE_XSTATS_STP, __BRIDGE_XSTATS_MAX }; #define BRIDGE_XSTATS_MAX (__BRIDGE_XSTATS_MAX - 1) -- cgit From 166750bc1dd256b2184b22588fb9fe6d3fbb93ae Mon Sep 17 00:00:00 2001 From: Andrii Nakryiko Date: Fri, 13 Dec 2019 17:47:08 -0800 Subject: libbpf: Support libbpf-provided extern variables Add support for extern variables, provided to BPF program by libbpf. Currently the following extern variables are supported: - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte long; - CONFIG_xxx values; a set of values of actual kernel config. Tristate, boolean, strings, and integer values are supported. Set of possible values is determined by declared type of extern variable. Supported types of variables are: - Tristate values. Are represented as `enum libbpf_tristate`. Accepted values are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO, or TRI_MODULE, respectively. - Boolean values. Are represented as bool (_Bool) types. Accepted values are 'y' and 'n' only, turning into true/false values, respectively. - Single-character values. Can be used both as a substritute for bool/tristate, or as a small-range integer: - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm'; - integers in a range [-128, 127] or [0, 255] (depending on signedness of char in target architecture) are recognized and represented with respective values of char type. - Strings. String values are declared as fixed-length char arrays. String of up to that length will be accepted and put in first N bytes of char array, with the rest of bytes zeroed out. If config string value is longer than space alloted, it will be truncated and warning message emitted. Char array is always zero terminated. String literals in config have to be enclosed in double quotes, just like C-style string literals. - Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and unsigned variants. Libbpf enforces parsed config value to be in the supported range of corresponding integer type. Integers values in config can be: - decimal integers, with optional + and - signs; - hexadecimal integers, prefixed with 0x or 0X; - octal integers, starting with 0. Config file itself is searched in /boot/config-$(uname -r) location with fallback to /proc/config.gz, unless config path is specified explicitly through bpf_object_open_opts' kernel_config_path option. Both gzipped and plain text formats are supported. Libbpf adds explicit dependency on zlib because of this, but this shouldn't be a problem, given libelf already depends on zlib. All detected extern variables, are put into a separate .extern internal map. It, similarly to .rodata map, is marked as read-only from BPF program side, as well as is frozen on load. This allows BPF verifier to track extern values as constants and perform enhanced branch prediction and dead code elimination. This can be relied upon for doing kernel version/feature detection and using potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF program, while still having a single version of BPF program running on old and new kernels. Selftests are validating this explicitly for unexisting BPF helper. Signed-off-by: Andrii Nakryiko Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com --- include/uapi/linux/btf.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index c02dec97e1ce..1a2898c482ee 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -142,7 +142,8 @@ struct btf_param { enum { BTF_VAR_STATIC = 0, - BTF_VAR_GLOBAL_ALLOCATED, + BTF_VAR_GLOBAL_ALLOCATED = 1, + BTF_VAR_GLOBAL_EXTERN = 2, }; /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe -- cgit From a2ec8b5706944d228181c8b91d815f41d6dd8e7b Mon Sep 17 00:00:00 2001 From: Josh Soref Date: Sun, 15 Dec 2019 22:08:02 +0100 Subject: wireguard: global: fix spelling mistakes in comments This fixes two spelling errors in source code comments. Signed-off-by: Josh Soref [Jason: rewrote commit message] Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller --- include/uapi/linux/wireguard.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h index dd8a47c4ad11..ae88be14c947 100644 --- a/include/uapi/linux/wireguard.h +++ b/include/uapi/linux/wireguard.h @@ -18,13 +18,13 @@ * one but not both of: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * * The kernel will then return several messages (NLM_F_MULTI) containing the * following tree of nested items: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN * WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN * WGDEVICE_A_LISTEN_PORT: NLA_U16 @@ -77,7 +77,7 @@ * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current * peers should be removed prior to adding the list below. * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove @@ -121,7 +121,7 @@ * filling in information not contained in the prior. Note that if * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably * should not be specified in fragments that come after, so that the list - * of peers is only cleared the first time but appened after. Likewise for + * of peers is only cleared the first time but appended after. Likewise for * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message * of a peer, it likely should not be specified in subsequent fragments. * -- cgit From dcc68b4d8084e1ac9af0d4022d6b1aff6a139a33 Mon Sep 17 00:00:00 2001 From: Petr Machata Date: Wed, 18 Dec 2019 14:55:13 +0000 Subject: net: sch_ets: Add a new Qdisc Introduces a new Qdisc, which is based on 802.1Q-2014 wording. It is PRIO-like in how it is configured, meaning one needs to specify how many bands there are, how many are strict and how many are dwrr, quanta for the latter, and priomap. The new Qdisc operates like the PRIO / DRR combo would when configured as per the standard. The strict classes, if any, are tried for traffic first. When there's no traffic in any of the strict queues, the ETS ones (if any) are treated in the same way as in DRR. Signed-off-by: Petr Machata Acked-by: Jiri Pirko Signed-off-by: David S. Miller --- include/uapi/linux/pkt_sched.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index 9f1a72876212..bf5a5b1dfb0b 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -1187,4 +1187,21 @@ enum { #define TCA_TAPRIO_ATTR_MAX (__TCA_TAPRIO_ATTR_MAX - 1) +/* ETS */ + +#define TCQ_ETS_MAX_BANDS 16 + +enum { + TCA_ETS_UNSPEC, + TCA_ETS_NBANDS, /* u8 */ + TCA_ETS_NSTRICT, /* u8 */ + TCA_ETS_QUANTA, /* nested TCA_ETS_QUANTA_BAND */ + TCA_ETS_QUANTA_BAND, /* u32 */ + TCA_ETS_PRIOMAP, /* nested TCA_ETS_PRIOMAP_BAND */ + TCA_ETS_PRIOMAP_BAND, /* u8 */ + __TCA_ETS_MAX, +}; + +#define TCA_ETS_MAX (__TCA_ETS_MAX - 1) + #endif -- cgit From 7dd68b3279f1792103d12e69933db3128c6d416e Mon Sep 17 00:00:00 2001 From: Andrey Ignatov Date: Wed, 18 Dec 2019 23:44:35 -0800 Subject: bpf: Support replacing cgroup-bpf program in MULTI mode The common use-case in production is to have multiple cgroup-bpf programs per attach type that cover multiple use-cases. Such programs are attached with BPF_F_ALLOW_MULTI and can be maintained by different people. Order of programs usually matters, for example imagine two egress programs: the first one drops packets and the second one counts packets. If they're swapped the result of counting program will be different. It brings operational challenges with updating cgroup-bpf program(s) attached with BPF_F_ALLOW_MULTI since there is no way to replace a program: * One way to update is to detach all programs first and then attach the new version(s) again in the right order. This introduces an interruption in the work a program is doing and may not be acceptable (e.g. if it's egress firewall); * Another way is attach the new version of a program first and only then detach the old version. This introduces the time interval when two versions of same program are working, what may not be acceptable if a program is not idempotent. It also imposes additional burden on program developers to make sure that two versions of their program can co-exist. Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to replace That way user can replace any program among those attached with BPF_F_ALLOW_MULTI flag without the problems described above. Details of the new API: * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid descriptor of BPF program, BPF_PROG_ATTACH will return corresponding error (EINVAL or EBADF). * If replace_bpf_fd has valid descriptor of BPF program but such a program is not attached to specified cgroup, BPF_PROG_ATTACH will return ENOENT. BPF_F_REPLACE is introduced to make the user intent clear, since replace_bpf_fd alone can't be used for this (its default value, 0, is a valid fd). BPF_F_REPLACE also makes it possible to extend the API in the future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed). Signed-off-by: Andrey Ignatov Signed-off-by: Alexei Starovoitov Acked-by: Martin KaFai Lau Acked-by: Andrii Narkyiko Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com --- include/uapi/linux/bpf.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index dbbcf0b02970..7df436da542d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -231,6 +231,11 @@ enum bpf_attach_type { * When children program makes decision (like picking TCP CA or sock bind) * parent program has a chance to override it. * + * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of + * programs for a cgroup. Though it's possible to replace an old program at + * any position by also specifying BPF_F_REPLACE flag and position itself in + * replace_bpf_fd attribute. Old program at this position will be released. + * * A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups. * A cgroup with NONE doesn't allow any programs in sub-cgroups. * Ex1: @@ -249,6 +254,7 @@ enum bpf_attach_type { */ #define BPF_F_ALLOW_OVERRIDE (1U << 0) #define BPF_F_ALLOW_MULTI (1U << 1) +#define BPF_F_REPLACE (1U << 2) /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the * verifier will perform strict alignment checking as if the kernel @@ -442,6 +448,10 @@ union bpf_attr { __u32 attach_bpf_fd; /* eBPF program to attach */ __u32 attach_type; __u32 attach_flags; + __u32 replace_bpf_fd; /* previously attached eBPF + * program to replace if + * BPF_F_REPLACE is used + */ }; struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */ -- cgit From e1b5e598e5a51b453328879682b178b4acc15105 Mon Sep 17 00:00:00 2001 From: John Rutherford Date: Thu, 19 Dec 2019 16:03:57 +1100 Subject: tipc: make legacy address flag readable over netlink To enable iproute2/tipc to generate backwards compatible printouts and validate command parameters for nodes using a node address, it needs to be able to read the legacy address flag from the kernel. The legacy address flag records the way in which the node identity was originally specified. The legacy address flag is requested by the netlink message TIPC_NL_ADDR_LEGACY_GET. If the flag is set the attribute TIPC_NLA_NET_ADDR_LEGACY is set in the return message. Signed-off-by: John Rutherford Acked-by: Jon Maloy Signed-off-by: David S. Miller --- include/uapi/linux/tipc_netlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h index 6c2194ab745b..dc0d23a50e69 100644 --- a/include/uapi/linux/tipc_netlink.h +++ b/include/uapi/linux/tipc_netlink.h @@ -65,6 +65,7 @@ enum { TIPC_NL_UDP_GET_REMOTEIP, TIPC_NL_KEY_SET, TIPC_NL_KEY_FLUSH, + TIPC_NL_ADDR_LEGACY_GET, __TIPC_NL_CMD_MAX, TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1 @@ -176,6 +177,7 @@ enum { TIPC_NLA_NET_ADDR, /* u32 */ TIPC_NLA_NET_NODEID, /* u64 */ TIPC_NLA_NET_NODEID_W1, /* u64 */ + TIPC_NLA_NET_ADDR_LEGACY, /* flag */ __TIPC_NLA_NET_MAX, TIPC_NLA_NET_MAX = __TIPC_NLA_NET_MAX - 1 -- cgit From f66b53fdbb22ced1a323b22b9de84a61aacd8d18 Mon Sep 17 00:00:00 2001 From: Martin Varghese Date: Sat, 21 Dec 2019 08:50:46 +0530 Subject: openvswitch: New MPLS actions for layer 2 tunnelling The existing PUSH MPLS action inserts MPLS header between ethernet header and the IP header. Though this behaviour is fine for L3 VPN where an IP packet is encapsulated inside a MPLS tunnel, it does not suffice the L2 VPN (l2 tunnelling) requirements. In L2 VPN the MPLS header should encapsulate the ethernet packet. The new mpls action ADD_MPLS inserts MPLS header at the start of the packet or at the start of the l3 header depending on the value of l3 tunnel flag in the ADD_MPLS arguments. POP_MPLS action is extended to support ethertype 0x6558. Signed-off-by: Martin Varghese Acked-by: Pravin B Shelar Signed-off-by: David S. Miller --- include/uapi/linux/openvswitch.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index a87b44cd5590..ae2bff14e7e1 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -672,6 +672,32 @@ struct ovs_action_push_mpls { __be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */ }; +/** + * struct ovs_action_add_mpls - %OVS_ACTION_ATTR_ADD_MPLS action + * argument. + * @mpls_lse: MPLS label stack entry to push. + * @mpls_ethertype: Ethertype to set in the encapsulating ethernet frame. + * @tun_flags: MPLS tunnel attributes. + * + * The only values @mpls_ethertype should ever be given are %ETH_P_MPLS_UC and + * %ETH_P_MPLS_MC, indicating MPLS unicast or multicast. Other are rejected. + */ +struct ovs_action_add_mpls { + __be32 mpls_lse; + __be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */ + __u16 tun_flags; +}; + +#define OVS_MPLS_L3_TUNNEL_FLAG_MASK (1 << 0) /* Flag to specify the place of + * insertion of MPLS header. + * When false, the MPLS header + * will be inserted at the start + * of the packet. + * When true, the MPLS header + * will be inserted at the start + * of the l3 header. + */ + /** * struct ovs_action_push_vlan - %OVS_ACTION_ATTR_PUSH_VLAN action argument. * @vlan_tpid: Tag protocol identifier (TPID) to push. @@ -892,6 +918,10 @@ struct check_pkt_len_arg { * @OVS_ACTION_ATTR_CHECK_PKT_LEN: Check the packet length and execute a set * of actions if greater than the specified packet length, else execute * another set of actions. + * @OVS_ACTION_ATTR_ADD_MPLS: Push a new MPLS label stack entry at the + * start of the packet or at the start of the l3 header depending on the value + * of l3 tunnel flag in the tun_flags field of OVS_ACTION_ATTR_ADD_MPLS + * argument. * * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all * fields within a header are modifiable, e.g. the IPv4 protocol and fragment @@ -927,6 +957,7 @@ enum ovs_action_attr { OVS_ACTION_ATTR_METER, /* u32 meter ID. */ OVS_ACTION_ATTR_CLONE, /* Nested OVS_CLONE_ATTR_*. */ OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */ + OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */ __OVS_ACTION_ATTR_MAX, /* Nothing past this will be accepted * from userspace. */ -- cgit From b6fd7b96366769651ab23988607ce9c5c9042cdb Mon Sep 17 00:00:00 2001 From: Richard Cochran Date: Wed, 25 Dec 2019 18:16:19 -0800 Subject: net: Introduce peer to peer one step PTP time stamping. The 1588 standard defines one step operation for both Sync and PDelay_Resp messages. Up until now, hardware with P2P one step has been rare, and kernel support was lacking. This patch adds support of the mode in anticipation of new hardware developments. Signed-off-by: Richard Cochran Signed-off-by: David S. Miller --- include/uapi/linux/net_tstamp.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h index e5b39721c6e4..f96e650d0af9 100644 --- a/include/uapi/linux/net_tstamp.h +++ b/include/uapi/linux/net_tstamp.h @@ -90,6 +90,14 @@ enum hwtstamp_tx_types { * queue. */ HWTSTAMP_TX_ONESTEP_SYNC, + + /* + * Same as HWTSTAMP_TX_ONESTEP_SYNC, but also enables time + * stamp insertion directly into PDelay_Resp packets. In this + * case, neither transmitted Sync nor PDelay_Resp packets will + * receive a time stamp via the socket error queue. + */ + HWTSTAMP_TX_ONESTEP_P2P, }; /* possible values for hwtstamp_config->rx_filter */ -- cgit From c14ceb0ec727187f71a487a592ffa91767fed66e Mon Sep 17 00:00:00 2001 From: Florian Westphal Date: Wed, 18 Dec 2019 12:05:21 +0100 Subject: netfilter: nft_meta: add support for slave device ifindex matching Allow to match on vrf slave ifindex or name. In case there was no slave interface involved, store 0 in the destination register just like existing iif/oif matching. sdif(name) is restricted to the ipv4/ipv6 input and forward hooks, as it depends on ip(6) stack parsing/storing info in skb->cb[]. Cc: Martin Willi Cc: David Ahern Cc: Shrijeet Mukherjee Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index bb9b049310df..e237ecbdcd8a 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -805,6 +805,8 @@ enum nft_exthdr_attributes { * @NFT_META_TIME_NS: time since epoch (in nanoseconds) * @NFT_META_TIME_DAY: day of week (from 0 = Sunday to 6 = Saturday) * @NFT_META_TIME_HOUR: hour of day (in seconds) + * @NFT_META_SDIF: slave device interface index + * @NFT_META_SDIFNAME: slave device interface name */ enum nft_meta_keys { NFT_META_LEN, @@ -840,6 +842,8 @@ enum nft_meta_keys { NFT_META_TIME_NS, NFT_META_TIME_DAY, NFT_META_TIME_HOUR, + NFT_META_SDIF, + NFT_META_SDIFNAME, }; /** -- cgit From c1e469902640ed0a82d0b8df5951199c17a49e3c Mon Sep 17 00:00:00 2001 From: Andy Roulin Date: Thu, 26 Dec 2019 05:41:57 -0800 Subject: bonding: rename AD_STATE_* to LACP_STATE_* As the LACP actor/partner state is now part of the uapi, rename the 3ad state defines with LACP prefix. The LACP prefix is preferred over BOND_3AD as the LACP standard moved to 802.1AX. Fixes: 826f66b30c2e3 ("bonding: move 802.3ad port state flags to uapi") Signed-off-by: Andy Roulin Signed-off-by: David S. Miller --- include/uapi/linux/if_bonding.h | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bonding.h b/include/uapi/linux/if_bonding.h index 6829213a54c5..45f3750aa861 100644 --- a/include/uapi/linux/if_bonding.h +++ b/include/uapi/linux/if_bonding.h @@ -96,14 +96,14 @@ #define BOND_XMIT_POLICY_ENCAP34 4 /* encapsulated layer 3+4 */ /* 802.3ad port state definitions (43.4.2.2 in the 802.3ad standard) */ -#define AD_STATE_LACP_ACTIVITY 0x1 -#define AD_STATE_LACP_TIMEOUT 0x2 -#define AD_STATE_AGGREGATION 0x4 -#define AD_STATE_SYNCHRONIZATION 0x8 -#define AD_STATE_COLLECTING 0x10 -#define AD_STATE_DISTRIBUTING 0x20 -#define AD_STATE_DEFAULTED 0x40 -#define AD_STATE_EXPIRED 0x80 +#define LACP_STATE_LACP_ACTIVITY 0x1 +#define LACP_STATE_LACP_TIMEOUT 0x2 +#define LACP_STATE_AGGREGATION 0x4 +#define LACP_STATE_SYNCHRONIZATION 0x8 +#define LACP_STATE_COLLECTING 0x10 +#define LACP_STATE_DISTRIBUTING 0x20 +#define LACP_STATE_DEFAULTED 0x40 +#define LACP_STATE_EXPIRED 0x80 typedef struct ifbond { __s32 bond_mode; -- cgit From 2b4a8990b7df55875745a80a609a1ceaaf51f322 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:18 +0100 Subject: ethtool: introduce ethtool netlink interface Basic genetlink and init infrastructure for the netlink interface, register genetlink family "ethtool". Add CONFIG_ETHTOOL_NETLINK Kconfig option to make the build optional. Add initial overall interface description into Documentation/networking/ethtool-netlink.rst, further patches will add more detailed information. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 include/uapi/linux/ethtool_netlink.h (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h new file mode 100644 index 000000000000..3c93276ba066 --- /dev/null +++ b/include/uapi/linux/ethtool_netlink.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */ +/* + * include/uapi/linux/ethtool_netlink.h - netlink interface for ethtool + * + * See Documentation/networking/ethtool-netlink.txt in kernel source tree for + * doucumentation of the interface. + */ + +#ifndef _UAPI_LINUX_ETHTOOL_NETLINK_H_ +#define _UAPI_LINUX_ETHTOOL_NETLINK_H_ + +#include + +/* message types - userspace to kernel */ +enum { + ETHTOOL_MSG_USER_NONE, + + /* add new constants above here */ + __ETHTOOL_MSG_USER_CNT, + ETHTOOL_MSG_USER_MAX = __ETHTOOL_MSG_USER_CNT - 1 +}; + +/* message types - kernel to userspace */ +enum { + ETHTOOL_MSG_KERNEL_NONE, + + /* add new constants above here */ + __ETHTOOL_MSG_KERNEL_CNT, + ETHTOOL_MSG_KERNEL_MAX = __ETHTOOL_MSG_KERNEL_CNT - 1 +}; + +/* generic netlink info */ +#define ETHTOOL_GENL_NAME "ethtool" +#define ETHTOOL_GENL_VERSION 1 + +#endif /* _UAPI_LINUX_ETHTOOL_NETLINK_H_ */ -- cgit From 041b1c5d4a53e97fc9e029ae32469552ca12cb9b Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:23 +0100 Subject: ethtool: helper functions for netlink interface Add common request/reply header definition and helpers to parse request header and fill reply header. Provide ethnl_update_* helpers to update structure members from request attributes (to be used for *_SET requests). Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 3c93276ba066..82fc3b5f41c9 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -29,6 +29,27 @@ enum { ETHTOOL_MSG_KERNEL_MAX = __ETHTOOL_MSG_KERNEL_CNT - 1 }; +/* request header */ + +/* use compact bitsets in reply */ +#define ETHTOOL_FLAG_COMPACT_BITSETS (1 << 0) +/* provide optional reply for SET or ACT requests */ +#define ETHTOOL_FLAG_OMIT_REPLY (1 << 1) + +#define ETHTOOL_FLAG_ALL (ETHTOOL_FLAG_COMPACT_BITSETS | \ + ETHTOOL_FLAG_OMIT_REPLY) + +enum { + ETHTOOL_A_HEADER_UNSPEC, + ETHTOOL_A_HEADER_DEV_INDEX, /* u32 */ + ETHTOOL_A_HEADER_DEV_NAME, /* string */ + ETHTOOL_A_HEADER_FLAGS, /* u32 - ETHTOOL_FLAG_* */ + + /* add new constants above here */ + __ETHTOOL_A_HEADER_CNT, + ETHTOOL_A_HEADER_MAX = __ETHTOOL_A_HEADER_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From 10b518d4e6dd5390e40f7d8de0f08753c1195a7e Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:28 +0100 Subject: ethtool: netlink bitset handling The ethtool netlink code uses common framework for passing arbitrary length bit sets to allow future extensions. A bitset can be a list (only one bitmap) or can consist of value and mask pair (used e.g. when client want to modify only some bits). A bitset can use one of two formats: verbose (bit by bit) or compact. Verbose format consists of bitset size (number of bits), list flag and an array of bit nests, telling which bits are part of the list or which bits are in the mask and which of them are to be set. In requests, bits can be identified by index (position) or by name. In replies, kernel provides both index and name. Verbose format is suitable for "one shot" applications like standard ethtool command as it avoids the need to either keep bit names (e.g. link modes) in sync with kernel or having to add an extra roundtrip for string set request (e.g. for private flags). Compact format uses one (list) or two (value/mask) arrays of 32-bit words to store the bitmap(s). It is more suitable for long running applications (ethtool in monitor mode or network management daemons) which can retrieve the names once and then pass only compact bitmaps to save space. Userspace requests can use either format; ETHTOOL_FLAG_COMPACT_BITSETS flag in request header tells kernel which format to use in reply. Notifications always use compact format. As some code uses arrays of unsigned long for internal representation and some arrays of u32 (or even a single u32), two sets of parse/compose helpers are introduced. To avoid code duplication, helpers for unsigned long arrays are implemented as wrappers around helpers for u32 arrays. There are two reasons for this choice: (1) u32 arrays are more frequent in ethtool code and (2) unsigned long array can be always interpreted as an u32 array on little endian 64-bit and all 32-bit architectures while we would need special handling for odd number of u32 words in the opposite direction. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 82fc3b5f41c9..951203049615 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -50,6 +50,41 @@ enum { ETHTOOL_A_HEADER_MAX = __ETHTOOL_A_HEADER_CNT - 1 }; +/* bit sets */ + +enum { + ETHTOOL_A_BITSET_BIT_UNSPEC, + ETHTOOL_A_BITSET_BIT_INDEX, /* u32 */ + ETHTOOL_A_BITSET_BIT_NAME, /* string */ + ETHTOOL_A_BITSET_BIT_VALUE, /* flag */ + + /* add new constants above here */ + __ETHTOOL_A_BITSET_BIT_CNT, + ETHTOOL_A_BITSET_BIT_MAX = __ETHTOOL_A_BITSET_BIT_CNT - 1 +}; + +enum { + ETHTOOL_A_BITSET_BITS_UNSPEC, + ETHTOOL_A_BITSET_BITS_BIT, /* nest - _A_BITSET_BIT_* */ + + /* add new constants above here */ + __ETHTOOL_A_BITSET_BITS_CNT, + ETHTOOL_A_BITSET_BITS_MAX = __ETHTOOL_A_BITSET_BITS_CNT - 1 +}; + +enum { + ETHTOOL_A_BITSET_UNSPEC, + ETHTOOL_A_BITSET_NOMASK, /* flag */ + ETHTOOL_A_BITSET_SIZE, /* u32 */ + ETHTOOL_A_BITSET_BITS, /* nest - _A_BITSET_BITS_* */ + ETHTOOL_A_BITSET_VALUE, /* binary */ + ETHTOOL_A_BITSET_MASK, /* binary */ + + /* add new constants above here */ + __ETHTOOL_A_BITSET_CNT, + ETHTOOL_A_BITSET_MAX = __ETHTOOL_A_BITSET_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From 6b08d6c146f4c5ed451c45339c10feb06d619db2 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:33 +0100 Subject: ethtool: support for netlink notifications Add infrastructure for ethtool netlink notifications. There is only one multicast group "monitor" which is used to notify userspace about changes and actions performed. Notification messages (types using suffix _NTF) share the format with replies to GET requests. Notifications are supposed to be broadcasted on every configuration change, whether it is done using the netlink interface or ioctl one. Netlink SET requests only trigger a notification if some data is actually changed. To trigger an ethtool notification, both ethtool netlink and external code use ethtool_notify() helper. This helper requires RTNL to be held and may sleep. Handlers sending messages for specific notification message types are registered in ethnl_notify_handlers array. As notifications can be triggered from other code, ethnl_ok flag is used to prevent an attempt to send notification before genetlink family is registered. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 951203049615..d530ccb6f7e6 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -89,4 +89,6 @@ enum { #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 +#define ETHTOOL_MCGRP_MONITOR_NAME "monitor" + #endif /* _UAPI_LINUX_ETHTOOL_NETLINK_H_ */ -- cgit From 71921690f9745fef60a2bad425f30adf8cdc9da0 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:43 +0100 Subject: ethtool: provide string sets with STRSET_GET request Requests a contents of one or more string sets, i.e. indexed arrays of strings; this information is provided by ETHTOOL_GSSET_INFO and ETHTOOL_GSTRINGS commands of ioctl interface. Unlike ioctl interface, all information can be retrieved with one request and mulitple string sets can be requested at once. There are three types of requests: - no NLM_F_DUMP, no device: get "global" stringsets - no NLM_F_DUMP, with device: get string sets related to the device - NLM_F_DUMP, no device: get device related string sets for all devices Client can request either all string sets of given type (global or device related) or only specific sets. With ETHTOOL_A_STRSET_COUNTS flag set, only set sizes (numbers of strings) are returned. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool.h | 3 ++ include/uapi/linux/ethtool_netlink.h | 56 ++++++++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index f44155840b07..116bcbf09c74 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -606,6 +606,9 @@ enum ethtool_stringset { ETH_SS_PHY_STATS, ETH_SS_PHY_TUNABLES, ETH_SS_LINK_MODES, + + /* add new constants above here */ + ETH_SS_COUNT }; /** diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index d530ccb6f7e6..cabef1fec42a 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -14,6 +14,7 @@ /* message types - userspace to kernel */ enum { ETHTOOL_MSG_USER_NONE, + ETHTOOL_MSG_STRSET_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -23,6 +24,7 @@ enum { /* message types - kernel to userspace */ enum { ETHTOOL_MSG_KERNEL_NONE, + ETHTOOL_MSG_STRSET_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -85,6 +87,60 @@ enum { ETHTOOL_A_BITSET_MAX = __ETHTOOL_A_BITSET_CNT - 1 }; +/* string sets */ + +enum { + ETHTOOL_A_STRING_UNSPEC, + ETHTOOL_A_STRING_INDEX, /* u32 */ + ETHTOOL_A_STRING_VALUE, /* string */ + + /* add new constants above here */ + __ETHTOOL_A_STRING_CNT, + ETHTOOL_A_STRING_MAX = __ETHTOOL_A_STRING_CNT - 1 +}; + +enum { + ETHTOOL_A_STRINGS_UNSPEC, + ETHTOOL_A_STRINGS_STRING, /* nest - _A_STRINGS_* */ + + /* add new constants above here */ + __ETHTOOL_A_STRINGS_CNT, + ETHTOOL_A_STRINGS_MAX = __ETHTOOL_A_STRINGS_CNT - 1 +}; + +enum { + ETHTOOL_A_STRINGSET_UNSPEC, + ETHTOOL_A_STRINGSET_ID, /* u32 */ + ETHTOOL_A_STRINGSET_COUNT, /* u32 */ + ETHTOOL_A_STRINGSET_STRINGS, /* nest - _A_STRINGS_* */ + + /* add new constants above here */ + __ETHTOOL_A_STRINGSET_CNT, + ETHTOOL_A_STRINGSET_MAX = __ETHTOOL_A_STRINGSET_CNT - 1 +}; + +enum { + ETHTOOL_A_STRINGSETS_UNSPEC, + ETHTOOL_A_STRINGSETS_STRINGSET, /* nest - _A_STRINGSET_* */ + + /* add new constants above here */ + __ETHTOOL_A_STRINGSETS_CNT, + ETHTOOL_A_STRINGSETS_MAX = __ETHTOOL_A_STRINGSETS_CNT - 1 +}; + +/* STRSET */ + +enum { + ETHTOOL_A_STRSET_UNSPEC, + ETHTOOL_A_STRSET_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_STRSET_STRINGSETS, /* nest - _A_STRINGSETS_* */ + ETHTOOL_A_STRSET_COUNTS_ONLY, /* flag */ + + /* add new constants above here */ + __ETHTOOL_A_STRSET_CNT, + ETHTOOL_A_STRSET_MAX = __ETHTOOL_A_STRSET_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From 459e0b81b37043545d90629fdfb243444151e77d Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:48 +0100 Subject: ethtool: provide link settings with LINKINFO_GET request Implement LINKINFO_GET netlink request to get basic link settings provided by ETHTOOL_GLINKSETTINGS and ETHTOOL_GSET ioctl commands. This request provides settings not directly related to autonegotiation and link mode selection: physical port, phy MDIO address, MDI(-X) status, MDI(-X) control and transceiver. LINKINFO_GET request can be used with NLM_F_DUMP (without device identification) to request the information for all devices in current network namespace providing the data. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index cabef1fec42a..1966532993e5 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -15,6 +15,7 @@ enum { ETHTOOL_MSG_USER_NONE, ETHTOOL_MSG_STRSET_GET, + ETHTOOL_MSG_LINKINFO_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -25,6 +26,7 @@ enum { enum { ETHTOOL_MSG_KERNEL_NONE, ETHTOOL_MSG_STRSET_GET_REPLY, + ETHTOOL_MSG_LINKINFO_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -141,6 +143,22 @@ enum { ETHTOOL_A_STRSET_MAX = __ETHTOOL_A_STRSET_CNT - 1 }; +/* LINKINFO */ + +enum { + ETHTOOL_A_LINKINFO_UNSPEC, + ETHTOOL_A_LINKINFO_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_LINKINFO_PORT, /* u8 */ + ETHTOOL_A_LINKINFO_PHYADDR, /* u8 */ + ETHTOOL_A_LINKINFO_TP_MDIX, /* u8 */ + ETHTOOL_A_LINKINFO_TP_MDIX_CTRL, /* u8 */ + ETHTOOL_A_LINKINFO_TRANSCEIVER, /* u8 */ + + /* add new constants above here */ + __ETHTOOL_A_LINKINFO_CNT, + ETHTOOL_A_LINKINFO_MAX = __ETHTOOL_A_LINKINFO_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From a53f3d41e4d3df46aba254ba51e7fbe45470fece Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:55:53 +0100 Subject: ethtool: set link settings with LINKINFO_SET request Implement LINKINFO_SET netlink request to set link settings queried by LINKINFO_GET message. Only physical port, phy MDIO address and MDI(-X) control can be set, attempt to modify MDI(-X) status and transceiver is rejected. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 1966532993e5..5b7806a5bef8 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -16,6 +16,7 @@ enum { ETHTOOL_MSG_USER_NONE, ETHTOOL_MSG_STRSET_GET, ETHTOOL_MSG_LINKINFO_GET, + ETHTOOL_MSG_LINKINFO_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, -- cgit From 73286734c1b0d009fd317c2a74e173cb22233dad Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:56:03 +0100 Subject: ethtool: add LINKINFO_NTF notification Send ETHTOOL_MSG_LINKINFO_NTF notification message whenever device link settings are modified using ETHTOOL_MSG_LINKINFO_SET netlink message or ETHTOOL_SLINKSETTINGS or ETHTOOL_SSET ioctl commands. The notification message has the same format as reply to LINKINFO_GET request. ETHTOOL_MSG_LINKINFO_SET netlink request only triggers the notification if there is a change but the ioctl command handlers do not check if there is an actual change and trigger the notification whenever the commands are executed. As all work is done by ethnl_default_notify() handler and callback functions introduced to handle LINKINFO_GET requests, all that remains is adding entries for ETHTOOL_MSG_LINKINFO_NTF into ethnl_notify_handlers and ethnl_default_notify_ops lookup tables and calls to ethtool_notify() where needed. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 5b7806a5bef8..d530fa30de36 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -28,6 +28,7 @@ enum { ETHTOOL_MSG_KERNEL_NONE, ETHTOOL_MSG_STRSET_GET_REPLY, ETHTOOL_MSG_LINKINFO_GET_REPLY, + ETHTOOL_MSG_LINKINFO_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, -- cgit From f625aa9be8c10f2e4dc677837e240730a25feda7 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:56:08 +0100 Subject: ethtool: provide link mode information with LINKMODES_GET request Implement LINKMODES_GET netlink request to get link modes related information provided by ETHTOOL_GLINKSETTINGS and ETHTOOL_GSET ioctl commands. This request provides supported, advertised and peer advertised link modes, autonegotiation flag, speed and duplex. LINKMODES_GET request can be used with NLM_F_DUMP (without device identification) to request the information for all devices in current network namespace providing the data. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index d530fa30de36..dc1cae052eee 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -17,6 +17,7 @@ enum { ETHTOOL_MSG_STRSET_GET, ETHTOOL_MSG_LINKINFO_GET, ETHTOOL_MSG_LINKINFO_SET, + ETHTOOL_MSG_LINKMODES_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -29,6 +30,7 @@ enum { ETHTOOL_MSG_STRSET_GET_REPLY, ETHTOOL_MSG_LINKINFO_GET_REPLY, ETHTOOL_MSG_LINKINFO_NTF, + ETHTOOL_MSG_LINKMODES_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -161,6 +163,22 @@ enum { ETHTOOL_A_LINKINFO_MAX = __ETHTOOL_A_LINKINFO_CNT - 1 }; +/* LINKMODES */ + +enum { + ETHTOOL_A_LINKMODES_UNSPEC, + ETHTOOL_A_LINKMODES_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_LINKMODES_AUTONEG, /* u8 */ + ETHTOOL_A_LINKMODES_OURS, /* bitset */ + ETHTOOL_A_LINKMODES_PEER, /* bitset */ + ETHTOOL_A_LINKMODES_SPEED, /* u32 */ + ETHTOOL_A_LINKMODES_DUPLEX, /* u8 */ + + /* add new constants above here */ + __ETHTOOL_A_LINKMODES_CNT, + ETHTOOL_A_LINKMODES_MAX = __ETHTOOL_A_LINKMODES_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From bfbcfe2032e70bd8598d680d39ac177d507e39ac Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:56:13 +0100 Subject: ethtool: set link modes related data with LINKMODES_SET request Implement LINKMODES_SET netlink request to set advertised linkmodes and related attributes as ETHTOOL_SLINKSETTINGS and ETHTOOL_SSET commands do. The request allows setting autonegotiation flag, speed, duplex and advertised link modes. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index dc1cae052eee..cddf978b98df 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -18,6 +18,7 @@ enum { ETHTOOL_MSG_LINKINFO_GET, ETHTOOL_MSG_LINKINFO_SET, ETHTOOL_MSG_LINKMODES_GET, + ETHTOOL_MSG_LINKMODES_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, -- cgit From 1b1b1847c8505df2e1dd8804838526bed22a8bd4 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:56:18 +0100 Subject: ethtool: add LINKMODES_NTF notification Send ETHTOOL_MSG_LINKMODES_NTF notification message whenever device link settings or advertised modes are modified using ETHTOOL_MSG_LINKMODES_SET netlink message or ETHTOOL_SLINKSETTINGS or ETHTOOL_SSET ioctl commands. The notification message has the same format as reply to LINKMODES_GET request. ETHTOOL_MSG_LINKMODES_SET netlink request only triggers the notification if there is a change but the ioctl command handlers do not check if there is an actual change and trigger the notification whenever the commands are executed. As all work is done by ethnl_default_notify() handler and callback functions introduced to handle LINKMODES_GET requests, all that remains is adding entries for ETHTOOL_MSG_LINKMODES_NTF into ethnl_notify_handlers and ethnl_default_notify_ops lookup tables and calls to ethtool_notify() where needed. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index cddf978b98df..35948df6d6e3 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -32,6 +32,7 @@ enum { ETHTOOL_MSG_LINKINFO_GET_REPLY, ETHTOOL_MSG_LINKINFO_NTF, ETHTOOL_MSG_LINKMODES_GET_REPLY, + ETHTOOL_MSG_LINKMODES_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, -- cgit From 3d2b847fb99cf2b28aa046e486636e555bc6ed1c Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Fri, 27 Dec 2019 15:56:23 +0100 Subject: ethtool: provide link state with LINKSTATE_GET request Implement LINKSTATE_GET netlink request to get link state information. At the moment, only link up flag as provided by ETHTOOL_GLINK ioctl command is returned. LINKSTATE_GET request can be used with NLM_F_DUMP (without device identification) to request the information for all devices in current network namespace providing the data. Signed-off-by: Michal Kubecek Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 35948df6d6e3..02f82f42a889 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -19,6 +19,7 @@ enum { ETHTOOL_MSG_LINKINFO_SET, ETHTOOL_MSG_LINKMODES_GET, ETHTOOL_MSG_LINKMODES_SET, + ETHTOOL_MSG_LINKSTATE_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -33,6 +34,7 @@ enum { ETHTOOL_MSG_LINKINFO_NTF, ETHTOOL_MSG_LINKMODES_GET_REPLY, ETHTOOL_MSG_LINKMODES_NTF, + ETHTOOL_MSG_LINKSTATE_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -181,6 +183,18 @@ enum { ETHTOOL_A_LINKMODES_MAX = __ETHTOOL_A_LINKMODES_CNT - 1 }; +/* LINKSTATE */ + +enum { + ETHTOOL_A_LINKSTATE_UNSPEC, + ETHTOOL_A_LINKSTATE_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_LINKSTATE_LINK, /* u8 */ + + /* add new constants above here */ + __ETHTOOL_A_LINKSTATE_CNT, + ETHTOOL_A_LINKSTATE_MAX = __ETHTOOL_A_LINKSTATE_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From 68e039f966cb577c91649a02591646ac3919f8c9 Mon Sep 17 00:00:00 2001 From: Sven Eckelmann Date: Wed, 1 Jan 2020 00:00:01 +0100 Subject: batman-adv: Update copyright years for 2020 Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich --- include/uapi/linux/batadv_packet.h | 2 +- include/uapi/linux/batman_adv.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/batadv_packet.h b/include/uapi/linux/batadv_packet.h index 2a15f01c2243..0ae34c85ef9e 100644 --- a/include/uapi/linux/batadv_packet.h +++ b/include/uapi/linux/batadv_packet.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) */ -/* Copyright (C) 2007-2019 B.A.T.M.A.N. contributors: +/* Copyright (C) 2007-2020 B.A.T.M.A.N. contributors: * * Marek Lindner, Simon Wunderlich */ diff --git a/include/uapi/linux/batman_adv.h b/include/uapi/linux/batman_adv.h index 67f4636758af..617c180ff0c8 100644 --- a/include/uapi/linux/batman_adv.h +++ b/include/uapi/linux/batman_adv.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: MIT */ -/* Copyright (C) 2016-2019 B.A.T.M.A.N. contributors: +/* Copyright (C) 2016-2020 B.A.T.M.A.N. contributors: * * Matthias Schiffer */ -- cgit From 6b102db50cdde3ba2f78631ed21222edf3a5fb51 Mon Sep 17 00:00:00 2001 From: David Ahern Date: Mon, 30 Dec 2019 14:14:29 -0800 Subject: net: Add device index to tcp_md5sig Add support for userspace to specify a device index to limit the scope of an entry via the TCP_MD5SIG_EXT setsockopt. The existing __tcpm_pad is renamed to tcpm_ifindex and the new field is only checked if the new TCP_MD5SIG_FLAG_IFINDEX is set in tcpm_flags. For now, the device index must point to an L3 master device (e.g., VRF). The API and error handling are setup to allow the constraint to be relaxed in the future to any device index. Signed-off-by: David Ahern Signed-off-by: David S. Miller --- include/uapi/linux/tcp.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index 74af1f759cee..d87184e673ca 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -317,14 +317,15 @@ enum { #define TCP_MD5SIG_MAXKEYLEN 80 /* tcp_md5sig extension flags for TCP_MD5SIG_EXT */ -#define TCP_MD5SIG_FLAG_PREFIX 1 /* address prefix length */ +#define TCP_MD5SIG_FLAG_PREFIX 0x1 /* address prefix length */ +#define TCP_MD5SIG_FLAG_IFINDEX 0x2 /* ifindex set */ struct tcp_md5sig { struct __kernel_sockaddr_storage tcpm_addr; /* address associated */ __u8 tcpm_flags; /* extension flags */ __u8 tcpm_prefixlen; /* address prefix */ __u16 tcpm_keylen; /* key length */ - __u32 __tcpm_pad; /* zero */ + int tcpm_ifindex; /* device index for scope */ __u8 tcpm_key[TCP_MD5SIG_MAXKEYLEN]; /* key (binary) */ }; -- cgit From 6c930994503d9b5bd34b1329427dd7d3d6d37cd4 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean Date: Mon, 6 Jan 2020 03:34:09 +0200 Subject: mii: Add helpers for parsing SGMII auto-negotiation Typically a MAC PCS auto-configures itself after it receives the negotiated copper-side link settings from the PHY, but some MAC devices are more special and need manual interpretation of the SGMII AN result. In other cases, the PCS exposes the entire tx_config_reg base page as it is transmitted on the wire during auto-negotiation, so it makes sense to be able to decode the equivalent lp_advertised bit mask from the raw u16 (of course, "lp" considering the PCS to be the local PHY). Therefore, add the bit definitions for the SGMII registers 4 and 5 (local device ability, link partner ability), as well as a link_mode conversion helper that can be used to feed the AN results into phy_resolve_aneg_linkmode. Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller --- include/uapi/linux/mii.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/mii.h b/include/uapi/linux/mii.h index 51b48e4be1f2..0b9c3beda345 100644 --- a/include/uapi/linux/mii.h +++ b/include/uapi/linux/mii.h @@ -131,6 +131,18 @@ #define NWAYTEST_LOOPBACK 0x0100 /* Enable loopback for N-way */ #define NWAYTEST_RESV2 0xfe00 /* Unused... */ +/* MAC and PHY tx_config_Reg[15:0] for SGMII in-band auto-negotiation.*/ +#define ADVERTISE_SGMII 0x0001 /* MAC can do SGMII */ +#define LPA_SGMII 0x0001 /* PHY can do SGMII */ +#define LPA_SGMII_DPX_SPD_MASK 0x1C00 /* SGMII duplex and speed bits */ +#define LPA_SGMII_10HALF 0x0000 /* Can do 10mbps half-duplex */ +#define LPA_SGMII_10FULL 0x1000 /* Can do 10mbps full-duplex */ +#define LPA_SGMII_100HALF 0x0400 /* Can do 100mbps half-duplex */ +#define LPA_SGMII_100FULL 0x1400 /* Can do 100mbps full-duplex */ +#define LPA_SGMII_1000HALF 0x0800 /* Can do 1000mbps half-duplex */ +#define LPA_SGMII_1000FULL 0x1800 /* Can do 1000mbps full-duplex */ +#define LPA_SGMII_LINK 0x8000 /* PHY link with copper-side partner */ + /* 1000BASE-T Control register */ #define ADVERTISE_1000FULL 0x0200 /* Advertise 1000BASE-T full duplex */ #define ADVERTISE_1000HALF 0x0100 /* Advertise 1000BASE-T half duplex */ -- cgit From 27ae7997a66174cb8afd6a75b3989f5e0c1b9e5a Mon Sep 17 00:00:00 2001 From: Martin KaFai Lau Date: Wed, 8 Jan 2020 16:35:03 -0800 Subject: bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS This patch allows the kernel's struct ops (i.e. func ptr) to be implemented in BPF. The first use case in this series is the "struct tcp_congestion_ops" which will be introduced in a latter patch. This patch introduces a new prog type BPF_PROG_TYPE_STRUCT_OPS. The BPF_PROG_TYPE_STRUCT_OPS prog is verified against a particular func ptr of a kernel struct. The attr->attach_btf_id is the btf id of a kernel struct. The attr->expected_attach_type is the member "index" of that kernel struct. The first member of a struct starts with member index 0. That will avoid ambiguity when a kernel struct has multiple func ptrs with the same func signature. For example, a BPF_PROG_TYPE_STRUCT_OPS prog is written to implement the "init" func ptr of the "struct tcp_congestion_ops". The attr->attach_btf_id is the btf id of the "struct tcp_congestion_ops" of the _running_ kernel. The attr->expected_attach_type is 3. The ctx of BPF_PROG_TYPE_STRUCT_OPS is an array of u64 args saved by arch_prepare_bpf_trampoline that will be done in the next patch when introducing BPF_MAP_TYPE_STRUCT_OPS. "struct bpf_struct_ops" is introduced as a common interface for the kernel struct that supports BPF_PROG_TYPE_STRUCT_OPS prog. The supporting kernel struct will need to implement an instance of the "struct bpf_struct_ops". The supporting kernel struct also needs to implement a bpf_verifier_ops. During BPF_PROG_LOAD, bpf_struct_ops_find() will find the right bpf_verifier_ops by searching the attr->attach_btf_id. A new "btf_struct_access" is also added to the bpf_verifier_ops such that the supporting kernel struct can optionally provide its own specific check on accessing the func arg (e.g. provide limited write access). After btf_vmlinux is parsed, the new bpf_struct_ops_init() is called to initialize some values (e.g. the btf id of the supporting kernel struct) and it can only be done once the btf_vmlinux is available. The R0 checks at BPF_EXIT is excluded for the BPF_PROG_TYPE_STRUCT_OPS prog if the return type of the prog->aux->attach_func_proto is "void". Signed-off-by: Martin KaFai Lau Signed-off-by: Alexei Starovoitov Acked-by: Andrii Nakryiko Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20200109003503.3855825-1-kafai@fb.com --- include/uapi/linux/bpf.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7df436da542d..c1eeb3e0e116 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -174,6 +174,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, BPF_PROG_TYPE_CGROUP_SOCKOPT, BPF_PROG_TYPE_TRACING, + BPF_PROG_TYPE_STRUCT_OPS, }; enum bpf_attach_type { -- cgit From 85d33df357b634649ddbe0a20fd2d0fc5732c3cb Mon Sep 17 00:00:00 2001 From: Martin KaFai Lau Date: Wed, 8 Jan 2020 16:35:05 -0800 Subject: bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS The patch introduces BPF_MAP_TYPE_STRUCT_OPS. The map value is a kernel struct with its func ptr implemented in bpf prog. This new map is the interface to register/unregister/introspect a bpf implemented kernel struct. The kernel struct is actually embedded inside another new struct (or called the "value" struct in the code). For example, "struct tcp_congestion_ops" is embbeded in: struct bpf_struct_ops_tcp_congestion_ops { refcount_t refcnt; enum bpf_struct_ops_state state; struct tcp_congestion_ops data; /* <-- kernel subsystem struct here */ } The map value is "struct bpf_struct_ops_tcp_congestion_ops". The "bpftool map dump" will then be able to show the state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g. number of tcp_sock in the tcp_congestion_ops case). This "value" struct is created automatically by a macro. Having a separate "value" struct will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding "void (*init)(void)" to "struct bpf_struct_ops_XYZ" to do some initialization works before registering the struct_ops to the kernel subsystem). The libbpf will take care of finding and populating the "struct bpf_struct_ops_XYZ" from "struct XYZ". Register a struct_ops to a kernel subsystem: 1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s) 2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the running kernel. Instead of reusing the attr->btf_value_type_id, btf_vmlinux_value_type_id s added such that attr->btf_fd can still be used as the "user" btf which could store other useful sysadmin/debug info that may be introduced in the furture, e.g. creation-date/compiler-details/map-creator...etc. 3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described in the running kernel btf. Populate the value of this object. The function ptr should be populated with the prog fds. 4. Call BPF_MAP_UPDATE with the object created in (3) as the map value. The key is always "0". During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's args as an array of u64 is generated. BPF_MAP_UPDATE also allows the specific struct_ops to do some final checks in "st_ops->init_member()" (e.g. ensure all mandatory func ptrs are implemented). If everything looks good, it will register this kernel struct to the kernel subsystem. The map will not allow further update from this point. Unregister a struct_ops from the kernel subsystem: BPF_MAP_DELETE with key "0". Introspect a struct_ops: BPF_MAP_LOOKUP_ELEM with key "0". The map value returned will have the prog _id_ populated as the func ptr. The map value state (enum bpf_struct_ops_state) will transit from: INIT (map created) => INUSE (map updated, i.e. reg) => TOBEFREE (map value deleted, i.e. unreg) The kernel subsystem needs to call bpf_struct_ops_get() and bpf_struct_ops_put() to manage the "refcnt" in the "struct bpf_struct_ops_XYZ". This patch uses a separate refcnt for the purose of tracking the subsystem usage. Another approach is to reuse the map->refcnt and then "show" (i.e. during map_lookup) the subsystem's usage by doing map->refcnt - map->usercnt to filter out the map-fd/pinned-map usage. However, that will also tie down the future semantics of map->refcnt and map->usercnt. The very first subsystem's refcnt (during reg()) holds one count to map->refcnt. When the very last subsystem's refcnt is gone, it will also release the map->refcnt. All bpf_prog will be freed when the map->refcnt reaches 0 (i.e. during map_free()). Here is how the bpftool map command will look like: [root@arch-fb-vm1 bpf]# bpftool map show 6: struct_ops name dctcp flags 0x0 key 4B value 256B max_entries 1 memlock 4096B btf_id 6 [root@arch-fb-vm1 bpf]# bpftool map dump id 6 [{ "value": { "refcnt": { "refs": { "counter": 1 } }, "state": 1, "data": { "list": { "next": 0, "prev": 0 }, "key": 0, "flags": 2, "init": 24, "release": 0, "ssthresh": 25, "cong_avoid": 30, "set_state": 27, "cwnd_event": 28, "in_ack_event": 26, "undo_cwnd": 29, "pkts_acked": 0, "min_tso_segs": 0, "sndbuf_expand": 0, "cong_control": 0, "get_info": 0, "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0 ], "owner": 0 } } } ] Misc Notes: * bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup. It does an inplace update on "*value" instead returning a pointer to syscall.c. Otherwise, it needs a separate copy of "zero" value for the BPF_STRUCT_OPS_STATE_INIT to avoid races. * The bpf_struct_ops_map_delete_elem() is also called without preempt_disable() from map_delete_elem(). It is because the "->unreg()" may requires sleepable context, e.g. the "tcp_unregister_congestion_control()". * "const" is added to some of the existing "struct btf_func_model *" function arg to avoid a compiler warning caused by this patch. Signed-off-by: Martin KaFai Lau Signed-off-by: Alexei Starovoitov Acked-by: Andrii Nakryiko Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com --- include/uapi/linux/bpf.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c1eeb3e0e116..38059880963e 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -136,6 +136,7 @@ enum bpf_map_type { BPF_MAP_TYPE_STACK, BPF_MAP_TYPE_SK_STORAGE, BPF_MAP_TYPE_DEVMAP_HASH, + BPF_MAP_TYPE_STRUCT_OPS, }; /* Note that tracing related programs such as @@ -398,6 +399,10 @@ union bpf_attr { __u32 btf_fd; /* fd pointing to a BTF type data */ __u32 btf_key_type_id; /* BTF type_id of the key */ __u32 btf_value_type_id; /* BTF type_id of the value */ + __u32 btf_vmlinux_value_type_id;/* BTF type_id of a kernel- + * struct stored as the + * map value + */ }; struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */ @@ -3350,7 +3355,7 @@ struct bpf_map_info { __u32 map_flags; char name[BPF_OBJ_NAME_LEN]; __u32 ifindex; - __u32 :32; + __u32 btf_vmlinux_value_type_id; __u64 netns_dev; __u64 netns_ino; __u32 btf_id; -- cgit From 206057fe020ac5c037d5e2dd6562a9bd216ec765 Mon Sep 17 00:00:00 2001 From: Martin KaFai Lau Date: Wed, 8 Jan 2020 16:45:51 -0800 Subject: bpf: Add BPF_FUNC_tcp_send_ack helper Add a helper to send out a tcp-ack. It will be used in the later bpf_dctcp implementation that requires to send out an ack when the CE state changed. Signed-off-by: Martin KaFai Lau Signed-off-by: Alexei Starovoitov Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20200109004551.3900448-1-kafai@fb.com --- include/uapi/linux/bpf.h | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 38059880963e..2d6a2e572f56 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2837,6 +2837,14 @@ union bpf_attr { * Return * On success, the strictly positive length of the string, including * the trailing NUL character. On error, a negative value. + * + * int bpf_tcp_send_ack(void *tp, u32 rcv_nxt) + * Description + * Send out a tcp-ack. *tp* is the in-kernel struct tcp_sock. + * *rcv_nxt* is the ack_seq to be sent out. + * Return + * 0 on success, or a negative error in case of failure. + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -2954,7 +2962,8 @@ union bpf_attr { FN(probe_read_user), \ FN(probe_read_kernel), \ FN(probe_read_user_str), \ - FN(probe_read_kernel_str), + FN(probe_read_kernel_str), \ + FN(tcp_send_ack), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call -- cgit From f5bfcd953d811dbb8913de36b96b38da6bb62135 Mon Sep 17 00:00:00 2001 From: Andrey Ignatov Date: Tue, 7 Jan 2020 17:40:06 -0800 Subject: bpf: Document BPF_F_QUERY_EFFECTIVE flag Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects attach_flags what may not be obvious and what may lead to confision. Specifically attach_flags is returned only for target_fd but if programs are inherited from an ancestor cgroup then returned attach_flags for current cgroup may be confusing. For example, two effective programs of same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in attach_flags. Simple repro: # bpftool c s /sys/fs/cgroup/path/to/task ID AttachType AttachFlags Name # bpftool c s /sys/fs/cgroup/path/to/task effective ID AttachType AttachFlags Name 95043 ingress tw_ipt_ingress 95048 ingress tw_ingress Signed-off-by: Andrey Ignatov Signed-off-by: Alexei Starovoitov Acked-by: Song Liu Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com --- include/uapi/linux/bpf.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 2d6a2e572f56..52966e758fe5 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -359,7 +359,12 @@ enum bpf_attach_type { /* Enable memory-mapping BPF map */ #define BPF_F_MMAPABLE (1U << 10) -/* flags for BPF_PROG_QUERY */ +/* Flags for BPF_PROG_QUERY. */ + +/* Query effective (directly attached + inherited from ancestor cgroups) + * programs that will be executed for events within a cgroup. + * attach_flags with this flag are returned only for directly attached programs. + */ #define BPF_F_QUERY_EFFECTIVE (1U << 0) enum bpf_stack_build_id_status { -- cgit From faf391c3826cd29feae02078ca2022d2f912f7cc Mon Sep 17 00:00:00 2001 From: Mat Martineau Date: Thu, 9 Jan 2020 07:59:16 -0800 Subject: tcp: Define IPPROTO_MPTCP To open a MPTCP socket with socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP), IPPROTO_MPTCP needs a value that differs from IPPROTO_TCP. The existing IPPROTO numbers mostly map directly to IANA-specified protocol numbers. MPTCP does not have a protocol number allocated because MPTCP packets use the TCP protocol number. Use private number not used OTA. Reviewed-by: Eric Dumazet Signed-off-by: Mat Martineau Signed-off-by: David S. Miller --- include/uapi/linux/in.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h index e7ad9d350a28..1521073b6348 100644 --- a/include/uapi/linux/in.h +++ b/include/uapi/linux/in.h @@ -76,6 +76,8 @@ enum { #define IPPROTO_MPLS IPPROTO_MPLS IPPROTO_RAW = 255, /* Raw IP packets */ #define IPPROTO_RAW IPPROTO_RAW + IPPROTO_MPTCP = 262, /* Multipath TCP connection */ +#define IPPROTO_MPTCP IPPROTO_MPTCP IPPROTO_MAX }; #endif -- cgit From 51c39bb1d5d105a02e29aa7960f0a395086e6342 Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Thu, 9 Jan 2020 22:41:20 -0800 Subject: bpf: Introduce function-by-function verification New llvm and old llvm with libbpf help produce BTF that distinguish global and static functions. Unlike arguments of static function the arguments of global functions cannot be removed or optimized away by llvm. The compiler has to use exactly the arguments specified in a function prototype. The argument type information allows the verifier validate each global function independently. For now only supported argument types are pointer to context and scalars. In the future pointers to structures, sizes, pointer to packet data can be supported as well. Consider the following example: static int f1(int ...) { ... } int f3(int b); int f2(int a) { f1(a) + f3(a); } int f3(int b) { ... } int main(...) { f1(...) + f2(...) + f3(...); } The verifier will start its safety checks from the first global function f2(). It will recursively descend into f1() because it's static. Then it will check that arguments match for the f3() invocation inside f2(). It will not descend into f3(). It will finish f2() that has to be successfully verified for all possible values of 'a'. Then it will proceed with f3(). That function also has to be safe for all possible values of 'b'. Then it will start subprog 0 (which is main() function). It will recursively descend into f1() and will skip full check of f2() and f3(), since they are global. The order of processing global functions doesn't affect safety, since all global functions must be proven safe based on their arguments only. Such function by function verification can drastically improve speed of the verification and reduce complexity. Note that the stack limit of 512 still applies to the call chain regardless whether functions were static or global. The nested level of 8 also still applies. The same recursion prevention checks are in place as well. The type information and static/global kind is preserved after the verification hence in the above example global function f2() and f3() can be replaced later by equivalent functions with the same types that are loaded and verified later without affecting safety of this main() program. Such replacement (re-linking) of global functions is a subject of future patches. Signed-off-by: Alexei Starovoitov Signed-off-by: Daniel Borkmann Acked-by: Song Liu Link: https://lore.kernel.org/bpf/20200110064124.1760511-3-ast@kernel.org --- include/uapi/linux/btf.h | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index 1a2898c482ee..5a667107ad2c 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -146,6 +146,12 @@ enum { BTF_VAR_GLOBAL_EXTERN = 2, }; +enum btf_func_linkage { + BTF_FUNC_STATIC = 0, + BTF_FUNC_GLOBAL = 1, + BTF_FUNC_EXTERN = 2, +}; + /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe * additional information related to the variable such as its linkage. */ -- cgit From 76564261a7db80c5f5c624e0122a28787f266bdf Mon Sep 17 00:00:00 2001 From: Antoine Tenart Date: Mon, 13 Jan 2020 23:31:40 +0100 Subject: net: macsec: introduce the macsec_context structure This patch introduces the macsec_context structure. It will be used in the kernel to exchange information between the common MACsec implementation (macsec.c) and the MACsec hardware offloading implementations. This structure contains pointers to MACsec specific structures which contain the actual MACsec configuration, and to the underlying device (phydev for now). Signed-off-by: Antoine Tenart Signed-off-by: David S. Miller --- include/uapi/linux/if_link.h | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 1d69f637c5d6..024af2d1d0af 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -486,6 +486,13 @@ enum macsec_validation_type { MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1, }; +enum macsec_offload { + MACSEC_OFFLOAD_OFF = 0, + MACSEC_OFFLOAD_PHY = 1, + __MACSEC_OFFLOAD_END, + MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1, +}; + /* IPVLAN section */ enum { IFLA_IPVLAN_UNSPEC, -- cgit From dcb780fb279514f268826f2e9f4df3bc75610703 Mon Sep 17 00:00:00 2001 From: Antoine Tenart Date: Mon, 13 Jan 2020 23:31:44 +0100 Subject: net: macsec: add nla support for changing the offloading selection MACsec offloading to underlying hardware devices is disabled by default (the software implementation is used). This patch adds support for changing this setting through the MACsec netlink interface. Many checks are done when enabling offloading on a given MACsec interface as there are limitations (it must be supported by the hardware, only a single interface can be offloaded on a given physical device at a time, rules can't be moved for now). Signed-off-by: Antoine Tenart Signed-off-by: David S. Miller --- include/uapi/linux/if_macsec.h | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_macsec.h b/include/uapi/linux/if_macsec.h index 98e4d5d7c45c..1d63c43c38cc 100644 --- a/include/uapi/linux/if_macsec.h +++ b/include/uapi/linux/if_macsec.h @@ -45,6 +45,7 @@ enum macsec_attrs { MACSEC_ATTR_RXSC_LIST, /* dump, nested, macsec_rxsc_attrs for each RXSC */ MACSEC_ATTR_TXSC_STATS, /* dump, nested, macsec_txsc_stats_attr */ MACSEC_ATTR_SECY_STATS, /* dump, nested, macsec_secy_stats_attr */ + MACSEC_ATTR_OFFLOAD, /* config, nested, macsec_offload_attrs */ __MACSEC_ATTR_END, NUM_MACSEC_ATTR = __MACSEC_ATTR_END, MACSEC_ATTR_MAX = __MACSEC_ATTR_END - 1, @@ -97,6 +98,15 @@ enum macsec_sa_attrs { MACSEC_SA_ATTR_MAX = __MACSEC_SA_ATTR_END - 1, }; +enum macsec_offload_attrs { + MACSEC_OFFLOAD_ATTR_UNSPEC, + MACSEC_OFFLOAD_ATTR_TYPE, /* config/dump, u8 0..2 */ + MACSEC_OFFLOAD_ATTR_PAD, + __MACSEC_OFFLOAD_ATTR_END, + NUM_MACSEC_OFFLOAD_ATTR = __MACSEC_OFFLOAD_ATTR_END, + MACSEC_OFFLOAD_ATTR_MAX = __MACSEC_OFFLOAD_ATTR_END - 1, +}; + enum macsec_nl_commands { MACSEC_CMD_GET_TXSC, MACSEC_CMD_ADD_RXSC, @@ -108,6 +118,7 @@ enum macsec_nl_commands { MACSEC_CMD_ADD_RXSA, MACSEC_CMD_DEL_RXSA, MACSEC_CMD_UPD_RXSA, + MACSEC_CMD_UPD_OFFLOAD, }; /* u64 per-RXSC stats */ -- cgit From 90b93f1b31f86dcde2fa3c57f1ae33d28d87a1f8 Mon Sep 17 00:00:00 2001 From: Ido Schimmel Date: Tue, 14 Jan 2020 13:23:11 +0200 Subject: ipv4: Add "offload" and "trap" indications to routes When performing L3 offload, routes and nexthops are usually programmed into two different tables in the underlying device. Therefore, the fact that a nexthop resides in hardware does not necessarily mean that all the associated routes also reside in hardware and vice-versa. While the kernel can signal to user space the presence of a nexthop in hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag for routes. In addition, the fact that a route resides in hardware does not necessarily mean that the traffic is offloaded. For example, unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap packets to the CPU so that the kernel will be able to generate the appropriate ICMP error packet. This patch adds an "offload" and "trap" indications to IPv4 routes, so that users will have better visibility into the offload process. 'struct fib_alias' is extended with two new fields that indicate if the route resides in hardware or not and if it is offloading traffic from the kernel or trapping packets to it. Note that the new fields are added in the 6 bytes hole and therefore the struct still fits in a single cache line [1]. Capable drivers are expected to invoke fib_alias_hw_flags_set() with the route's key in order to set the flags. The indications are dumped to user space via a new flags (i.e., 'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the ancillary header. v2: * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set() [1] struct fib_alias { struct hlist_node fa_list; /* 0 16 */ struct fib_info * fa_info; /* 16 8 */ u8 fa_tos; /* 24 1 */ u8 fa_type; /* 25 1 */ u8 fa_state; /* 26 1 */ u8 fa_slen; /* 27 1 */ u32 tb_id; /* 28 4 */ s16 fa_default; /* 32 2 */ u8 offload:1; /* 34: 0 1 */ u8 trap:1; /* 34: 1 1 */ u8 unused:6; /* 34: 2 1 */ /* XXX 5 bytes hole, try to pack */ struct callback_head rcu __attribute__((__aligned__(8))); /* 40 16 */ /* size: 56, cachelines: 1, members: 12 */ /* sum members: 50, holes: 1, sum holes: 5 */ /* sum bitfield members: 8 bits (1 bytes) */ /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */ /* last cacheline: 56 bytes */ } __attribute__((__aligned__(8))); Signed-off-by: Ido Schimmel Reviewed-by: David Ahern Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller --- include/uapi/linux/rtnetlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 1418a8362bb7..cd43321d20dd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -309,6 +309,8 @@ enum rt_scope_t { #define RTM_F_PREFIX 0x800 /* Prefix addresses */ #define RTM_F_LOOKUP_TABLE 0x1000 /* set rtm_table to FIB lookup result */ #define RTM_F_FIB_MATCH 0x2000 /* return full fib lookup match */ +#define RTM_F_OFFLOAD 0x4000 /* route is offloaded */ +#define RTM_F_TRAP 0x8000 /* route is trapping packets */ /* Reserved table identifiers */ -- cgit From 8dcea187088bce5d2f1149294ad109f022653547 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Tue, 14 Jan 2020 19:56:09 +0200 Subject: net: bridge: vlan: add rtm definitions and dump support This patch adds vlan rtm definitions: - NEWVLAN: to be used for creating vlans, setting options and notifications - DELVLAN: to be used for deleting vlans - GETVLAN: used for dumping vlan information Dumping vlans which can span multiple messages is added now with basic information (vid and flags). We use nlmsg_parse() to validate the header length in order to be able to extend the message with filtering attributes later. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- include/uapi/linux/if_bridge.h | 28 ++++++++++++++++++++++++++++ include/uapi/linux/rtnetlink.h | 7 +++++++ 2 files changed, 35 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index 4a58e3d7de46..4da04f77d9ee 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -165,6 +165,34 @@ struct bridge_stp_xstats { __u64 tx_tcn; }; +/* Bridge vlan RTM header */ +struct br_vlan_msg { + __u8 family; + __u8 reserved1; + __u16 reserved2; + __u32 ifindex; +}; + +/* Bridge vlan RTM attributes + * [BRIDGE_VLANDB_ENTRY] = { + * [BRIDGE_VLANDB_ENTRY_INFO] + * ... + * } + */ +enum { + BRIDGE_VLANDB_UNSPEC, + BRIDGE_VLANDB_ENTRY, + __BRIDGE_VLANDB_MAX, +}; +#define BRIDGE_VLANDB_MAX (__BRIDGE_VLANDB_MAX - 1) + +enum { + BRIDGE_VLANDB_ENTRY_UNSPEC, + BRIDGE_VLANDB_ENTRY_INFO, + __BRIDGE_VLANDB_ENTRY_MAX, +}; +#define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1) + /* Bridge multicast database attributes * [MDBA_MDB] = { * [MDBA_MDB_ENTRY] = { diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index cd43321d20dd..b333cff14071 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -171,6 +171,13 @@ enum { RTM_GETLINKPROP, #define RTM_GETLINKPROP RTM_GETLINKPROP + RTM_NEWVLAN = 112, +#define RTM_NEWNVLAN RTM_NEWVLAN + RTM_DELVLAN, +#define RTM_DELVLAN RTM_DELVLAN + RTM_GETVLAN, +#define RTM_GETVLAN RTM_GETVLAN + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; -- cgit From 0ab558795184f87b532363e262357160d25f5d64 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Tue, 14 Jan 2020 19:56:12 +0200 Subject: net: bridge: vlan: add rtm range support Add a new vlandb nl attribute - BRIDGE_VLANDB_ENTRY_RANGE which causes RTM_NEWVLAN/DELVAN to act on a range. Dumps now automatically compress similar vlans into ranges. This will be also used when per-vlan options are introduced and vlans' options match, they will be put into a single range which is encapsulated in one netlink attribute. We need to run similar checks as br_process_vlan_info() does because these ranges will be used for options setting and they'll be able to skip br_process_vlan_info(). Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- include/uapi/linux/if_bridge.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index 4da04f77d9ee..ac38f0b674b8 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -189,6 +189,7 @@ enum { enum { BRIDGE_VLANDB_ENTRY_UNSPEC, BRIDGE_VLANDB_ENTRY_INFO, + BRIDGE_VLANDB_ENTRY_RANGE, __BRIDGE_VLANDB_ENTRY_MAX, }; #define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1) -- cgit From cf5bddb95cbe5340e486e84d46cf2f0bb324078c Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Tue, 14 Jan 2020 19:56:13 +0200 Subject: net: bridge: vlan: add rtnetlink group and notify support Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN and add support for sending vlan notifications (both single and ranges). No functional changes intended, the notification support will be used by later patches. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- include/uapi/linux/rtnetlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index b333cff14071..4a8c5b745157 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -730,6 +730,8 @@ enum rtnetlink_groups { #define RTNLGRP_IPV6_MROUTE_R RTNLGRP_IPV6_MROUTE_R RTNLGRP_NEXTHOP, #define RTNLGRP_NEXTHOP RTNLGRP_NEXTHOP + RTNLGRP_BRVLAN, +#define RTNLGRP_BRVLAN RTNLGRP_BRVLAN __RTNLGRP_MAX }; #define RTNLGRP_MAX (__RTNLGRP_MAX - 1) -- cgit From 8482941f09067da42f9c3362e15bfb3f3c19d610 Mon Sep 17 00:00:00 2001 From: Yonghong Song Date: Tue, 14 Jan 2020 19:50:02 -0800 Subject: bpf: Add bpf_send_signal_thread() helper Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper") added helper bpf_send_signal() which permits bpf program to send a signal to the current process. The signal may be delivered to any threads in the process. We found a use case where sending the signal to the current thread is more preferable. - A bpf program will collect the stack trace and then send signal to the user application. - The user application will add some thread specific information to the just collected stack trace for later analysis. If bpf_send_signal() is used, user application will need to check whether the thread receiving the signal matches the thread collecting the stack by checking thread id. If not, it will need to send signal to another thread through pthread_kill(). This patch proposed a new helper bpf_send_signal_thread(), which sends the signal to the thread corresponding to the current kernel task. This way, user space is guaranteed that bpf_program execution context and user space signal handling context are the same thread. Signed-off-by: Yonghong Song Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com --- include/uapi/linux/bpf.h | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 52966e758fe5..a88837d4dd18 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2714,7 +2714,8 @@ union bpf_attr { * * int bpf_send_signal(u32 sig) * Description - * Send signal *sig* to the current task. + * Send signal *sig* to the process of the current task. + * The signal may be delivered to any of this process's threads. * Return * 0 on success or successfully queued. * @@ -2850,6 +2851,19 @@ union bpf_attr { * Return * 0 on success, or a negative error in case of failure. * + * int bpf_send_signal_thread(u32 sig) + * Description + * Send signal *sig* to the thread corresponding to the current task. + * Return + * 0 on success or successfully queued. + * + * **-EBUSY** if work queue under nmi is full. + * + * **-EINVAL** if *sig* is invalid. + * + * **-EPERM** if no permission to send the *sig*. + * + * **-EAGAIN** if bpf program can try again. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -2968,7 +2982,8 @@ union bpf_attr { FN(probe_read_kernel), \ FN(probe_read_user_str), \ FN(probe_read_kernel_str), \ - FN(tcp_send_ack), + FN(tcp_send_ack), \ + FN(send_signal_thread), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call -- cgit From cb4d03ab499d4c040f4ab6fd4389d2b49f42b5a5 Mon Sep 17 00:00:00 2001 From: Brian Vazquez Date: Wed, 15 Jan 2020 10:43:01 -0800 Subject: bpf: Add generic support for lookup batch op This commit introduces generic support for the bpf_map_lookup_batch. This implementation can be used by almost all the bpf maps since its core implementation is relying on the existing map_get_next_key and map_lookup_elem. The bpf syscall subcommand introduced is: BPF_MAP_LOOKUP_BATCH The UAPI attribute is: struct { /* struct used by BPF_MAP_*_BATCH commands */ __aligned_u64 in_batch; /* start batch, * NULL to start from beginning */ __aligned_u64 out_batch; /* output: next start batch */ __aligned_u64 keys; __aligned_u64 values; __u32 count; /* input/output: * input: # of key/value * elements * output: # of filled elements */ __u32 map_fd; __u64 elem_flags; __u64 flags; } batch; in_batch/out_batch are opaque values use to communicate between user/kernel space, in_batch/out_batch must be of key_size length. To start iterating from the beginning in_batch must be null, count is the # of key/value elements to retrieve. Note that the 'keys' buffer must be a buffer of key_size * count size and the 'values' buffer must be value_size * count, where value_size must be aligned to 8 bytes by userspace if it's dealing with percpu maps. 'count' will contain the number of keys/values successfully retrieved. Note that 'count' is an input/output variable and it can contain a lower value after a call. If there's no more entries to retrieve, ENOENT will be returned. If error is ENOENT, count might be > 0 in case it copied some values but there were no more entries to retrieve. Note that if the return code is an error and not -EFAULT, count indicates the number of elements successfully processed. Suggested-by: Stanislav Fomichev Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20200115184308.162644-3-brianvv@google.com --- include/uapi/linux/bpf.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a88837d4dd18..0721b2c6bb8c 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -107,6 +107,7 @@ enum bpf_cmd { BPF_MAP_LOOKUP_AND_DELETE_ELEM, BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, + BPF_MAP_LOOKUP_BATCH, }; enum bpf_map_type { @@ -420,6 +421,23 @@ union bpf_attr { __u64 flags; }; + struct { /* struct used by BPF_MAP_*_BATCH commands */ + __aligned_u64 in_batch; /* start batch, + * NULL to start from beginning + */ + __aligned_u64 out_batch; /* output: next start batch */ + __aligned_u64 keys; + __aligned_u64 values; + __u32 count; /* input/output: + * input: # of key/value + * elements + * output: # of filled elements + */ + __u32 map_fd; + __u64 elem_flags; + __u64 flags; + } batch; + struct { /* anonymous struct used by BPF_PROG_LOAD command */ __u32 prog_type; /* one of enum bpf_prog_type */ __u32 insn_cnt; -- cgit From aa2e93b8e58e18442edfb2427446732415bc215e Mon Sep 17 00:00:00 2001 From: Brian Vazquez Date: Wed, 15 Jan 2020 10:43:02 -0800 Subject: bpf: Add generic support for update and delete batch ops This commit adds generic support for update and delete batch ops that can be used for almost all the bpf maps. These commands share the same UAPI attr that lookup and lookup_and_delete batch ops use and the syscall commands are: BPF_MAP_UPDATE_BATCH BPF_MAP_DELETE_BATCH The main difference between update/delete and lookup batch ops is that for update/delete keys/values must be specified for userspace and because of that, neither in_batch nor out_batch are used. Suggested-by: Stanislav Fomichev Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20200115184308.162644-4-brianvv@google.com --- include/uapi/linux/bpf.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 0721b2c6bb8c..d5320b0be459 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -108,6 +108,8 @@ enum bpf_cmd { BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, BPF_MAP_LOOKUP_BATCH, + BPF_MAP_UPDATE_BATCH, + BPF_MAP_DELETE_BATCH, }; enum bpf_map_type { -- cgit From 057996380a42bb64ccc04383cfa9c0ace4ea11f0 Mon Sep 17 00:00:00 2001 From: Yonghong Song Date: Wed, 15 Jan 2020 10:43:04 -0800 Subject: bpf: Add batch ops to all htab bpf map htab can't use generic batch support due some problematic behaviours inherent to the data structre, i.e. while iterating the bpf map a concurrent program might delete the next entry that batch was about to use, in that case there's no easy solution to retrieve the next entry, the issue has been discussed multiple times (see [1] and [2]). The only way hmap can be traversed without the problem previously exposed is by making sure that the map is traversing entire buckets. This commit implements those strict requirements for hmap, the implementation follows the same interaction that generic support with some exceptions: - If keys/values buffer are not big enough to traverse a bucket, ENOSPC will be returned. - out_batch contains the value of the next bucket in the iteration, not the next key, but this is transparent for the user since the user should never use out_batch for other than bpf batch syscalls. This commits implements BPF_MAP_LOOKUP_BATCH and adds support for new command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete batch ops it is possible to use the generic implementations. [1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/ [2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/ Signed-off-by: Yonghong Song Signed-off-by: Brian Vazquez Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com --- include/uapi/linux/bpf.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d5320b0be459..033d90a2282d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -108,6 +108,7 @@ enum bpf_cmd { BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, BPF_MAP_LOOKUP_BATCH, + BPF_MAP_LOOKUP_AND_DELETE_BATCH, BPF_MAP_UPDATE_BATCH, BPF_MAP_DELETE_BATCH, }; -- cgit From 4a7faaf4add3108522512c16c9ee08fb2ad4525e Mon Sep 17 00:00:00 2001 From: Jeremy Sowden Date: Wed, 1 Jan 2020 13:41:32 +0000 Subject: netfilter: nft_bitwise: correct uapi header comment. The comment documenting how bitwise expressions work includes a table which summarizes the mask and xor arguments combined to express the supported boolean operations. However, the row for OR: mask xor 0 x is incorrect. dreg = (sreg & 0) ^ x is not equivalent to: dreg = sreg | x What the code actually does is: dreg = (sreg & ~x) ^ x Update the documentation to match. Signed-off-by: Jeremy Sowden Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index e237ecbdcd8a..dd4611767933 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -501,7 +501,7 @@ enum nft_immediate_attributes { * * mask xor * NOT: 1 1 - * OR: 0 x + * OR: ~x x * XOR: 1 x * AND: x 0 */ -- cgit From 9d1f979986c2e29632b6a8f7a8ef8b3c7d24a48c Mon Sep 17 00:00:00 2001 From: Jeremy Sowden Date: Wed, 15 Jan 2020 20:05:51 +0000 Subject: netfilter: bitwise: add NFTA_BITWISE_OP netlink attribute. Add a new bitwise netlink attribute, NFTA_BITWISE_OP, which is set to a value of a new enum, nft_bitwise_ops. It describes the type of operation an expression contains. Currently, it only has one value: NFT_BITWISE_BOOL. More values will be added later to implement shifts. Signed-off-by: Jeremy Sowden Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index dd4611767933..0cddf357281f 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -484,6 +484,16 @@ enum nft_immediate_attributes { }; #define NFTA_IMMEDIATE_MAX (__NFTA_IMMEDIATE_MAX - 1) +/** + * enum nft_bitwise_ops - nf_tables bitwise operations + * + * @NFT_BITWISE_BOOL: mask-and-xor operation used to implement NOT, AND, OR and + * XOR boolean operations + */ +enum nft_bitwise_ops { + NFT_BITWISE_BOOL, +}; + /** * enum nft_bitwise_attributes - nf_tables bitwise expression netlink attributes * @@ -492,6 +502,7 @@ enum nft_immediate_attributes { * @NFTA_BITWISE_LEN: length of operands (NLA_U32) * @NFTA_BITWISE_MASK: mask value (NLA_NESTED: nft_data_attributes) * @NFTA_BITWISE_XOR: xor value (NLA_NESTED: nft_data_attributes) + * @NFTA_BITWISE_OP: type of operation (NLA_U32: nft_bitwise_ops) * * The bitwise expression performs the following operation: * @@ -512,6 +523,7 @@ enum nft_bitwise_attributes { NFTA_BITWISE_LEN, NFTA_BITWISE_MASK, NFTA_BITWISE_XOR, + NFTA_BITWISE_OP, __NFTA_BITWISE_MAX }; #define NFTA_BITWISE_MAX (__NFTA_BITWISE_MAX - 1) -- cgit From 779f725e142cd987a69296c0eb7d416ee3ba57dd Mon Sep 17 00:00:00 2001 From: Jeremy Sowden Date: Wed, 15 Jan 2020 20:05:56 +0000 Subject: netfilter: bitwise: add NFTA_BITWISE_DATA attribute. Add a new bitwise netlink attribute that will be used by shift operations to store the size of the shift. It is not used by boolean operations. Signed-off-by: Jeremy Sowden Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index 0cddf357281f..8bef0620bc4f 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -503,6 +503,8 @@ enum nft_bitwise_ops { * @NFTA_BITWISE_MASK: mask value (NLA_NESTED: nft_data_attributes) * @NFTA_BITWISE_XOR: xor value (NLA_NESTED: nft_data_attributes) * @NFTA_BITWISE_OP: type of operation (NLA_U32: nft_bitwise_ops) + * @NFTA_BITWISE_DATA: argument for non-boolean operations + * (NLA_NESTED: nft_data_attributes) * * The bitwise expression performs the following operation: * @@ -524,6 +526,7 @@ enum nft_bitwise_attributes { NFTA_BITWISE_MASK, NFTA_BITWISE_XOR, NFTA_BITWISE_OP, + NFTA_BITWISE_DATA, __NFTA_BITWISE_MAX }; #define NFTA_BITWISE_MAX (__NFTA_BITWISE_MAX - 1) -- cgit From 567d746b55bc66d3800c9ae91d50f0c5deb2fd93 Mon Sep 17 00:00:00 2001 From: Jeremy Sowden Date: Wed, 15 Jan 2020 20:05:57 +0000 Subject: netfilter: bitwise: add support for shifts. Hitherto nft_bitwise has only supported boolean operations: NOT, AND, OR and XOR. Extend it to do shifts as well. Signed-off-by: Jeremy Sowden Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index 8bef0620bc4f..261864736b26 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -489,9 +489,13 @@ enum nft_immediate_attributes { * * @NFT_BITWISE_BOOL: mask-and-xor operation used to implement NOT, AND, OR and * XOR boolean operations + * @NFT_BITWISE_LSHIFT: left-shift operation + * @NFT_BITWISE_RSHIFT: right-shift operation */ enum nft_bitwise_ops { NFT_BITWISE_BOOL, + NFT_BITWISE_LSHIFT, + NFT_BITWISE_RSHIFT, }; /** @@ -506,11 +510,12 @@ enum nft_bitwise_ops { * @NFTA_BITWISE_DATA: argument for non-boolean operations * (NLA_NESTED: nft_data_attributes) * - * The bitwise expression performs the following operation: + * The bitwise expression supports boolean and shift operations. It implements + * the boolean operations by performing the following operation: * * dreg = (sreg & mask) ^ xor * - * which allow to express all bitwise operations: + * with these mask and xor values: * * mask xor * NOT: 1 1 -- cgit From f362e5fe0f1f9774b28c5b4c9b07b0168575cf6e Mon Sep 17 00:00:00 2001 From: Martin Schiller Date: Tue, 21 Jan 2020 07:00:33 +0100 Subject: wan/hdlc_x25: make lapb params configurable This enables you to configure mode (DTE/DCE), Modulo, Window, T1, T2, N2 via sethdlc (which needs to be patched as well). Signed-off-by: Martin Schiller Signed-off-by: David S. Miller --- include/uapi/linux/hdlc/ioctl.h | 9 +++++++++ include/uapi/linux/if.h | 1 + 2 files changed, 10 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/hdlc/ioctl.h b/include/uapi/linux/hdlc/ioctl.h index 0fe4238e8246..b06341acab5e 100644 --- a/include/uapi/linux/hdlc/ioctl.h +++ b/include/uapi/linux/hdlc/ioctl.h @@ -79,6 +79,15 @@ typedef struct { unsigned int timeout; } cisco_proto; +typedef struct { + unsigned short dce; /* 1 for DCE (network side) operation */ + unsigned int modulo; /* modulo (8 = basic / 128 = extended) */ + unsigned int window; /* frame window size */ + unsigned int t1; /* timeout t1 */ + unsigned int t2; /* timeout t2 */ + unsigned int n2; /* frame retry counter */ +} x25_hdlc_proto; + /* PPP doesn't need any info now - supply length = 0 to ioctl */ #endif /* __ASSEMBLY__ */ diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h index 4bf33344aab1..be714cd8c826 100644 --- a/include/uapi/linux/if.h +++ b/include/uapi/linux/if.h @@ -213,6 +213,7 @@ struct if_settings { fr_proto __user *fr; fr_proto_pvc __user *fr_pvc; fr_proto_pvc_info __user *fr_pvc_info; + x25_hdlc_proto __user *x25; /* interface settings */ sync_serial_settings __user *sync; -- cgit From be8704ff07d2374bcc5c675526f95e70c6459683 Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Mon, 20 Jan 2020 16:53:46 -0800 Subject: bpf: Introduce dynamic program extensions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Introduce dynamic program extensions. The users can load additional BPF functions and replace global functions in previously loaded BPF programs while these programs are executing. Global functions are verified individually by the verifier based on their types only. Hence the global function in the new program which types match older function can safely replace that corresponding function. This new function/program is called 'an extension' of old program. At load time the verifier uses (attach_prog_fd, attach_btf_id) pair to identify the function to be replaced. The BPF program type is derived from the target program into extension program. Technically bpf_verifier_ops is copied from target program. The BPF_PROG_TYPE_EXT program type is a placeholder. It has empty verifier_ops. The extension program can call the same bpf helper functions as target program. Single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB and all other program types. The verifier allows only one level of replacement. Meaning that the extension program cannot recursively extend an extension. That also means that the maximum stack size is increasing from 512 to 1024 bytes and maximum function nesting level from 8 to 16. The programs don't always consume that much. The stack usage is determined by the number of on-stack variables used by the program. The verifier could have enforced 512 limit for combined original plus extension program, but it makes for difficult user experience. The main use case for extensions is to provide generic mechanism to plug external programs into policy program or function call chaining. BPF trampoline is used to track both fentry/fexit and program extensions because both are using the same nop slot at the beginning of every BPF function. Attaching fentry/fexit to a function that was replaced is not allowed. The opposite is true as well. Replacing a function that currently being analyzed with fentry/fexit is not allowed. The executable page allocated by BPF trampoline is not used by program extensions. This inefficiency will be optimized in future patches. Function by function verification of global function supports scalars and pointer to context only. Hence program extensions are supported for such class of global functions only. In the future the verifier will be extended with support to pointers to structures, arrays with sizes, etc. Signed-off-by: Alexei Starovoitov Signed-off-by: Daniel Borkmann Acked-by: John Fastabend Acked-by: Andrii Nakryiko Acked-by: Toke Høiland-Jørgensen Link: https://lore.kernel.org/bpf/20200121005348.2769920-2-ast@kernel.org --- include/uapi/linux/bpf.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 033d90a2282d..e81628eb059c 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -180,6 +180,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_CGROUP_SOCKOPT, BPF_PROG_TYPE_TRACING, BPF_PROG_TYPE_STRUCT_OPS, + BPF_PROG_TYPE_EXT, }; enum bpf_attach_type { -- cgit From 5576b991e9c1a11d2cc21c4b94fc75ec27603896 Mon Sep 17 00:00:00 2001 From: Martin KaFai Lau Date: Wed, 22 Jan 2020 15:36:46 -0800 Subject: bpf: Add BPF_FUNC_jiffies64 This patch adds a helper to read the 64bit jiffies. It will be used in a later patch to implement the bpf_cubic.c. The helper is inlined for jit_requested and 64 BITS_PER_LONG as the map_gen_lookup(). Other cases could be considered together with map_gen_lookup() if needed. Signed-off-by: Martin KaFai Lau Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20200122233646.903260-1-kafai@fb.com --- include/uapi/linux/bpf.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) (limited to 'include/uapi') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index e81628eb059c..f1d74a2bd234 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2886,6 +2886,12 @@ union bpf_attr { * **-EPERM** if no permission to send the *sig*. * * **-EAGAIN** if bpf program can try again. + * + * u64 bpf_jiffies64(void) + * Description + * Obtain the 64bit jiffies + * Return + * The 64 bit jiffies */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3005,7 +3011,8 @@ union bpf_attr { FN(probe_read_user_str), \ FN(probe_read_kernel_str), \ FN(tcp_send_ack), \ - FN(send_signal_thread), + FN(send_signal_thread), \ + FN(jiffies64), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call -- cgit From ec97ecf1ebe485a17cd8395a5f35e6b80b57665a Mon Sep 17 00:00:00 2001 From: "Mohit P. Tahiliani" Date: Wed, 22 Jan 2020 23:52:33 +0530 Subject: net: sched: add Flow Queue PIE packet scheduler Principles: - Packets are classified on flows. - This is a Stochastic model (as we use a hash, several flows might be hashed to the same slot) - Each flow has a PIE managed queue. - Flows are linked onto two (Round Robin) lists, so that new flows have priority on old ones. - For a given flow, packets are not reordered. - Drops during enqueue only. - ECN capability is off by default. - ECN threshold (if ECN is enabled) is at 10% by default. - Uses timestamps to calculate queue delay by default. Usage: tc qdisc ... fq_pie [ limit PACKETS ] [ flows NUMBER ] [ target TIME ] [ tupdate TIME ] [ alpha NUMBER ] [ beta NUMBER ] [ quantum BYTES ] [ memory_limit BYTES ] [ ecnprob PERCENTAGE ] [ [no]ecn ] [ [no]bytemode ] [ [no_]dq_rate_estimator ] defaults: limit: 10240 packets, flows: 1024 target: 15 ms, tupdate: 15 ms (in jiffies) alpha: 1/8, beta : 5/4 quantum: device MTU, memory_limit: 32 Mb ecnprob: 10%, ecn: off bytemode: off, dq_rate_estimator: off Signed-off-by: Mohit P. Tahiliani Signed-off-by: Sachin D. Patil Signed-off-by: V. Saicharan Signed-off-by: Mohit Bhasi Signed-off-by: Leslie Monis Signed-off-by: Gautam Ramakrishnan Signed-off-by: David S. Miller --- include/uapi/linux/pkt_sched.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index bf5a5b1dfb0b..bbe791b24168 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -971,6 +971,37 @@ struct tc_pie_xstats { __u32 ecn_mark; /* packets marked with ecn*/ }; +/* FQ PIE */ +enum { + TCA_FQ_PIE_UNSPEC, + TCA_FQ_PIE_LIMIT, + TCA_FQ_PIE_FLOWS, + TCA_FQ_PIE_TARGET, + TCA_FQ_PIE_TUPDATE, + TCA_FQ_PIE_ALPHA, + TCA_FQ_PIE_BETA, + TCA_FQ_PIE_QUANTUM, + TCA_FQ_PIE_MEMORY_LIMIT, + TCA_FQ_PIE_ECN_PROB, + TCA_FQ_PIE_ECN, + TCA_FQ_PIE_BYTEMODE, + TCA_FQ_PIE_DQ_RATE_ESTIMATOR, + __TCA_FQ_PIE_MAX +}; +#define TCA_FQ_PIE_MAX (__TCA_FQ_PIE_MAX - 1) + +struct tc_fq_pie_xstats { + __u32 packets_in; /* total number of packets enqueued */ + __u32 dropped; /* packets dropped due to fq_pie_action */ + __u32 overlimit; /* dropped due to lack of space in queue */ + __u32 overmemory; /* dropped due to lack of memory in queue */ + __u32 ecn_mark; /* packets marked with ecn */ + __u32 new_flow_count; /* count of new flows created by packets */ + __u32 new_flows_len; /* count of flows in new list */ + __u32 old_flows_len; /* count of flows in old list */ + __u32 memory_usage; /* total memory across all queues */ +}; + /* CBS */ struct tc_cbs_qopt { __u8 offload; -- cgit From a5d29ae226812ae620f58a0726525d8af9b1d751 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Fri, 24 Jan 2020 13:40:21 +0200 Subject: net: bridge: vlan: add basic option setting support This patch adds support for option modification of single vlans and ranges. It allows to only modify options, i.e. skip create/delete by using the BRIDGE_VLAN_INFO_ONLY_OPTS flag. When working with a range option changes we try to pack the notifications as much as possible. v2: do full port (all vlans) notification only when creating/deleting vlans for compatibility, rework the range detection when changing options, add more verbose extack errors and check if a vlan should be used (br_vlan_should_use checks) Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- include/uapi/linux/if_bridge.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index ac38f0b674b8..c8b87ad52c5b 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -130,6 +130,7 @@ enum { #define BRIDGE_VLAN_INFO_RANGE_BEGIN (1<<3) /* VLAN is start of vlan range */ #define BRIDGE_VLAN_INFO_RANGE_END (1<<4) /* VLAN is end of vlan range */ #define BRIDGE_VLAN_INFO_BRENTRY (1<<5) /* Global bridge VLAN entry */ +#define BRIDGE_VLAN_INFO_ONLY_OPTS (1<<6) /* Skip create/delete/flags */ struct bridge_vlan_info { __u16 flags; -- cgit From a580c76d534c7360ba68042b19cb255e8420e987 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Fri, 24 Jan 2020 13:40:22 +0200 Subject: net: bridge: vlan: add per-vlan state The first per-vlan option added is state, it is needed for EVPN and for per-vlan STP. The state allows to control the forwarding on per-vlan basis. The vlan state is considered only if the port state is forwarding in order to avoid conflicts and be consistent. br_allowed_egress is called only when the state is forwarding, but the ingress case is a bit more complicated due to the fact that we may have the transition between port:BR_STATE_FORWARDING -> vlan:BR_STATE_LEARNING which should still allow the bridge to learn from the packet after vlan filtering and it will be dropped after that. Also to optimize the pvid state check we keep a copy in the vlan group to avoid one lookup. The state members are modified with *_ONCE() to annotate the lockless access. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- include/uapi/linux/if_bridge.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index c8b87ad52c5b..42f7ca38ad80 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -191,6 +191,7 @@ enum { BRIDGE_VLANDB_ENTRY_UNSPEC, BRIDGE_VLANDB_ENTRY_INFO, BRIDGE_VLANDB_ENTRY_RANGE, + BRIDGE_VLANDB_ENTRY_STATE, __BRIDGE_VLANDB_ENTRY_MAX, }; #define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1) -- cgit From 32efcc06d2a15fa87585614d12d6c2308cc2d3f3 Mon Sep 17 00:00:00 2001 From: Abdul Kabbani Date: Fri, 24 Jan 2020 16:34:02 -0500 Subject: tcp: export count for rehash attempts Using IPv6 flow-label to swiftly route around avoid congested or disconnected network path can greatly improve TCP reliability. This patch adds SNMP counters and a OPT_STATS counter to track both host-level and connection-level statistics. Network administrators can use these counters to evaluate the impact of this new ability better. Export count for rehash attempts to 1) two SNMP counters: TcpTimeoutRehash (rehash due to timeouts), and TcpDuplicateDataRehash (rehash due to receiving duplicate packets) 2) Timestamping API SOF_TIMESTAMPING_OPT_STATS. Signed-off-by: Abdul Kabbani Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Kevin(Yudong) Yang Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- include/uapi/linux/snmp.h | 2 ++ include/uapi/linux/tcp.h | 1 + 2 files changed, 3 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index 7eee233e78d2..7d91f4debc48 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -285,6 +285,8 @@ enum LINUX_MIB_TCPRCVQDROP, /* TCPRcvQDrop */ LINUX_MIB_TCPWQUEUETOOBIG, /* TCPWqueueTooBig */ LINUX_MIB_TCPFASTOPENPASSIVEALTKEY, /* TCPFastOpenPassiveAltKey */ + LINUX_MIB_TCPTIMEOUTREHASH, /* TCPTimeoutRehash */ + LINUX_MIB_TCPDUPLICATEDATAREHASH, /* TCPDuplicateDataRehash */ __LINUX_MIB_MAX }; diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index d87184e673ca..fd9eb8f6bcae 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -311,6 +311,7 @@ enum { TCP_NLA_DSACK_DUPS, /* DSACK blocks received */ TCP_NLA_REORD_SEEN, /* reordering events seen */ TCP_NLA_SRTT, /* smoothed RTT in usecs */ + TCP_NLA_TIMEOUT_REHASH, /* Timeout-triggered rehash attempts */ }; /* for TCP_MD5SIG socket option */ -- cgit From 7b225d0b5c6dda5fefab578175f210c6fc7e389a Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Wed, 22 Jan 2020 00:17:52 +0100 Subject: netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the interval between kernel and userspace. This patch also adds the NFT_SET_EXT_KEY_END extension to store the closing element value in this interval. v4: No changes v3: New patch [sbrivio: refactor error paths and labels; add corresponding nft_set_ext_type for new key; rebase] Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index 261864736b26..c13106496bd2 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -370,6 +370,7 @@ enum nft_set_elem_flags { * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_NESTED: nft_data) */ enum nft_set_elem_attributes { NFTA_SET_ELEM_UNSPEC, @@ -382,6 +383,7 @@ enum nft_set_elem_attributes { NFTA_SET_ELEM_EXPR, NFTA_SET_ELEM_PAD, NFTA_SET_ELEM_OBJREF, + NFTA_SET_ELEM_KEY_END, __NFTA_SET_ELEM_MAX }; #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) -- cgit From f3a2181e16f1dcbf5446ed43f6b5d9f56c459f85 Mon Sep 17 00:00:00 2001 From: Stefano Brivio Date: Wed, 22 Jan 2020 00:17:53 +0100 Subject: netfilter: nf_tables: Support for sets with multiple ranged fields Introduce a new nested netlink attribute, NFTA_SET_DESC_CONCAT, used to specify the length of each field in a set concatenation. This allows set implementations to support concatenation of multiple ranged items, as they can divide the input key into matching data for every single field. Such set implementations would be selected as they specify support for NFT_SET_INTERVAL and allow desc->field_count to be greater than one. Explicitly disallow this for nft_set_rbtree. In order to specify the interval for a set entry, userspace would include in NFTA_SET_DESC_CONCAT attributes field lengths, and pass range endpoints as two separate keys, represented by attributes NFTA_SET_ELEM_KEY and NFTA_SET_ELEM_KEY_END. While at it, export the number of 32-bit registers available for packet matching, as nftables will need this to know the maximum number of field lengths that can be specified. For example, "packets with an IPv4 address between 192.0.2.0 and 192.0.2.42, with destination port between 22 and 25", can be expressed as two concatenated elements: NFTA_SET_ELEM_KEY: 192.0.2.0 . 22 NFTA_SET_ELEM_KEY_END: 192.0.2.42 . 25 and NFTA_SET_DESC_CONCAT attribute would contain: NFTA_LIST_ELEM NFTA_SET_FIELD_LEN: 4 NFTA_LIST_ELEM NFTA_SET_FIELD_LEN: 2 v4: No changes v3: Complete rework, NFTA_SET_DESC_CONCAT instead of NFTA_SET_SUBKEY v2: No changes Signed-off-by: Stefano Brivio Signed-off-by: Pablo Neira Ayuso --- include/uapi/linux/netfilter/nf_tables.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index c13106496bd2..065218a20bb7 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -48,6 +48,7 @@ enum nft_registers { #define NFT_REG_SIZE 16 #define NFT_REG32_SIZE 4 +#define NFT_REG32_COUNT (NFT_REG32_15 - NFT_REG32_00 + 1) /** * enum nft_verdicts - nf_tables internal verdicts @@ -301,14 +302,28 @@ enum nft_set_policies { * enum nft_set_desc_attributes - set element description * * @NFTA_SET_DESC_SIZE: number of elements in set (NLA_U32) + * @NFTA_SET_DESC_CONCAT: description of field concatenation (NLA_NESTED) */ enum nft_set_desc_attributes { NFTA_SET_DESC_UNSPEC, NFTA_SET_DESC_SIZE, + NFTA_SET_DESC_CONCAT, __NFTA_SET_DESC_MAX }; #define NFTA_SET_DESC_MAX (__NFTA_SET_DESC_MAX - 1) +/** + * enum nft_set_field_attributes - attributes of concatenated fields + * + * @NFTA_SET_FIELD_LEN: length of single field, in bits (NLA_U32) + */ +enum nft_set_field_attributes { + NFTA_SET_FIELD_UNSPEC, + NFTA_SET_FIELD_LEN, + __NFTA_SET_FIELD_MAX +}; +#define NFTA_SET_FIELD_MAX (__NFTA_SET_FIELD_MAX - 1) + /** * enum nft_set_attributes - nf_tables set netlink attributes * -- cgit From 6a94b8ccf6b77f005ab1b36a878e1d81df0c033e Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:04 +0100 Subject: ethtool: provide message mask with DEBUG_GET request Implement DEBUG_GET request to get debugging settings for a device. At the moment, only message mask corresponding to message level as reported by ETHTOOL_GMSGLVL ioctl request is provided. (It is called message level in ioctl interface but almost all drivers interpret it as a bit mask.) As part of the implementation, provide symbolic names for message mask bits as ETH_SS_MSG_CLASSES string set. Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- include/uapi/linux/ethtool.h | 2 ++ include/uapi/linux/ethtool_netlink.h | 14 ++++++++++++++ 2 files changed, 16 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index 116bcbf09c74..456fb2aa0fad 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -594,6 +594,7 @@ struct ethtool_pauseparam { * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS * @ETH_SS_PHY_TUNABLES: PHY tunable names * @ETH_SS_LINK_MODES: link mode names + * @ETH_SS_MSG_CLASSES: debug message class names */ enum ethtool_stringset { ETH_SS_TEST = 0, @@ -606,6 +607,7 @@ enum ethtool_stringset { ETH_SS_PHY_STATS, ETH_SS_PHY_TUNABLES, ETH_SS_LINK_MODES, + ETH_SS_MSG_CLASSES, /* add new constants above here */ ETH_SS_COUNT diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 02f82f42a889..0d39e04567cb 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -20,6 +20,7 @@ enum { ETHTOOL_MSG_LINKMODES_GET, ETHTOOL_MSG_LINKMODES_SET, ETHTOOL_MSG_LINKSTATE_GET, + ETHTOOL_MSG_DEBUG_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -35,6 +36,7 @@ enum { ETHTOOL_MSG_LINKMODES_GET_REPLY, ETHTOOL_MSG_LINKMODES_NTF, ETHTOOL_MSG_LINKSTATE_GET_REPLY, + ETHTOOL_MSG_DEBUG_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -195,6 +197,18 @@ enum { ETHTOOL_A_LINKSTATE_MAX = __ETHTOOL_A_LINKSTATE_CNT - 1 }; +/* DEBUG */ + +enum { + ETHTOOL_A_DEBUG_UNSPEC, + ETHTOOL_A_DEBUG_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_DEBUG_MSGMASK, /* bitset */ + + /* add new constants above here */ + __ETHTOOL_A_DEBUG_CNT, + ETHTOOL_A_DEBUG_MAX = __ETHTOOL_A_DEBUG_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From e54d04e3afead22d8e7d6edaaac487a1205bac39 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:07 +0100 Subject: ethtool: set message mask with DEBUG_SET request Implement DEBUG_SET netlink request to set debugging settings for a device. At the moment, only message mask corresponding to message level as set by ETHTOOL_SMSGLVL ioctl request can be set. (It is called message level in ioctl interface but almost all drivers interpret it as a bit mask.) Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 0d39e04567cb..8f0b6fd277d5 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -21,6 +21,7 @@ enum { ETHTOOL_MSG_LINKMODES_SET, ETHTOOL_MSG_LINKSTATE_GET, ETHTOOL_MSG_DEBUG_GET, + ETHTOOL_MSG_DEBUG_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, -- cgit From 0bda7af39d2ba6ef83bd18e26bcb027f1630004e Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:10 +0100 Subject: ethtool: add DEBUG_NTF notification Send ETHTOOL_MSG_DEBUG_NTF notification message whenever debugging message mask for a device are modified using ETHTOOL_MSG_DEBUG_SET netlink message or ETHTOOL_SMSGLVL ioctl request. The notification message has the same format as reply to DEBUG_GET request. As with other ethtool notifications, netlink requests only trigger the notification if the mask is actually changed while ioctl request trigger it whenever the request results in calling the ethtool_ops handler. Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 8f0b6fd277d5..67a06b94bf28 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -38,6 +38,7 @@ enum { ETHTOOL_MSG_LINKMODES_NTF, ETHTOOL_MSG_LINKSTATE_GET_REPLY, ETHTOOL_MSG_DEBUG_GET_REPLY, + ETHTOOL_MSG_DEBUG_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, -- cgit From 51ea22b04ea0c210f9ce87b8a600965dbe476bc2 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:13 +0100 Subject: ethtool: provide WoL settings with WOL_GET request Implement WOL_GET request to get wake-on-lan settings for a device, traditionally available via ETHTOOL_GWOL ioctl request. As part of the implementation, provide symbolic names for wake-on-line modes as ETH_SS_WOL_MODES string set. Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- include/uapi/linux/ethtool.h | 4 ++++ include/uapi/linux/ethtool_netlink.h | 15 +++++++++++++++ 2 files changed, 19 insertions(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index 456fb2aa0fad..4295ebfa2f91 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -595,6 +595,7 @@ struct ethtool_pauseparam { * @ETH_SS_PHY_TUNABLES: PHY tunable names * @ETH_SS_LINK_MODES: link mode names * @ETH_SS_MSG_CLASSES: debug message class names + * @ETH_SS_WOL_MODES: wake-on-lan modes */ enum ethtool_stringset { ETH_SS_TEST = 0, @@ -608,6 +609,7 @@ enum ethtool_stringset { ETH_SS_PHY_TUNABLES, ETH_SS_LINK_MODES, ETH_SS_MSG_CLASSES, + ETH_SS_WOL_MODES, /* add new constants above here */ ETH_SS_COUNT @@ -1695,6 +1697,8 @@ static inline int ethtool_validate_duplex(__u8 duplex) #define WAKE_MAGICSECURE (1 << 6) /* only meaningful if WAKE_MAGIC */ #define WAKE_FILTER (1 << 7) +#define WOL_MODE_COUNT 8 + /* L2-L4 network traffic flow types */ #define TCP_V4_FLOW 0x01 /* hash or spec (tcp_ip4_spec) */ #define UDP_V4_FLOW 0x02 /* hash or spec (udp_ip4_spec) */ diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 67a06b94bf28..dcc5c32dc018 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -22,6 +22,7 @@ enum { ETHTOOL_MSG_LINKSTATE_GET, ETHTOOL_MSG_DEBUG_GET, ETHTOOL_MSG_DEBUG_SET, + ETHTOOL_MSG_WOL_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -39,6 +40,7 @@ enum { ETHTOOL_MSG_LINKSTATE_GET_REPLY, ETHTOOL_MSG_DEBUG_GET_REPLY, ETHTOOL_MSG_DEBUG_NTF, + ETHTOOL_MSG_WOL_GET_REPLY, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -211,6 +213,19 @@ enum { ETHTOOL_A_DEBUG_MAX = __ETHTOOL_A_DEBUG_CNT - 1 }; +/* WOL */ + +enum { + ETHTOOL_A_WOL_UNSPEC, + ETHTOOL_A_WOL_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_WOL_MODES, /* bitset */ + ETHTOOL_A_WOL_SOPASS, /* binary */ + + /* add new constants above here */ + __ETHTOOL_A_WOL_CNT, + ETHTOOL_A_WOL_MAX = __ETHTOOL_A_WOL_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 -- cgit From 8d425b19b305a77cb1ba67b84e7c1ca547de032f Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:16 +0100 Subject: ethtool: set wake-on-lan settings with WOL_SET request Implement WOL_SET netlink request to set wake-on-lan settings. This is equivalent to ETHTOOL_SWOL ioctl request. Signed-off-by: Michal Kubecek Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index dcc5c32dc018..59de35695521 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -23,6 +23,7 @@ enum { ETHTOOL_MSG_DEBUG_GET, ETHTOOL_MSG_DEBUG_SET, ETHTOOL_MSG_WOL_GET, + ETHTOOL_MSG_WOL_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, -- cgit From 67bffa79231f15ba372007972563cc8aa5e48bfa Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Sun, 26 Jan 2020 23:11:19 +0100 Subject: ethtool: add WOL_NTF notification Send ETHTOOL_MSG_WOL_NTF notification whenever wake-on-lan settings of a device are modified using ETHTOOL_MSG_WOL_SET netlink message or ETHTOOL_SWOL ioctl request. As notifications can be received by anyone, do not include SecureOn(tm) password in notification messages. Signed-off-by: Michal Kubecek Signed-off-by: David S. Miller --- include/uapi/linux/ethtool_netlink.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi') diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 59de35695521..7e0b460f872c 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -42,6 +42,7 @@ enum { ETHTOOL_MSG_DEBUG_GET_REPLY, ETHTOOL_MSG_DEBUG_NTF, ETHTOOL_MSG_WOL_GET_REPLY, + ETHTOOL_MSG_WOL_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, -- cgit