summaryrefslogtreecommitdiff
path: root/include/trace
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2021-04-29 11:57:23 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2021-04-29 11:57:23 -0700
commit9d31d2338950293ec19d9b095fbaa9030899dcb4 (patch)
treee688040d0557c24a2eeb9f6c9c223d949f6f7ef9 /include/trace
parent635de956a7f5a6ffcb04f29d70630c64c717b56b (diff)
parent4a52dd8fefb45626dace70a63c0738dbd83b7edb (diff)
Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski: "Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes" * tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits) net: selftest: fix build issue if INET is disabled net: netrom: nr_in: Remove redundant assignment to ns net: tun: Remove redundant assignment to ret net: phy: marvell: add downshift support for M88E1240 net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255 net/sched: act_ct: Remove redundant ct get and check icmp: standardize naming of RFC 8335 PROBE constants bpf, selftests: Update array map tests for per-cpu batched ops bpf: Add batched ops support for percpu array bpf: Implement formatted output helpers with bstr_printf seq_file: Add a seq_bprintf function sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues net:nfc:digital: Fix a double free in digital_tg_recv_dep_req net: fix a concurrency bug in l2tp_tunnel_register() net/smc: Remove redundant assignment to rc mpls: Remove redundant assignment to err llc2: Remove redundant assignment to rc net/tls: Remove redundant initialization of record rds: Remove redundant assignment to nr_sig dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0 ...
Diffstat (limited to 'include/trace')
-rw-r--r--include/trace/events/mptcp.h173
-rw-r--r--include/trace/events/xdp.h62
2 files changed, 208 insertions, 27 deletions
diff --git a/include/trace/events/mptcp.h b/include/trace/events/mptcp.h
new file mode 100644
index 000000000000..775a46d0b0f0
--- /dev/null
+++ b/include/trace/events/mptcp.h
@@ -0,0 +1,173 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mptcp
+
+#if !defined(_TRACE_MPTCP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_MPTCP_H
+
+#include <linux/tracepoint.h>
+
+#define show_mapping_status(status) \
+ __print_symbolic(status, \
+ { 0, "MAPPING_OK" }, \
+ { 1, "MAPPING_INVALID" }, \
+ { 2, "MAPPING_EMPTY" }, \
+ { 3, "MAPPING_DATA_FIN" }, \
+ { 4, "MAPPING_DUMMY" })
+
+TRACE_EVENT(mptcp_subflow_get_send,
+
+ TP_PROTO(struct mptcp_subflow_context *subflow),
+
+ TP_ARGS(subflow),
+
+ TP_STRUCT__entry(
+ __field(bool, active)
+ __field(bool, free)
+ __field(u32, snd_wnd)
+ __field(u32, pace)
+ __field(u8, backup)
+ __field(u64, ratio)
+ ),
+
+ TP_fast_assign(
+ struct sock *ssk;
+
+ __entry->active = mptcp_subflow_active(subflow);
+ __entry->backup = subflow->backup;
+
+ if (subflow->tcp_sock && sk_fullsock(subflow->tcp_sock))
+ __entry->free = sk_stream_memory_free(subflow->tcp_sock);
+ else
+ __entry->free = 0;
+
+ ssk = mptcp_subflow_tcp_sock(subflow);
+ if (ssk && sk_fullsock(ssk)) {
+ __entry->snd_wnd = tcp_sk(ssk)->snd_wnd;
+ __entry->pace = ssk->sk_pacing_rate;
+ } else {
+ __entry->snd_wnd = 0;
+ __entry->pace = 0;
+ }
+
+ if (ssk && sk_fullsock(ssk) && __entry->pace)
+ __entry->ratio = div_u64((u64)ssk->sk_wmem_queued << 32, __entry->pace);
+ else
+ __entry->ratio = 0;
+ ),
+
+ TP_printk("active=%d free=%d snd_wnd=%u pace=%u backup=%u ratio=%llu",
+ __entry->active, __entry->free,
+ __entry->snd_wnd, __entry->pace,
+ __entry->backup, __entry->ratio)
+);
+
+DECLARE_EVENT_CLASS(mptcp_dump_mpext,
+
+ TP_PROTO(struct mptcp_ext *mpext),
+
+ TP_ARGS(mpext),
+
+ TP_STRUCT__entry(
+ __field(u64, data_ack)
+ __field(u64, data_seq)
+ __field(u32, subflow_seq)
+ __field(u16, data_len)
+ __field(u8, use_map)
+ __field(u8, dsn64)
+ __field(u8, data_fin)
+ __field(u8, use_ack)
+ __field(u8, ack64)
+ __field(u8, mpc_map)
+ __field(u8, frozen)
+ __field(u8, reset_transient)
+ __field(u8, reset_reason)
+ ),
+
+ TP_fast_assign(
+ __entry->data_ack = mpext->ack64 ? mpext->data_ack : mpext->data_ack32;
+ __entry->data_seq = mpext->data_seq;
+ __entry->subflow_seq = mpext->subflow_seq;
+ __entry->data_len = mpext->data_len;
+ __entry->use_map = mpext->use_map;
+ __entry->dsn64 = mpext->dsn64;
+ __entry->data_fin = mpext->data_fin;
+ __entry->use_ack = mpext->use_ack;
+ __entry->ack64 = mpext->ack64;
+ __entry->mpc_map = mpext->mpc_map;
+ __entry->frozen = mpext->frozen;
+ __entry->reset_transient = mpext->reset_transient;
+ __entry->reset_reason = mpext->reset_reason;
+ ),
+
+ TP_printk("data_ack=%llu data_seq=%llu subflow_seq=%u data_len=%u use_map=%u dsn64=%u data_fin=%u use_ack=%u ack64=%u mpc_map=%u frozen=%u reset_transient=%u reset_reason=%u",
+ __entry->data_ack, __entry->data_seq,
+ __entry->subflow_seq, __entry->data_len,
+ __entry->use_map, __entry->dsn64,
+ __entry->data_fin, __entry->use_ack,
+ __entry->ack64, __entry->mpc_map,
+ __entry->frozen, __entry->reset_transient,
+ __entry->reset_reason)
+);
+
+DEFINE_EVENT(mptcp_dump_mpext, get_mapping_status,
+ TP_PROTO(struct mptcp_ext *mpext),
+ TP_ARGS(mpext));
+
+TRACE_EVENT(ack_update_msk,
+
+ TP_PROTO(u64 data_ack, u64 old_snd_una,
+ u64 new_snd_una, u64 new_wnd_end,
+ u64 msk_wnd_end),
+
+ TP_ARGS(data_ack, old_snd_una,
+ new_snd_una, new_wnd_end,
+ msk_wnd_end),
+
+ TP_STRUCT__entry(
+ __field(u64, data_ack)
+ __field(u64, old_snd_una)
+ __field(u64, new_snd_una)
+ __field(u64, new_wnd_end)
+ __field(u64, msk_wnd_end)
+ ),
+
+ TP_fast_assign(
+ __entry->data_ack = data_ack;
+ __entry->old_snd_una = old_snd_una;
+ __entry->new_snd_una = new_snd_una;
+ __entry->new_wnd_end = new_wnd_end;
+ __entry->msk_wnd_end = msk_wnd_end;
+ ),
+
+ TP_printk("data_ack=%llu old_snd_una=%llu new_snd_una=%llu new_wnd_end=%llu msk_wnd_end=%llu",
+ __entry->data_ack, __entry->old_snd_una,
+ __entry->new_snd_una, __entry->new_wnd_end,
+ __entry->msk_wnd_end)
+);
+
+TRACE_EVENT(subflow_check_data_avail,
+
+ TP_PROTO(__u8 status, struct sk_buff *skb),
+
+ TP_ARGS(status, skb),
+
+ TP_STRUCT__entry(
+ __field(u8, status)
+ __field(const void *, skb)
+ ),
+
+ TP_fast_assign(
+ __entry->status = status;
+ __entry->skb = skb;
+ ),
+
+ TP_printk("mapping_status=%s, skb=%p",
+ show_mapping_status(__entry->status),
+ __entry->skb)
+);
+
+#endif /* _TRACE_MPTCP_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h
index 76a97176ab81..fcad3645a70b 100644
--- a/include/trace/events/xdp.h
+++ b/include/trace/events/xdp.h
@@ -86,19 +86,15 @@ struct _bpf_dtab_netdev {
};
#endif /* __DEVMAP_OBJ_TYPE */
-#define devmap_ifindex(tgt, map) \
- (((map->map_type == BPF_MAP_TYPE_DEVMAP || \
- map->map_type == BPF_MAP_TYPE_DEVMAP_HASH)) ? \
- ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex : 0)
-
DECLARE_EVENT_CLASS(xdp_redirect_template,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp,
const void *tgt, int err,
- const struct bpf_map *map, u32 index),
+ enum bpf_map_type map_type,
+ u32 map_id, u32 index),
- TP_ARGS(dev, xdp, tgt, err, map, index),
+ TP_ARGS(dev, xdp, tgt, err, map_type, map_id, index),
TP_STRUCT__entry(
__field(int, prog_id)
@@ -111,14 +107,22 @@ DECLARE_EVENT_CLASS(xdp_redirect_template,
),
TP_fast_assign(
+ u32 ifindex = 0, map_index = index;
+
+ if (map_type == BPF_MAP_TYPE_DEVMAP || map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
+ ifindex = ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex;
+ } else if (map_type == BPF_MAP_TYPE_UNSPEC && map_id == INT_MAX) {
+ ifindex = index;
+ map_index = 0;
+ }
+
__entry->prog_id = xdp->aux->id;
__entry->act = XDP_REDIRECT;
__entry->ifindex = dev->ifindex;
__entry->err = err;
- __entry->to_ifindex = map ? devmap_ifindex(tgt, map) :
- index;
- __entry->map_id = map ? map->id : 0;
- __entry->map_index = map ? index : 0;
+ __entry->to_ifindex = ifindex;
+ __entry->map_id = map_id;
+ __entry->map_index = map_index;
),
TP_printk("prog_id=%d action=%s ifindex=%d to_ifindex=%d err=%d"
@@ -133,45 +137,49 @@ DEFINE_EVENT(xdp_redirect_template, xdp_redirect,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp,
const void *tgt, int err,
- const struct bpf_map *map, u32 index),
- TP_ARGS(dev, xdp, tgt, err, map, index)
+ enum bpf_map_type map_type,
+ u32 map_id, u32 index),
+ TP_ARGS(dev, xdp, tgt, err, map_type, map_id, index)
);
DEFINE_EVENT(xdp_redirect_template, xdp_redirect_err,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp,
const void *tgt, int err,
- const struct bpf_map *map, u32 index),
- TP_ARGS(dev, xdp, tgt, err, map, index)
+ enum bpf_map_type map_type,
+ u32 map_id, u32 index),
+ TP_ARGS(dev, xdp, tgt, err, map_type, map_id, index)
);
-#define _trace_xdp_redirect(dev, xdp, to) \
- trace_xdp_redirect(dev, xdp, NULL, 0, NULL, to)
+#define _trace_xdp_redirect(dev, xdp, to) \
+ trace_xdp_redirect(dev, xdp, NULL, 0, BPF_MAP_TYPE_UNSPEC, INT_MAX, to)
-#define _trace_xdp_redirect_err(dev, xdp, to, err) \
- trace_xdp_redirect_err(dev, xdp, NULL, err, NULL, to)
+#define _trace_xdp_redirect_err(dev, xdp, to, err) \
+ trace_xdp_redirect_err(dev, xdp, NULL, err, BPF_MAP_TYPE_UNSPEC, INT_MAX, to)
-#define _trace_xdp_redirect_map(dev, xdp, to, map, index) \
- trace_xdp_redirect(dev, xdp, to, 0, map, index)
+#define _trace_xdp_redirect_map(dev, xdp, to, map_type, map_id, index) \
+ trace_xdp_redirect(dev, xdp, to, 0, map_type, map_id, index)
-#define _trace_xdp_redirect_map_err(dev, xdp, to, map, index, err) \
- trace_xdp_redirect_err(dev, xdp, to, err, map, index)
+#define _trace_xdp_redirect_map_err(dev, xdp, to, map_type, map_id, index, err) \
+ trace_xdp_redirect_err(dev, xdp, to, err, map_type, map_id, index)
/* not used anymore, but kept around so as not to break old programs */
DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp,
const void *tgt, int err,
- const struct bpf_map *map, u32 index),
- TP_ARGS(dev, xdp, tgt, err, map, index)
+ enum bpf_map_type map_type,
+ u32 map_id, u32 index),
+ TP_ARGS(dev, xdp, tgt, err, map_type, map_id, index)
);
DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map_err,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp,
const void *tgt, int err,
- const struct bpf_map *map, u32 index),
- TP_ARGS(dev, xdp, tgt, err, map, index)
+ enum bpf_map_type map_type,
+ u32 map_id, u32 index),
+ TP_ARGS(dev, xdp, tgt, err, map_type, map_id, index)
);
TRACE_EVENT(xdp_cpumap_kthread,