diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-14 19:42:24 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-14 19:42:24 -0700 |
commit | 1b294a1f35616977caddaddf3e9d28e576a1adbc (patch) | |
tree | 723a406740083006b8f8724b5c5e532d4efa431d /drivers/net/virtio_net.c | |
parent | b850dc206a57ae272c639e31ac202ec0c2f46960 (diff) | |
parent | 654de42f3fc6edc29d743c1dbcd1424f7793f63d (diff) |
Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Complete rework of garbage collection of AF_UNIX sockets.
AF_UNIX is prone to forming reference count cycles due to fd
passing functionality. New method based on Tarjan's Strongly
Connected Components algorithm should be both faster and remove a
lot of workarounds we accumulated over the years.
- Add TCP fraglist GRO support, allowing chaining multiple TCP
packets and forwarding them together. Useful for small switches /
routers which lack basic checksum offload in some scenarios (e.g.
PPPoE).
- Support using SMP threads for handling packet backlog i.e. packet
processing from software interfaces and old drivers which don't use
NAPI. This helps move the processing out of the softirq jumble.
- Continue work of converting from rtnl lock to RCU protection.
Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
address labels, netdev threaded NAPI sysfs files, bonding driver's
sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
of the link information available via rtnetlink.
- Small optimizations from Eric to UDP wake up handling, memory
accounting, RPS/RFS implementation, TCP packet sizing etc.
- Allow direct page recycling in the bulk API used by XDP, for +2%
PPS.
- Support peek with an offset on TCP sockets.
- Add MPTCP APIs for querying last time packets were received/sent/acked
and whether MPTCP "upgrade" succeeded on a TCP socket.
- Add intra-node communication shortcut to improve SMC performance.
- Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
driver.
- Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
- Add reset reasons for tracing what caused a TCP reset to be sent.
- Introduce direction attribute for xfrm (IPSec) states. State can be
used either for input or output packet processing.
Things we sprinkled into general kernel code:
- Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
This required touch-ups and renaming of a few existing users.
- Add Endian-dependent __counted_by_{le,be} annotations.
- Make building selftests "quieter" by printing summaries like
"CC object.o" rather than full commands with all the arguments.
Netfilter:
- Use GFP_KERNEL to clone elements, to deal better with OOM
situations and avoid failures in the .commit step.
BPF:
- Add eBPF JIT for ARCv2 CPUs.
- Support attaching kprobe BPF programs through kprobe_multi link in
a session mode, meaning, a BPF program is attached to both function
entry and return, the entry program can decide if the return
program gets executed and the entry program can share u64 cookie
value with return program. "Session mode" is a common use-case for
tetragon and bpftrace.
- Add the ability to specify and retrieve BPF cookie for raw
tracepoint programs in order to ease migration from classic to raw
tracepoints.
- Add an internal-only BPF per-CPU instruction for resolving per-CPU
memory addresses and implement support in x86, ARM64 and RISC-V
JITs. This allows inlining functions which need to access per-CPU
state.
- Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
atomics in bpf_arena which can be JITed as a single x86
instruction. Support BPF arena on ARM64.
- Add a new bpf_wq API for deferring events and refactor
process-context bpf_timer code to keep common code where possible.
- Harden the BPF verifier's and/or/xor value tracking.
- Introduce crypto kfuncs to let BPF programs call kernel crypto
APIs.
- Support bpf_tail_call_static() helper for BPF programs with GCC 13.
- Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
program to have code sections where preemption is disabled.
Driver API:
- Skip software TC processing completely if all installed rules are
marked as HW-only, instead of checking the HW-only flag rule by
rule.
- Add support for configuring PoE (Power over Ethernet), similar to
the already existing support for PoDL (Power over Data Line)
config.
- Initial bits of a queue control API, for now allowing a single
queue to be reset without disturbing packet flow to other queues.
- Common (ethtool) statistics for hardware timestamping.
Tests and tooling:
- Remove the need to create a config file to run the net forwarding
tests so that a naive "make run_tests" can exercise them.
- Define a method of writing tests which require an external endpoint
to communicate with (to send/receive data towards the test
machine). Add a few such tests.
- Create a shared code library for writing Python tests. Expose the
YAML Netlink library from tools/ to the tests for easy Netlink
access.
- Move netfilter tests under net/, extend them, separate performance
tests from correctness tests, and iron out issues found by running
them "on every commit".
- Refactor BPF selftests to use common network helpers.
- Further work filling in YAML definitions of Netlink messages for:
nftables, team driver, bonding interfaces, vlan interfaces, VF
info, TC u32 mark, TC police action.
- Teach Python YAML Netlink to decode attribute policies.
- Extend the definition of the "indexed array" construct in the specs
to cover arrays of scalars rather than just nests.
- Add hyperlinks between definitions in generated Netlink docs.
Drivers:
- Make sure unsupported flower control flags are rejected by drivers,
and make more drivers report errors directly to the application
rather than dmesg (large number of driver changes from Asbjørn
Sloth Tønnesen).
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support multiple RSS contexts and steering traffic to them
- support XDP metadata
- make page pool allocations more NUMA aware
- Intel (100G, ice, idpf):
- extract datapath code common among Intel drivers into a library
- use fewer resources in switchdev by sharing queues with the PF
- add PFCP filter support
- add Ethernet filter support
- use a spinlock instead of HW lock in PTP clock ops
- support 5 layer Tx scheduler topology
- nVidia/Mellanox:
- 800G link modes and 100G SerDes speeds
- per-queue IRQ coalescing configuration
- Marvell Octeon:
- support offloading TC packet mark action
- Ethernet NICs consumer, embedded and virtual:
- stop lying about skb->truesize in USB Ethernet drivers, it
messes up TCP memory calculations
- Google cloud vNIC:
- support changing ring size via ethtool
- support ring reset using the queue control API
- VirtIO net:
- expose flow hash from RSS to XDP
- per-queue statistics
- add selftests
- Synopsys (stmmac):
- support controllers which require an RX clock signal from the
MII bus to perform their hardware initialization
- TI:
- icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
- icssg_prueth: add SW TX / RX Coalescing based on hrtimers
- cpsw: minimal XDP support
- Renesas (ravb):
- support describing the MDIO bus
- Realtek (r8169):
- add support for RTL8168M
- Microchip Sparx5:
- matchall and flower actions mirred and redirect
- Ethernet switches:
- nVidia/Mellanox:
- improve events processing performance
- Marvell:
- add support for MV88E6250 family internal PHYs
- Microchip:
- add DCB and DSCP mapping support for KSZ switches
- vsc73xx: convert to PHYLINK
- Realtek:
- rtl8226b/rtl8221b: add C45 instances and SerDes switching
- Many driver changes related to PHYLIB and PHYLINK deprecated API
cleanup
- Ethernet PHYs:
- Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
- micrel: lan8814: add support for PPS out and external timestamp trigger
- WiFi:
- Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
drivers. Modern devices can only be configured using nl80211.
- mac80211/cfg80211
- handle color change per link for WiFi 7 Multi-Link Operation
- Intel (iwlwifi):
- don't support puncturing in 5 GHz
- support monitor mode on passive channels
- BZ-W device support
- P2P with HE/EHT support
- re-add support for firmware API 90
- provide channel survey information for Automatic Channel Selection
- MediaTek (mt76):
- mt7921 LED control
- mt7925 EHT radiotap support
- mt7920e PCI support
- Qualcomm (ath11k):
- P2P support for QCA6390, WCN6855 and QCA2066
- support hibernation
- ieee80211-freq-limit Device Tree property support
- Qualcomm (ath12k):
- refactoring in preparation of multi-link support
- suspend and hibernation support
- ACPI support
- debugfs support, including dfs_simulate_radar support
- RealTek:
- rtw88: RTL8723CS SDIO device support
- rtw89: RTL8922AE Wi-Fi 7 PCI device support
- rtw89: complete features of new WiFi 7 chip 8922AE including
BT-coexistence and Wake-on-WLAN
- rtw89: use BIOS ACPI settings to set TX power and channels
- rtl8xxxu: enable Management Frame Protection (MFP) support
- Bluetooth:
- support for Intel BlazarI and Filmore Peak2 (BE201)
- support for MediaTek MT7921S SDIO
- initial support for Intel PCIe BT driver
- remove HCI_AMP support"
* tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
selftests: netfilter: fix packetdrill conntrack testcase
net: gro: fix napi_gro_cb zeroed alignment
Bluetooth: btintel_pcie: Refactor and code cleanup
Bluetooth: btintel_pcie: Fix warning reported by sparse
Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
Bluetooth: btintel_pcie: Fix compiler warnings
Bluetooth: btintel_pcie: Add *setup* function to download firmware
Bluetooth: btintel_pcie: Add support for PCIe transport
Bluetooth: btintel: Export few static functions
Bluetooth: HCI: Remove HCI_AMP support
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Bluetooth: qca: Fix error code in qca_read_fw_build_info()
Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
Bluetooth: btintel: Add support for BlazarI
LE Create Connection command timeout increased to 20 secs
dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
Bluetooth: compute LE flow credits based on recvbuf space
Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
...
Diffstat (limited to 'drivers/net/virtio_net.c')
-rw-r--r-- | drivers/net/virtio_net.c | 1454 |
1 files changed, 1227 insertions, 227 deletions
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 115c3c5414f2..19a9b50646c7 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -24,6 +24,7 @@ #include <net/xdp.h> #include <net/net_failover.h> #include <net/netdev_rx_queue.h> +#include <net/netdev_queues.h> static int napi_weight = NAPI_POLL_WEIGHT; module_param(napi_weight, int, 0444); @@ -78,6 +79,7 @@ static const unsigned long guest_offloads[] = { struct virtnet_stat_desc { char desc[ETH_GSTRING_LEN]; size_t offset; + size_t qstat_offset; }; struct virtnet_sq_free_stats { @@ -93,6 +95,8 @@ struct virtnet_sq_stats { u64_stats_t xdp_tx_drops; u64_stats_t kicks; u64_stats_t tx_timeouts; + u64_stats_t stop; + u64_stats_t wake; }; struct virtnet_rq_stats { @@ -107,31 +111,159 @@ struct virtnet_rq_stats { u64_stats_t kicks; }; -#define VIRTNET_SQ_STAT(m) offsetof(struct virtnet_sq_stats, m) -#define VIRTNET_RQ_STAT(m) offsetof(struct virtnet_rq_stats, m) +#define VIRTNET_SQ_STAT(name, m) {name, offsetof(struct virtnet_sq_stats, m), -1} +#define VIRTNET_RQ_STAT(name, m) {name, offsetof(struct virtnet_rq_stats, m), -1} + +#define VIRTNET_SQ_STAT_QSTAT(name, m) \ + { \ + name, \ + offsetof(struct virtnet_sq_stats, m), \ + offsetof(struct netdev_queue_stats_tx, m), \ + } + +#define VIRTNET_RQ_STAT_QSTAT(name, m) \ + { \ + name, \ + offsetof(struct virtnet_rq_stats, m), \ + offsetof(struct netdev_queue_stats_rx, m), \ + } static const struct virtnet_stat_desc virtnet_sq_stats_desc[] = { - { "packets", VIRTNET_SQ_STAT(packets) }, - { "bytes", VIRTNET_SQ_STAT(bytes) }, - { "xdp_tx", VIRTNET_SQ_STAT(xdp_tx) }, - { "xdp_tx_drops", VIRTNET_SQ_STAT(xdp_tx_drops) }, - { "kicks", VIRTNET_SQ_STAT(kicks) }, - { "tx_timeouts", VIRTNET_SQ_STAT(tx_timeouts) }, + VIRTNET_SQ_STAT("xdp_tx", xdp_tx), + VIRTNET_SQ_STAT("xdp_tx_drops", xdp_tx_drops), + VIRTNET_SQ_STAT("kicks", kicks), + VIRTNET_SQ_STAT("tx_timeouts", tx_timeouts), }; static const struct virtnet_stat_desc virtnet_rq_stats_desc[] = { - { "packets", VIRTNET_RQ_STAT(packets) }, - { "bytes", VIRTNET_RQ_STAT(bytes) }, - { "drops", VIRTNET_RQ_STAT(drops) }, - { "xdp_packets", VIRTNET_RQ_STAT(xdp_packets) }, - { "xdp_tx", VIRTNET_RQ_STAT(xdp_tx) }, - { "xdp_redirects", VIRTNET_RQ_STAT(xdp_redirects) }, - { "xdp_drops", VIRTNET_RQ_STAT(xdp_drops) }, - { "kicks", VIRTNET_RQ_STAT(kicks) }, + VIRTNET_RQ_STAT("drops", drops), + VIRTNET_RQ_STAT("xdp_packets", xdp_packets), + VIRTNET_RQ_STAT("xdp_tx", xdp_tx), + VIRTNET_RQ_STAT("xdp_redirects", xdp_redirects), + VIRTNET_RQ_STAT("xdp_drops", xdp_drops), + VIRTNET_RQ_STAT("kicks", kicks), +}; + +static const struct virtnet_stat_desc virtnet_sq_stats_desc_qstat[] = { + VIRTNET_SQ_STAT_QSTAT("packets", packets), + VIRTNET_SQ_STAT_QSTAT("bytes", bytes), + VIRTNET_SQ_STAT_QSTAT("stop", stop), + VIRTNET_SQ_STAT_QSTAT("wake", wake), +}; + +static const struct virtnet_stat_desc virtnet_rq_stats_desc_qstat[] = { + VIRTNET_RQ_STAT_QSTAT("packets", packets), + VIRTNET_RQ_STAT_QSTAT("bytes", bytes), +}; + +#define VIRTNET_STATS_DESC_CQ(name) \ + {#name, offsetof(struct virtio_net_stats_cvq, name), -1} + +#define VIRTNET_STATS_DESC_RX(class, name) \ + {#name, offsetof(struct virtio_net_stats_rx_ ## class, rx_ ## name), -1} + +#define VIRTNET_STATS_DESC_TX(class, name) \ + {#name, offsetof(struct virtio_net_stats_tx_ ## class, tx_ ## name), -1} + + +static const struct virtnet_stat_desc virtnet_stats_cvq_desc[] = { + VIRTNET_STATS_DESC_CQ(command_num), + VIRTNET_STATS_DESC_CQ(ok_num), +}; + +static const struct virtnet_stat_desc virtnet_stats_rx_basic_desc[] = { + VIRTNET_STATS_DESC_RX(basic, packets), + VIRTNET_STATS_DESC_RX(basic, bytes), + + VIRTNET_STATS_DESC_RX(basic, notifications), + VIRTNET_STATS_DESC_RX(basic, interrupts), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_basic_desc[] = { + VIRTNET_STATS_DESC_TX(basic, packets), + VIRTNET_STATS_DESC_TX(basic, bytes), + + VIRTNET_STATS_DESC_TX(basic, notifications), + VIRTNET_STATS_DESC_TX(basic, interrupts), +}; + +static const struct virtnet_stat_desc virtnet_stats_rx_csum_desc[] = { + VIRTNET_STATS_DESC_RX(csum, needs_csum), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_gso_desc[] = { + VIRTNET_STATS_DESC_TX(gso, gso_packets_noseg), + VIRTNET_STATS_DESC_TX(gso, gso_bytes_noseg), +}; + +static const struct virtnet_stat_desc virtnet_stats_rx_speed_desc[] = { + VIRTNET_STATS_DESC_RX(speed, ratelimit_bytes), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc[] = { + VIRTNET_STATS_DESC_TX(speed, ratelimit_bytes), +}; + +#define VIRTNET_STATS_DESC_RX_QSTAT(class, name, qstat_field) \ + { \ + #name, \ + offsetof(struct virtio_net_stats_rx_ ## class, rx_ ## name), \ + offsetof(struct netdev_queue_stats_rx, qstat_field), \ + } + +#define VIRTNET_STATS_DESC_TX_QSTAT(class, name, qstat_field) \ + { \ + #name, \ + offsetof(struct virtio_net_stats_tx_ ## class, tx_ ## name), \ + offsetof(struct netdev_queue_stats_tx, qstat_field), \ + } + +static const struct virtnet_stat_desc virtnet_stats_rx_basic_desc_qstat[] = { + VIRTNET_STATS_DESC_RX_QSTAT(basic, drops, hw_drops), + VIRTNET_STATS_DESC_RX_QSTAT(basic, drop_overruns, hw_drop_overruns), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_basic_desc_qstat[] = { + VIRTNET_STATS_DESC_TX_QSTAT(basic, drops, hw_drops), + VIRTNET_STATS_DESC_TX_QSTAT(basic, drop_malformed, hw_drop_errors), }; -#define VIRTNET_SQ_STATS_LEN ARRAY_SIZE(virtnet_sq_stats_desc) -#define VIRTNET_RQ_STATS_LEN ARRAY_SIZE(virtnet_rq_stats_desc) +static const struct virtnet_stat_desc virtnet_stats_rx_csum_desc_qstat[] = { + VIRTNET_STATS_DESC_RX_QSTAT(csum, csum_valid, csum_unnecessary), + VIRTNET_STATS_DESC_RX_QSTAT(csum, csum_none, csum_none), + VIRTNET_STATS_DESC_RX_QSTAT(csum, csum_bad, csum_bad), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_csum_desc_qstat[] = { + VIRTNET_STATS_DESC_TX_QSTAT(csum, csum_none, csum_none), + VIRTNET_STATS_DESC_TX_QSTAT(csum, needs_csum, needs_csum), +}; + +static const struct virtnet_stat_desc virtnet_stats_rx_gso_desc_qstat[] = { + VIRTNET_STATS_DESC_RX_QSTAT(gso, gso_packets, hw_gro_packets), + VIRTNET_STATS_DESC_RX_QSTAT(gso, gso_bytes, hw_gro_bytes), + VIRTNET_STATS_DESC_RX_QSTAT(gso, gso_packets_coalesced, hw_gro_wire_packets), + VIRTNET_STATS_DESC_RX_QSTAT(gso, gso_bytes_coalesced, hw_gro_wire_bytes), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_gso_desc_qstat[] = { + VIRTNET_STATS_DESC_TX_QSTAT(gso, gso_packets, hw_gso_packets), + VIRTNET_STATS_DESC_TX_QSTAT(gso, gso_bytes, hw_gso_bytes), + VIRTNET_STATS_DESC_TX_QSTAT(gso, gso_segments, hw_gso_wire_packets), + VIRTNET_STATS_DESC_TX_QSTAT(gso, gso_segments_bytes, hw_gso_wire_bytes), +}; + +static const struct virtnet_stat_desc virtnet_stats_rx_speed_desc_qstat[] = { + VIRTNET_STATS_DESC_RX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits), +}; + +static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = { + VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits), +}; + +#define VIRTNET_Q_TYPE_RX 0 +#define VIRTNET_Q_TYPE_TX 1 +#define VIRTNET_Q_TYPE_CQ 2 struct virtnet_interrupt_coalesce { u32 max_packets; @@ -184,6 +316,9 @@ struct receive_queue { /* Is dynamic interrupt moderation enabled? */ bool dim_enabled; + /* Used to protect dim_enabled and inter_coal */ + struct mutex dim_lock; + /* Dynamic Interrupt Moderation */ struct dim dim; @@ -213,9 +348,6 @@ struct receive_queue { /* Record the last dma info to free after new pages is allocated. */ struct virtnet_rq_dma *last_dma; - - /* Do dma by self */ - bool do_dma; }; /* This structure can contain rss message with maximum settings for indirection table and keysize @@ -240,15 +372,6 @@ struct virtio_net_ctrl_rss { struct control_buf { struct virtio_net_ctrl_hdr hdr; virtio_net_ctrl_ack status; - struct virtio_net_ctrl_mq mq; - u8 promisc; - u8 allmulti; - __virtio16 vid; - __virtio64 offloads; - struct virtio_net_ctrl_rss rss; - struct virtio_net_ctrl_coal_tx coal_tx; - struct virtio_net_ctrl_coal_rx coal_rx; - struct virtio_net_ctrl_coal_vq coal_vq; }; struct virtnet_info { @@ -287,10 +410,14 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; + struct virtio_net_ctrl_rss rss; /* Has control virtqueue */ bool has_cvq; + /* Lock to protect the control VQ */ + struct mutex cvq_lock; + /* Host can handle any s/g split between our header and packet data */ bool any_header_sg; @@ -340,6 +467,8 @@ struct virtnet_info { /* failover when STANDBY feature enabled */ struct failover *failover; + + u64 device_stats_cap; }; struct padded_vnet_hdr { @@ -425,6 +554,17 @@ static int rxq2vq(int rxq) return rxq * 2; } +static int vq_type(struct virtnet_info *vi, int qid) +{ + if (qid == vi->max_queue_pairs * 2) + return VIRTNET_Q_TYPE_CQ; + + if (qid % 2) + return VIRTNET_Q_TYPE_TX; + + return VIRTNET_Q_TYPE_RX; +} + static inline struct virtio_net_common_hdr * skb_vnet_common_hdr(struct sk_buff *skb) { @@ -603,7 +743,6 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, shinfo_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - /* copy small packet so we can reuse these pages */ if (!NET_IP_ALIGN && len > GOOD_COPY_LEN && tailroom >= shinfo_size) { skb = virtnet_build_skb(buf, truesize, p - buf, len); if (unlikely(!skb)) @@ -707,7 +846,7 @@ static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx) void *buf; buf = virtqueue_get_buf_ctx(rq->vq, len, ctx); - if (buf && rq->do_dma) + if (buf) virtnet_rq_unmap(rq, buf, *len); return buf; @@ -720,11 +859,6 @@ static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len) u32 offset; void *head; - if (!rq->do_dma) { - sg_init_one(rq->sg, buf, len); - return; - } - head = page_address(rq->alloc_frag.page); offset = buf - head; @@ -750,44 +884,42 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp) head = page_address(alloc_frag->page); - if (rq->do_dma) { - dma = head; - - /* new pages */ - if (!alloc_frag->offset) { - if (rq->last_dma) { - /* Now, the new page is allocated, the last dma - * will not be used. So the dma can be unmapped - * if the ref is 0. - */ - virtnet_rq_unmap(rq, rq->last_dma, 0); - rq->last_dma = NULL; - } + dma = head; - dma->len = alloc_frag->size - sizeof(*dma); + /* new pages */ + if (!alloc_frag->offset) { + if (rq->last_dma) { + /* Now, the new page is allocated, the last dma + * will not be used. So the dma can be unmapped + * if the ref is 0. + */ + virtnet_rq_unmap(rq, rq->last_dma, 0); + rq->last_dma = NULL; + } - addr = virtqueue_dma_map_single_attrs(rq->vq, dma + 1, - dma->len, DMA_FROM_DEVICE, 0); - if (virtqueue_dma_mapping_error(rq->vq, addr)) - return NULL; + dma->len = alloc_frag->size - sizeof(*dma); - dma->addr = addr; - dma->need_sync = virtqueue_dma_need_sync(rq->vq, addr); + addr = virtqueue_dma_map_single_attrs(rq->vq, dma + 1, + dma->len, DMA_FROM_DEVICE, 0); + if (virtqueue_dma_mapping_error(rq->vq, addr)) + return NULL; - /* Add a reference to dma to prevent the entire dma from - * being released during error handling. This reference - * will be freed after the pages are no longer used. - */ - get_page(alloc_frag->page); - dma->ref = 1; - alloc_frag->offset = sizeof(*dma); + dma->addr = addr; + dma->need_sync = virtqueue_dma_need_sync(rq->vq, addr); - rq->last_dma = dma; - } + /* Add a reference to dma to prevent the entire dma from + * being released during error handling. This reference + * will be freed after the pages are no longer used. + */ + get_page(alloc_frag->page); + dma->ref = 1; + alloc_frag->offset = sizeof(*dma); - ++dma->ref; + rq->last_dma = dma; } + ++dma->ref; + buf = head + alloc_frag->offset; get_page(alloc_frag->page); @@ -804,12 +936,9 @@ static void virtnet_rq_set_premapped(struct virtnet_info *vi) if (!vi->mergeable_rx_bufs && vi->big_packets) return; - for (i = 0; i < vi->max_queue_pairs; i++) { - if (virtqueue_set_dma_premapped(vi->rq[i].vq)) - continue; - - vi->rq[i].do_dma = true; - } + for (i = 0; i < vi->max_queue_pairs; i++) + /* error should never happen */ + BUG_ON(virtqueue_set_dma_premapped(vi->rq[i].vq)); } static void virtnet_rq_unmap_free_buf(struct virtqueue *vq, void *buf) @@ -820,7 +949,7 @@ static void virtnet_rq_unmap_free_buf(struct virtqueue *vq, void *buf) rq = &vi->rq[i]; - if (rq->do_dma) + if (!vi->big_packets || vi->mergeable_rx_bufs) virtnet_rq_unmap(rq, buf, 0); virtnet_rq_free_buf(vi, rq, buf); @@ -875,6 +1004,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, */ if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { netif_stop_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.stop); + u64_stats_update_end(&sq->stats.syncp); if (use_napi) { if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) virtqueue_napi_schedule(&sq->napi, sq->vq); @@ -883,6 +1015,9 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, free_old_xmit(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); virtqueue_disable_cb(sq->vq); } } @@ -1881,8 +2016,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq, err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp); if (err < 0) { - if (rq->do_dma) - virtnet_rq_unmap(rq, buf, 0); + virtnet_rq_unmap(rq, buf, 0); put_page(virt_to_head_page(buf)); } @@ -1996,8 +2130,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi, ctx = mergeable_len_to_ctx(len + room, headroom); err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp); if (err < 0) { - if (rq->do_dma) - virtnet_rq_unmap(rq, buf, 0); + virtnet_rq_unmap(rq, buf, 0); put_page(virt_to_head_page(buf)); } @@ -2128,7 +2261,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget, } } else { while (packets < budget && - (buf = virtnet_rq_get_buf(rq, &len, NULL)) != NULL) { + (buf = virtqueue_get_buf(rq->vq, &len)) != NULL) { receive_buf(vi, rq, buf, len, NULL, xdp_xmit, &stats); packets++; } @@ -2145,7 +2278,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget, u64_stats_set(&stats.packets, packets); u64_stats_update_begin(&rq->stats.syncp); - for (i = 0; i < VIRTNET_RQ_STATS_LEN; i++) { + for (i = 0; i < ARRAY_SIZE(virtnet_rq_stats_desc); i++) { size_t offset = virtnet_rq_stats_desc[i].offset; u64_stats_t *item, *src; @@ -2153,6 +2286,10 @@ static int virtnet_receive(struct receive_queue *rq, int budget, src = (u64_stats_t *)((u8 *)&stats + offset); u64_stats_add(item, u64_stats_read(src)); } + + u64_stats_add(&rq->stats.packets, u64_stats_read(&stats.packets)); + u64_stats_add(&rq->stats.bytes, u64_stats_read(&stats.bytes)); + u64_stats_update_end(&rq->stats.syncp); return packets; @@ -2179,8 +2316,14 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) free_old_xmit(sq, true); } while (unlikely(!virtqueue_enable_cb_delayed(sq->vq))); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } __netif_tx_unlock(txq); } @@ -2225,6 +2368,10 @@ static int virtnet_poll(struct napi_struct *napi, int budget) /* Out of packets? */ if (received < budget) { napi_complete = virtqueue_napi_complete(napi, rq->vq, received); + /* Intentionally not taking dim_lock here. This may result in a + * spurious net_dim call. But if that happens virtnet_rx_dim_work + * will not act on the scheduled work. + */ if (napi_complete && rq->dim_enabled) virtnet_rx_dim_update(vi, rq); } @@ -2326,8 +2473,14 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) virtqueue_disable_cb(sq->vq); free_old_xmit(sq, true); - if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) + if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) { + if (netif_tx_queue_stopped(txq)) { + u64_stats_update_begin(&sq->stats.syncp); + u64_stats_inc(&sq->stats.wake); + u64_stats_update_end(&sq->stats.syncp); + } netif_tx_wake_queue(txq); + } opaque = virtqueue_enable_cb_prepare(sq->vq); @@ -2527,16 +2680,18 @@ static int virtnet_tx_resize(struct virtnet_info *vi, * supported by the hypervisor, as indicated by feature bits, should * never fail unless improperly formatted. */ -static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, - struct scatterlist *out) +static bool virtnet_send_command_reply(struct virtnet_info *vi, u8 class, u8 cmd, + struct scatterlist *out, + struct scatterlist *in) { - struct scatterlist *sgs[4], hdr, stat; - unsigned out_num = 0, tmp; + struct scatterlist *sgs[5], hdr, stat; + u32 out_num = 0, tmp, in_num = 0; int ret; /* Caller should know better */ BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)); + mutex_lock(&vi->cvq_lock); vi->ctrl->status = ~0; vi->ctrl->hdr.class = class; vi->ctrl->hdr.cmd = cmd; @@ -2549,18 +2704,22 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, /* Add return status. */ sg_init_one(&stat, &vi->ctrl->status, sizeof(vi->ctrl->status)); - sgs[out_num] = &stat; + sgs[out_num + in_num++] = &stat; + + if (in) + sgs[out_num + in_num++] = in; - BUG_ON(out_num + 1 > ARRAY_SIZE(sgs)); - ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC); + BUG_ON(out_num + in_num > ARRAY_SIZE(sgs)); + ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, in_num, vi, GFP_ATOMIC); if (ret < 0) { dev_warn(&vi->vdev->dev, "Failed to add sgs for command vq: %d\n.", ret); + mutex_unlock(&vi->cvq_lock); return false; } if (unlikely(!virtqueue_kick(vi->cvq))) - return vi->ctrl->status == VIRTIO_NET_OK; + goto unlock; /* Spin for a response, the kick causes an ioport write, trapping * into the hypervisor, so the request should be handled immediately. @@ -2571,9 +2730,17 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, cpu_relax(); } +unlock: + mutex_unlock(&vi->cvq_lock); return vi->ctrl->status == VIRTIO_NET_OK; } +static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, + struct scatterlist *out) +{ + return virtnet_send_command_reply(vi, class, cmd, out, NULL); +} + static int virtnet_set_mac_address(struct net_device *dev, void *p) { struct virtnet_info *vi = netdev_priv(dev); @@ -2663,23 +2830,26 @@ static void virtnet_stats(struct net_device *dev, static void virtnet_ack_link_announce(struct virtnet_info *vi) { - rtnl_lock(); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_ANNOUNCE, VIRTIO_NET_CTRL_ANNOUNCE_ACK, NULL)) dev_warn(&vi->dev->dev, "Failed to ack link announce.\n"); - rtnl_unlock(); } -static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) +static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { + struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; struct scatterlist sg; struct net_device *dev = vi->dev; if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ)) return 0; - vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); - sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq)); + mq = kzalloc(sizeof(*mq), GFP_KERNEL); + if (!mq) + return -ENOMEM; + + mq->virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs); + sg_init_one(&sg, mq, sizeof(*mq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) { @@ -2696,16 +2866,6 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) return 0; } -static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) -{ - int err; - - rtnl_lock(); - err = _virtnet_set_queues(vi, queue_pairs); - rtnl_unlock(); - return err; -} - static int virtnet_close(struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -2728,6 +2888,7 @@ static void virtnet_rx_mode_work(struct work_struct *work) { struct virtnet_info *vi = container_of(work, struct virtnet_info, rx_mode_work); + u8 *promisc_allmulti __free(kfree) = NULL; struct net_device *dev = vi->dev; struct scatterlist sg[2]; struct virtio_net_ctrl_mac *mac_data; @@ -2743,22 +2904,27 @@ static void virtnet_rx_mode_work(struct work_struct *work) rtnl_lock(); - vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0); - vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0); + promisc_allmulti = kzalloc(sizeof(*promisc_allmulti), GFP_ATOMIC); + if (!promisc_allmulti) { + dev_warn(&dev->dev, "Failed to set RX mode, no memory.\n"); + return; + } - sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc)); + *promisc_allmulti = !!(dev->flags & IFF_PROMISC); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_PROMISC, sg)) dev_warn(&dev->dev, "Failed to %sable promisc mode.\n", - vi->ctrl->promisc ? "en" : "dis"); + *promisc_allmulti ? "en" : "dis"); - sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti)); + *promisc_allmulti = !!(dev->flags & IFF_ALLMULTI); + sg_init_one(sg, promisc_allmulti, sizeof(*promisc_allmulti)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX, VIRTIO_NET_CTRL_RX_ALLMULTI, sg)) dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n", - vi->ctrl->allmulti ? "en" : "dis"); + *promisc_allmulti ? "en" : "dis"); netif_addr_lock_bh(dev); @@ -2819,10 +2985,15 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN, VIRTIO_NET_CTRL_VLAN_ADD, &sg)) @@ -2834,10 +3005,15 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid) { struct virtnet_info *vi = netdev_priv(dev); + __virtio16 *_vid __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid); - sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid)); + _vid = kzalloc(sizeof(*_vid), GFP_KERNEL); + if (!_vid) + return -ENOMEM; + + *_vid = cpu_to_virtio16(vi->vdev, vid); + sg_init_one(&sg, _vid, sizeof(*_vid)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN, VIRTIO_NET_CTRL_VLAN_DEL, &sg)) @@ -2950,12 +3126,17 @@ static void virtnet_cpu_notif_remove(struct virtnet_info *vi) static int virtnet_send_ctrl_coal_vq_cmd(struct virtnet_info *vi, u16 vqn, u32 max_usecs, u32 max_packets) { + struct virtio_net_ctrl_coal_vq *coal_vq __free(kfree) = NULL; struct scatterlist sgs; - vi->ctrl->coal_vq.vqn = cpu_to_le16(vqn); - vi->ctrl->coal_vq.coal.max_usecs = cpu_to_le32(max_usecs); - vi->ctrl->coal_vq.coal.max_packets = cpu_to_le32(max_packets); - sg_init_one(&sgs, &vi->ctrl->coal_vq, sizeof(vi->ctrl->coal_vq)); + coal_vq = kzalloc(sizeof(*coal_vq), GFP_KERNEL); + if (!coal_vq) + return -ENOMEM; + + coal_vq->vqn = cpu_to_le16(vqn); + coal_vq->coal.max_usecs = cpu_to_le32(max_usecs); + coal_vq->coal.max_packets = cpu_to_le32(max_packets); + sg_init_one(&sgs, coal_vq, sizeof(*coal_vq)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET, @@ -3066,9 +3247,11 @@ static int virtnet_set_ringparam(struct net_device *dev, return err; /* The reason is same as the transmit virtqueue reset */ + mutex_lock(&vi->rq[i].dim_lock); err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i, vi->intr_coal_rx.max_usecs, vi->intr_coal_rx.max_packets); + mutex_unlock(&vi->rq[i].dim_lock); if (err) return err; } @@ -3087,25 +3270,29 @@ static bool virtnet_commit_rss_command(struct virtnet_info *vi) sg_init_table(sgs, 4); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, indirection_table); - sg_set_buf(&sgs[0], &vi->ctrl->rss, sg_buf_size); + sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - sg_buf_size = sizeof(uint16_t) * (vi->ctrl->rss.indirection_table_mask + 1); - sg_set_buf(&sgs[1], vi->ctrl->rss.indirection_table, sg_buf_size); + sg_buf_size = sizeof(uint16_t) * (vi->rss.indirection_table_mask + 1); + sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->ctrl->rss.max_tx_vq, sg_buf_size); + sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->ctrl->rss.key, sg_buf_size); + sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG - : VIRTIO_NET_CTRL_MQ_HASH_CONFIG, sgs)) { - dev_warn(&dev->dev, "VIRTIONET issue with committing RSS sgs\n"); - return false; - } + : VIRTIO_NET_CTRL_MQ_HASH_CONFIG, sgs)) + goto err; + return true; + +err: + dev_warn(&dev->dev, "VIRTIONET issue with committing RSS sgs\n"); + return false; + } static void virtnet_init_default_rss(struct virtnet_info *vi) @@ -3113,21 +3300,21 @@ static void virtnet_init_default_rss(struct virtnet_info *vi) u32 indir_val = 0; int i = 0; - vi->ctrl->rss.hash_types = vi->rss_hash_types_supported; + vi->rss.hash_types = vi->rss_hash_types_supported; vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->ctrl->rss.indirection_table_mask = vi->rss_indir_table_size + vi->rss.indirection_table_mask = vi->rss_indir_table_size ? vi->rss_indir_table_size - 1 : 0; - vi->ctrl->rss.unclassified_queue = 0; + vi->rss.unclassified_queue = 0; for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, vi->curr_queue_pairs); - vi->ctrl->rss.indirection_table[i] = indir_val; + vi->rss.indirection_table[i] = indir_val; } - vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; - vi->ctrl->rss.hash_key_length = vi->rss_key_size; + vi->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0; + vi->rss.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -3238,7 +3425,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -3283,7 +3470,7 @@ static int virtnet_set_channels(struct net_device *dev, return -EINVAL; cpus_read_lock(); - err = _virtnet_set_queues(vi, queue_pairs); + err = virtnet_set_queues(vi, queue_pairs); if (err) { cpus_read_unlock(); goto err; @@ -3297,25 +3484,654 @@ static int virtnet_set_channels(struct net_device *dev, return err; } +static void virtnet_stats_sprintf(u8 **p, const char *fmt, const char *noq_fmt, + int num, int qid, const struct virtnet_stat_desc *desc) +{ + int i; + + if (qid < 0) { + for (i = 0; i < num; ++i) + ethtool_sprintf(p, noq_fmt, desc[i].desc); + } else { + for (i = 0; i < num; ++i) + ethtool_sprintf(p, fmt, qid, desc[i].desc); + } +} + +/* qid == -1: for rx/tx queue total field */ +static void virtnet_get_stats_string(struct virtnet_info *vi, int type, int qid, u8 **data) +{ + const struct virtnet_stat_desc *desc; + const char *fmt, *noq_fmt; + u8 *p = *data; + u32 num; + + if (type == VIRTNET_Q_TYPE_CQ && qid >= 0) { + noq_fmt = "cq_hw_%s"; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_CVQ) { + desc = &virtnet_stats_cvq_desc[0]; + num = ARRAY_SIZE(virtnet_stats_cvq_desc); + + virtnet_stats_sprintf(&p, NULL, noq_fmt, num, -1, desc); + } + } + + if (type == VIRTNET_Q_TYPE_RX) { + fmt = "rx%u_%s"; + noq_fmt = "rx_%s"; + + desc = &virtnet_rq_stats_desc[0]; + num = ARRAY_SIZE(virtnet_rq_stats_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + + fmt = "rx%u_hw_%s"; + noq_fmt = "rx_hw_%s"; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + desc = &virtnet_stats_rx_basic_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_basic_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + desc = &virtnet_stats_rx_csum_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_csum_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_SPEED) { + desc = &virtnet_stats_rx_speed_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_speed_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + } + + if (type == VIRTNET_Q_TYPE_TX) { + fmt = "tx%u_%s"; + noq_fmt = "tx_%s"; + + desc = &virtnet_sq_stats_desc[0]; + num = ARRAY_SIZE(virtnet_sq_stats_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + + fmt = "tx%u_hw_%s"; + noq_fmt = "tx_hw_%s"; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + desc = &virtnet_stats_tx_basic_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_basic_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + desc = &virtnet_stats_tx_gso_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_gso_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_SPEED) { + desc = &virtnet_stats_tx_speed_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_speed_desc); + + virtnet_stats_sprintf(&p, fmt, noq_fmt, num, qid, desc); + } + } + + *data = p; +} + +struct virtnet_stats_ctx { + /* The stats are write to qstats or ethtool -S */ + bool to_qstat; + + /* Used to calculate the offset inside the output buffer. */ + u32 desc_num[3]; + + /* The actual supported stat types. */ + u32 bitmap[3]; + + /* Used to calculate the reply buffer size. */ + u32 size[3]; + + /* Record the output buffer. */ + u64 *data; +}; + +static void virtnet_stats_ctx_init(struct virtnet_info *vi, + struct virtnet_stats_ctx *ctx, + u64 *data, bool to_qstat) +{ + u32 queue_type; + + ctx->data = data; + ctx->to_qstat = to_qstat; + + if (to_qstat) { + ctx->desc_num[VIRTNET_Q_TYPE_RX] = ARRAY_SIZE(virtnet_rq_stats_desc_qstat); + ctx->desc_num[VIRTNET_Q_TYPE_TX] = ARRAY_SIZE(virtnet_sq_stats_desc_qstat); + + queue_type = VIRTNET_Q_TYPE_RX; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_BASIC; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_basic_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_basic); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_CSUM; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_csum_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_csum); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_GSO) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_GSO; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_gso_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_gso); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_SPEED) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_SPEED; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_speed_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_speed); + } + + queue_type = VIRTNET_Q_TYPE_TX; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_BASIC; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_basic_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_basic); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_CSUM) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_CSUM; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_csum_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_csum); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_GSO; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_gso_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_gso); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_SPEED) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_SPEED; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_speed_desc_qstat); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_speed); + } + + return; + } + + ctx->desc_num[VIRTNET_Q_TYPE_RX] = ARRAY_SIZE(virtnet_rq_stats_desc); + ctx->desc_num[VIRTNET_Q_TYPE_TX] = ARRAY_SIZE(virtnet_sq_stats_desc); + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_CVQ) { + queue_type = VIRTNET_Q_TYPE_CQ; + + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_CVQ; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_cvq_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_cvq); + } + + queue_type = VIRTNET_Q_TYPE_RX; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_BASIC; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_basic_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_basic); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_CSUM; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_csum_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_csum); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_SPEED) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_RX_SPEED; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_rx_speed_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_rx_speed); + } + + queue_type = VIRTNET_Q_TYPE_TX; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_BASIC; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_basic_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_basic); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_GSO; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_gso_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_gso); + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_SPEED) { + ctx->bitmap[queue_type] |= VIRTIO_NET_STATS_TYPE_TX_SPEED; + ctx->desc_num[queue_type] += ARRAY_SIZE(virtnet_stats_tx_speed_desc); + ctx->size[queue_type] += sizeof(struct virtio_net_stats_tx_speed); + } +} + +/* stats_sum_queue - Calculate the sum of the same fields in sq or rq. + * @sum: the position to store the sum values + * @num: field num + * @q_value: the first queue fields + * @q_num: number of the queues + */ +static void stats_sum_queue(u64 *sum, u32 num, u64 *q_value, u32 q_num) +{ + u32 step = num; + int i, j; + u64 *p; + + for (i = 0; i < num; ++i) { + p = sum + i; + *p = 0; + + for (j = 0; j < q_num; ++j) + *p += *(q_value + i + j * step); + } +} + +static void virtnet_fill_total_fields(struct virtnet_info *vi, + struct virtnet_stats_ctx *ctx) +{ + u64 *data, *first_rx_q, *first_tx_q; + u32 num_cq, num_rx, num_tx; + + num_cq = ctx->desc_num[VIRTNET_Q_TYPE_CQ]; + num_rx = ctx->desc_num[VIRTNET_Q_TYPE_RX]; + num_tx = ctx->desc_num[VIRTNET_Q_TYPE_TX]; + + first_rx_q = ctx->data + num_rx + num_tx + num_cq; + first_tx_q = first_rx_q + vi->curr_queue_pairs * num_rx; + + data = ctx->data; + + stats_sum_queue(data, num_rx, first_rx_q, vi->curr_queue_pairs); + + data = ctx->data + num_rx; + + stats_sum_queue(data, num_tx, first_tx_q, vi->curr_queue_pairs); +} + +static void virtnet_fill_stats_qstat(struct virtnet_info *vi, u32 qid, + struct virtnet_stats_ctx *ctx, + const u8 *base, bool drv_stats, u8 reply_type) +{ + const struct virtnet_stat_desc *desc; + const u64_stats_t *v_stat; + u64 offset, bitmap; + const __le64 *v; + u32 queue_type; + int i, num; + + queue_type = vq_type(vi, qid); + bitmap = ctx->bitmap[queue_type]; + + if (drv_stats) { + if (queue_type == VIRTNET_Q_TYPE_RX) { + desc = &virtnet_rq_stats_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_rq_stats_desc_qstat); + } else { + desc = &virtnet_sq_stats_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_sq_stats_desc_qstat); + } + + for (i = 0; i < num; ++i) { + offset = desc[i].qstat_offset / sizeof(*ctx->data); + v_stat = (const u64_stats_t *)(base + desc[i].offset); + ctx->data[offset] = u64_stats_read(v_stat); + } + return; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + desc = &virtnet_stats_rx_basic_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_rx_basic_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_BASIC) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + desc = &virtnet_stats_rx_csum_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_rx_csum_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_CSUM) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_GSO) { + desc = &virtnet_stats_rx_gso_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_rx_gso_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_GSO) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_SPEED) { + desc = &virtnet_stats_rx_speed_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_rx_speed_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_SPEED) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + desc = &virtnet_stats_tx_basic_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_tx_basic_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_BASIC) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_CSUM) { + desc = &virtnet_stats_tx_csum_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_tx_csum_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_CSUM) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + desc = &virtnet_stats_tx_gso_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_tx_gso_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_GSO) + goto found; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_SPEED) { + desc = &virtnet_stats_tx_speed_desc_qstat[0]; + num = ARRAY_SIZE(virtnet_stats_tx_speed_desc_qstat); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_SPEED) + goto found; + } + + return; + +found: + for (i = 0; i < num; ++i) { + offset = desc[i].qstat_offset / sizeof(*ctx->data); + v = (const __le64 *)(base + desc[i].offset); + ctx->data[offset] = le64_to_cpu(*v); + } +} + +/* virtnet_fill_stats - copy the stats to qstats or ethtool -S + * The stats source is the device or the driver. + * + * @vi: virtio net info + * @qid: the vq id + * @ctx: stats ctx (initiated by virtnet_stats_ctx_init()) + * @base: pointer to the device reply or the driver stats structure. + * @drv_stats: designate the base type (device reply, driver stats) + * @type: the type of the device reply (if drv_stats is true, this must be zero) + */ +static void virtnet_fill_stats(struct virtnet_info *vi, u32 qid, + struct virtnet_stats_ctx *ctx, + const u8 *base, bool drv_stats, u8 reply_type) +{ + u32 queue_type, num_rx, num_tx, num_cq; + const struct virtnet_stat_desc *desc; + const u64_stats_t *v_stat; + u64 offset, bitmap; + const __le64 *v; + int i, num; + + if (ctx->to_qstat) + return virtnet_fill_stats_qstat(vi, qid, ctx, base, drv_stats, reply_type); + + num_cq = ctx->desc_num[VIRTNET_Q_TYPE_CQ]; + num_rx = ctx->desc_num[VIRTNET_Q_TYPE_RX]; + num_tx = ctx->desc_num[VIRTNET_Q_TYPE_TX]; + + queue_type = vq_type(vi, qid); + bitmap = ctx->bitmap[queue_type]; + + /* skip the total fields of pairs */ + offset = num_rx + num_tx; + + if (queue_type == VIRTNET_Q_TYPE_TX) { + offset += num_cq + num_rx * vi->curr_queue_pairs + num_tx * (qid / 2); + + num = ARRAY_SIZE(virtnet_sq_stats_desc); + if (drv_stats) { + desc = &virtnet_sq_stats_desc[0]; + goto drv_stats; + } + + offset += num; + + } else if (queue_type == VIRTNET_Q_TYPE_RX) { + offset += num_cq + num_rx * (qid / 2); + + num = ARRAY_SIZE(virtnet_rq_stats_desc); + if (drv_stats) { + desc = &virtnet_rq_stats_desc[0]; + goto drv_stats; + } + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_CVQ) { + desc = &virtnet_stats_cvq_desc[0]; + num = ARRAY_SIZE(virtnet_stats_cvq_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_CVQ) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + desc = &virtnet_stats_rx_basic_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_basic_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_BASIC) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + desc = &virtnet_stats_rx_csum_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_csum_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_CSUM) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_RX_SPEED) { + desc = &virtnet_stats_rx_speed_desc[0]; + num = ARRAY_SIZE(virtnet_stats_rx_speed_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_RX_SPEED) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + desc = &virtnet_stats_tx_basic_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_basic_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_BASIC) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + desc = &virtnet_stats_tx_gso_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_gso_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_GSO) + goto found; + + offset += num; + } + + if (bitmap & VIRTIO_NET_STATS_TYPE_TX_SPEED) { + desc = &virtnet_stats_tx_speed_desc[0]; + num = ARRAY_SIZE(virtnet_stats_tx_speed_desc); + if (reply_type == VIRTIO_NET_STATS_TYPE_REPLY_TX_SPEED) + goto found; + + offset += num; + } + + return; + +found: + for (i = 0; i < num; ++i) { + v = (const __le64 *)(base + desc[i].offset); + ctx->data[offset + i] = le64_to_cpu(*v); + } + + return; + +drv_stats: + for (i = 0; i < num; ++i) { + v_stat = (const u64_stats_t *)(base + desc[i].offset); + ctx->data[offset + i] = u64_stats_read(v_stat); + } +} + +static int __virtnet_get_hw_stats(struct virtnet_info *vi, + struct virtnet_stats_ctx *ctx, + struct virtio_net_ctrl_queue_stats *req, + int req_size, void *reply, int res_size) +{ + struct virtio_net_stats_reply_hdr *hdr; + struct scatterlist sgs_in, sgs_out; + void *p; + u32 qid; + int ok; + + sg_init_one(&sgs_out, req, req_size); + sg_init_one(&sgs_in, reply, res_size); + + ok = virtnet_send_command_reply(vi, VIRTIO_NET_CTRL_STATS, + VIRTIO_NET_CTRL_STATS_GET, + &sgs_out, &sgs_in); + + if (!ok) + return ok; + + for (p = reply; p - reply < res_size; p += le16_to_cpu(hdr->size)) { + hdr = p; + qid = le16_to_cpu(hdr->vq_index); + virtnet_fill_stats(vi, qid, ctx, p, false, hdr->type); + } + + return 0; +} + +static void virtnet_make_stat_req(struct virtnet_info *vi, + struct virtnet_stats_ctx *ctx, + struct virtio_net_ctrl_queue_stats *req, + int qid, int *idx) +{ + int qtype = vq_type(vi, qid); + u64 bitmap = ctx->bitmap[qtype]; + + if (!bitmap) + return; + + req->stats[*idx].vq_index = cpu_to_le16(qid); + req->stats[*idx].types_bitmap[0] = cpu_to_le64(bitmap); + *idx += 1; +} + +/* qid: -1: get stats of all vq. + * > 0: get the stats for the special vq. This must not be cvq. + */ +static int virtnet_get_hw_stats(struct virtnet_info *vi, + struct virtnet_stats_ctx *ctx, int qid) +{ + int qnum, i, j, res_size, qtype, last_vq, first_vq; + struct virtio_net_ctrl_queue_stats *req; + bool enable_cvq; + void *reply; + int ok; + + if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_DEVICE_STATS)) + return 0; + + if (qid == -1) { + last_vq = vi->curr_queue_pairs * 2 - 1; + first_vq = 0; + enable_cvq = true; + } else { + last_vq = qid; + first_vq = qid; + enable_cvq = false; + } + + qnum = 0; + res_size = 0; + for (i = first_vq; i <= last_vq ; ++i) { + qtype = vq_type(vi, i); + if (ctx->bitmap[qtype]) { + ++qnum; + res_size += ctx->size[qtype]; + } + } + + if (enable_cvq && ctx->bitmap[VIRTNET_Q_TYPE_CQ]) { + res_size += ctx->size[VIRTNET_Q_TYPE_CQ]; + qnum += 1; + } + + req = kcalloc(qnum, sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + reply = kmalloc(res_size, GFP_KERNEL); + if (!reply) { + kfree(req); + return -ENOMEM; + } + + j = 0; + for (i = first_vq; i <= last_vq ; ++i) + virtnet_make_stat_req(vi, ctx, req, i, &j); + + if (enable_cvq) + virtnet_make_stat_req(vi, ctx, req, vi->max_queue_pairs * 2, &j); + + ok = __virtnet_get_hw_stats(vi, ctx, req, sizeof(*req) * j, reply, res_size); + + kfree(req); + kfree(reply); + + return ok; +} + static void virtnet_get_strings(struct net_device *dev, u32 stringset, u8 *data) { struct virtnet_info *vi = netdev_priv(dev); - unsigned int i, j; + unsigned int i; u8 *p = data; switch (stringset) { case ETH_SS_STATS: - for (i = 0; i < vi->curr_queue_pairs; i++) { - for (j = 0; j < VIRTNET_RQ_STATS_LEN; j++) - ethtool_sprintf(&p, "rx_queue_%u_%s", i, - virtnet_rq_stats_desc[j].desc); - } + /* Generate the total field names. */ + virtnet_get_stats_string(vi, VIRTNET_Q_TYPE_RX, -1, &p); + virtnet_get_stats_string(vi, VIRTNET_Q_TYPE_TX, -1, &p); - for (i = 0; i < vi->curr_queue_pairs; i++) { - for (j = 0; j < VIRTNET_SQ_STATS_LEN; j++) - ethtool_sprintf(&p, "tx_queue_%u_%s", i, - virtnet_sq_stats_desc[j].desc); - } + virtnet_get_stats_string(vi, VIRTNET_Q_TYPE_CQ, 0, &p); + + for (i = 0; i < vi->curr_queue_pairs; ++i) + virtnet_get_stats_string(vi, VIRTNET_Q_TYPE_RX, i, &p); + + for (i = 0; i < vi->curr_queue_pairs; ++i) + virtnet_get_stats_string(vi, VIRTNET_Q_TYPE_TX, i, &p); break; } } @@ -3323,11 +4139,17 @@ static void virtnet_get_strings(struct net_device *dev, u32 stringset, u8 *data) static int virtnet_get_sset_count(struct net_device *dev, int sset) { struct virtnet_info *vi = netdev_priv(dev); + struct virtnet_stats_ctx ctx = {0}; + u32 pair_count; switch (sset) { case ETH_SS_STATS: - return vi->curr_queue_pairs * (VIRTNET_RQ_STATS_LEN + - VIRTNET_SQ_STATS_LEN); + virtnet_stats_ctx_init(vi, &ctx, NULL, false); + + pair_count = ctx.desc_num[VIRTNET_Q_TYPE_RX] + ctx.desc_num[VIRTNET_Q_TYPE_TX]; + + return pair_count + ctx.desc_num[VIRTNET_Q_TYPE_CQ] + + vi->curr_queue_pairs * pair_count; default: return -EOPNOTSUPP; } @@ -3337,40 +4159,32 @@ static void virtnet_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats, u64 *data) { struct virtnet_info *vi = netdev_priv(dev); - unsigned int idx = 0, start, i, j; + struct virtnet_stats_ctx ctx = {0}; + unsigned int start, i; const u8 *stats_base; - const u64_stats_t *p; - size_t offset; + + virtnet_stats_ctx_init(vi, &ctx, data, false); + if (virtnet_get_hw_stats(vi, &ctx, -1)) + dev_warn(&vi->dev->dev, "Failed to get hw stats.\n"); for (i = 0; i < vi->curr_queue_pairs; i++) { struct receive_queue *rq = &vi->rq[i]; + struct send_queue *sq = &vi->sq[i]; stats_base = (const u8 *)&rq->stats; do { start = u64_stats_fetch_begin(&rq->stats.syncp); - for (j = 0; j < VIRTNET_RQ_STATS_LEN; j++) { - offset = virtnet_rq_stats_desc[j].offset; - p = (const u64_stats_t *)(stats_base + offset); - data[idx + j] = u64_stats_read(p); - } + virtnet_fill_stats(vi, i * 2, &ctx, stats_base, true, 0); } while (u64_stats_fetch_retry(&rq->stats.syncp, start)); - idx += VIRTNET_RQ_STATS_LEN; - } - - for (i = 0; i < vi->curr_queue_pairs; i++) { - struct send_queue *sq = &vi->sq[i]; stats_base = (const u8 *)&sq->stats; do { start = u64_stats_fetch_begin(&sq->stats.syncp); - for (j = 0; j < VIRTNET_SQ_STATS_LEN; j++) { - offset = virtnet_sq_stats_desc[j].offset; - p = (const u64_stats_t *)(stats_base + offset); - data[idx + j] = u64_stats_read(p); - } + virtnet_fill_stats(vi, i * 2 + 1, &ctx, stats_base, true, 0); } while (u64_stats_fetch_retry(&sq->stats.syncp, start)); - idx += VIRTNET_SQ_STATS_LEN; } + + virtnet_fill_total_fields(vi, &ctx); } static void virtnet_get_channels(struct net_device *dev, @@ -3410,12 +4224,17 @@ static int virtnet_get_link_ksettings(struct net_device *dev, static int virtnet_send_tx_notf_coal_cmds(struct virtnet_info *vi, struct ethtool_coalesce *ec) { + struct virtio_net_ctrl_coal_tx *coal_tx __free(kfree) = NULL; struct scatterlist sgs_tx; int i; - vi->ctrl->coal_tx.tx_usecs = cpu_to_le32(ec->tx_coalesce_usecs); - vi->ctrl->coal_tx.tx_max_packets = cpu_to_le32(ec->tx_max_coalesced_frames); - sg_init_one(&sgs_tx, &vi->ctrl->coal_tx, sizeof(vi->ctrl->coal_tx)); + coal_tx = kzalloc(sizeof(*coal_tx), GFP_KERNEL); + if (!coal_tx) + return -ENOMEM; + + coal_tx->tx_usecs = cpu_to_le32(ec->tx_coalesce_usecs); + coal_tx->tx_max_packets = cpu_to_le32(ec->tx_max_coalesced_frames); + sg_init_one(&sgs_tx, coal_tx, sizeof(*coal_tx)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_TX_SET, @@ -3435,8 +4254,10 @@ static int virtnet_send_tx_notf_coal_cmds(struct virtnet_info *vi, static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, struct ethtool_coalesce *ec) { + struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; struct scatterlist sgs_rx; + int ret = 0; int i; if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) @@ -3446,11 +4267,21 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets)) return -EINVAL; + /* Acquire all queues dim_locks */ + for (i = 0; i < vi->max_queue_pairs; i++) + mutex_lock(&vi->rq[i].dim_lock); + if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { vi->rx_dim_enabled = true; for (i = 0; i < vi->max_queue_pairs; i++) vi->rq[i].dim_enabled = true; - return 0; + goto unlock; + } + + coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); + if (!coal_rx) { + ret = -ENOMEM; + goto unlock; } if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { @@ -3463,14 +4294,16 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, * we need apply the global new params even if they * are not updated. */ - vi->ctrl->coal_rx.rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs); - vi->ctrl->coal_rx.rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames); - sg_init_one(&sgs_rx, &vi->ctrl->coal_rx, sizeof(vi->ctrl->coal_rx)); + coal_rx->rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs); + coal_rx->rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames); + sg_init_one(&sgs_rx, coal_rx, sizeof(*coal_rx)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, - &sgs_rx)) - return -EINVAL; + &sgs_rx)) { + ret = -EINVAL; + goto unlock; + } vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs; vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames; @@ -3478,8 +4311,11 @@ static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi, vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs; vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames; } +unlock: + for (i = vi->max_queue_pairs - 1; i >= 0; i--) + mutex_unlock(&vi->rq[i].dim_lock); - return 0; + return ret; } static int virtnet_send_notf_coal_cmds(struct virtnet_info *vi, @@ -3503,19 +4339,24 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, u16 queue) { bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; - bool cur_rx_dim = vi->rq[queue].dim_enabled; u32 max_usecs, max_packets; + bool cur_rx_dim; int err; + mutex_lock(&vi->rq[queue].dim_lock); + cur_rx_dim = vi->rq[queue].dim_enabled; max_usecs = vi->rq[queue].intr_coal.max_usecs; max_packets = vi->rq[queue].intr_coal.max_packets; if (rx_ctrl_dim_on && (ec->rx_coalesce_usecs != max_usecs || - ec->rx_max_coalesced_frames != max_packets)) + ec->rx_max_coalesced_frames != max_packets)) { + mutex_unlock(&vi->rq[queue].dim_lock); return -EINVAL; + } if (rx_ctrl_dim_on && !cur_rx_dim) { vi->rq[queue].dim_enabled = true; + mutex_unlock(&vi->rq[queue].dim_lock); return 0; } @@ -3528,10 +4369,8 @@ static int virtnet_send_rx_notf_coal_vq_cmds(struct virtnet_info *vi, err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue, ec->rx_coalesce_usecs, ec->rx_max_coalesced_frames); - if (err) - return err; - - return 0; + mutex_unlock(&vi->rq[queue].dim_lock); + return err; } static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi, @@ -3561,39 +4400,27 @@ static void virtnet_rx_dim_work(struct work_struct *work) struct virtnet_info *vi = rq->vq->vdev->priv; struct net_device *dev = vi->dev; struct dim_cq_moder update_moder; - int i, qnum, err; + int qnum, err; - if (!rtnl_trylock()) - return; + qnum = rq - vi->rq; - /* Each rxq's work is queued by "net_dim()->schedule_work()" - * in response to NAPI traffic changes. Note that dim->profile_ix - * for each rxq is updated prior to the queuing action. - * So we only need to traverse and update profiles for all rxqs - * in the work which is holding rtnl_lock. - */ - for (i = 0; i < vi->curr_queue_pairs; i++) { - rq = &vi->rq[i]; - dim = &rq->dim; - qnum = rq - vi->rq; - - if (!rq->dim_enabled) - continue; - - update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); - if (update_moder.usec != rq->intr_coal.max_usecs || - update_moder.pkts != rq->intr_coal.max_packets) { - err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, - update_moder.usec, - update_moder.pkts); - if (err) - pr_debug("%s: Failed to send dim parameters on rxq%d\n", - dev->name, qnum); - dim->state = DIM_START_MEASURE; - } - } + mutex_lock(&rq->dim_lock); + if (!rq->dim_enabled) + goto out; - rtnl_unlock(); + update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix); + if (update_moder.usec != rq->intr_coal.max_usecs || + update_moder.pkts != rq->intr_coal.max_packets) { + err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum, + update_moder.usec, + update_moder.pkts); + if (err) + pr_debug("%s: Failed to send dim parameters on rxq%d\n", + dev->name, qnum); + dim->state = DIM_START_MEASURE; + } +out: + mutex_unlock(&rq->dim_lock); } static int virtnet_coal_params_supported(struct ethtool_coalesce *ec) @@ -3731,11 +4558,13 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev, return -EINVAL; if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) { + mutex_lock(&vi->rq[queue].dim_lock); ec->rx_coalesce_usecs = vi->rq[queue].intr_coal.max_usecs; ec->tx_coalesce_usecs = vi->sq[queue].intr_coal.max_usecs; ec->tx_max_coalesced_frames = vi->sq[queue].intr_coal.max_packets; ec->rx_max_coalesced_frames = vi->rq[queue].intr_coal.max_packets; ec->use_adaptive_rx_coalesce = vi->rq[queue].dim_enabled; + mutex_unlock(&vi->rq[queue].dim_lock); } else { ec->rx_max_coalesced_frames = 1; @@ -3791,11 +4620,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->ctrl->rss.indirection_table[i]; + rxfh->indir[i] = vi->rss.indirection_table[i]; } if (rxfh->key) - memcpy(rxfh->key, vi->ctrl->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -3819,7 +4648,7 @@ static int virtnet_set_rxfh(struct net_device *dev, return -EOPNOTSUPP; for (i = 0; i < vi->rss_indir_table_size; ++i) - vi->ctrl->rss.indirection_table[i] = rxfh->indir[i]; + vi->rss.indirection_table[i] = rxfh->indir[i]; update = true; } @@ -3831,7 +4660,7 @@ static int virtnet_set_rxfh(struct net_device *dev, if (!vi->has_rss && !vi->has_rss_hash_report) return -EOPNOTSUPP; - memcpy(vi->ctrl->rss.key, rxfh->key, vi->rss_key_size); + memcpy(vi->rss.key, rxfh->key, vi->rss_key_size); update = true; } @@ -3905,6 +4734,97 @@ static const struct ethtool_ops virtnet_ethtool_ops = { .set_rxnfc = virtnet_set_rxnfc, }; +static void virtnet_get_queue_stats_rx(struct net_device *dev, int i, + struct netdev_queue_stats_rx *stats) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct receive_queue *rq = &vi->rq[i]; + struct virtnet_stats_ctx ctx = {0}; + + virtnet_stats_ctx_init(vi, &ctx, (void *)stats, true); + + virtnet_get_hw_stats(vi, &ctx, i * 2); + virtnet_fill_stats(vi, i * 2, &ctx, (void *)&rq->stats, true, 0); +} + +static void virtnet_get_queue_stats_tx(struct net_device *dev, int i, + struct netdev_queue_stats_tx *stats) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct send_queue *sq = &vi->sq[i]; + struct virtnet_stats_ctx ctx = {0}; + + virtnet_stats_ctx_init(vi, &ctx, (void *)stats, true); + + virtnet_get_hw_stats(vi, &ctx, i * 2 + 1); + virtnet_fill_stats(vi, i * 2 + 1, &ctx, (void *)&sq->stats, true, 0); +} + +static void virtnet_get_base_stats(struct net_device *dev, + struct netdev_queue_stats_rx *rx, + struct netdev_queue_stats_tx *tx) +{ + struct virtnet_info *vi = netdev_priv(dev); + + /* The queue stats of the virtio-net will not be reset. So here we + * return 0. + */ + rx->bytes = 0; + rx->packets = 0; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_BASIC) { + rx->hw_drops = 0; + rx->hw_drop_overruns = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_CSUM) { + rx->csum_unnecessary = 0; + rx->csum_none = 0; + rx->csum_bad = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_GSO) { + rx->hw_gro_packets = 0; + rx->hw_gro_bytes = 0; + rx->hw_gro_wire_packets = 0; + rx->hw_gro_wire_bytes = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_RX_SPEED) + rx->hw_drop_ratelimits = 0; + + tx->bytes = 0; + tx->packets = 0; + tx->stop = 0; + tx->wake = 0; + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_BASIC) { + tx->hw_drops = 0; + tx->hw_drop_errors = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_CSUM) { + tx->csum_none = 0; + tx->needs_csum = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_GSO) { + tx->hw_gso_packets = 0; + tx->hw_gso_bytes = 0; + tx->hw_gso_wire_packets = 0; + tx->hw_gso_wire_bytes = 0; + } + + if (vi->device_stats_cap & VIRTIO_NET_STATS_TYPE_TX_SPEED) + tx->hw_drop_ratelimits = 0; +} + +static const struct netdev_stat_ops virtnet_stat_ops = { + .get_queue_stats_rx = virtnet_get_queue_stats_rx, + .get_queue_stats_tx = virtnet_get_queue_stats_tx, + .get_base_stats = virtnet_get_base_stats, +}; + static void virtnet_freeze_down(struct virtio_device *vdev) { struct virtnet_info *vi = vdev->priv; @@ -3951,10 +4871,16 @@ static int virtnet_restore_up(struct virtio_device *vdev) static int virtnet_set_guest_offloads(struct virtnet_info *vi, u64 offloads) { + __virtio64 *_offloads __free(kfree) = NULL; struct scatterlist sg; - vi->ctrl->offloads = cpu_to_virtio64(vi->vdev, offloads); - sg_init_one(&sg, &vi->ctrl->offloads, sizeof(vi->ctrl->offloads)); + _offloads = kzalloc(sizeof(*_offloads), GFP_KERNEL); + if (!_offloads) + return -ENOMEM; + + *_offloads = cpu_to_virtio64(vi->vdev, offloads); + + sg_init_one(&sg, _offloads, sizeof(*_offloads)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_GUEST_OFFLOADS, VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET, &sg)) { @@ -4054,7 +4980,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, synchronize_net(); } - err = _virtnet_set_queues(vi, curr_qp + xdp_qp); + err = virtnet_set_queues(vi, curr_qp + xdp_qp); if (err) goto err; netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp); @@ -4156,9 +5082,9 @@ static int virtnet_set_features(struct net_device *dev, if ((dev->features ^ features) & NETIF_F_RXHASH) { if (features & NETIF_F_RXHASH) - vi->ctrl->rss.hash_types = vi->rss_hash_types_saved; + vi->rss.hash_types = vi->rss_hash_types_saved; else - vi->ctrl->rss.hash_types = VIRTIO_NET_HASH_REPORT_NONE; + vi->rss.hash_types = VIRTIO_NET_HASH_REPORT_NONE; if (!virtnet_commit_rss_command(vi)) return -EINVAL; @@ -4287,7 +5213,7 @@ static void free_receive_page_frags(struct virtnet_info *vi) int i; for (i = 0; i < vi->max_queue_pairs; i++) if (vi->rq[i].alloc_frag.page) { - if (vi->rq[i].do_dma && vi->rq[i].last_dma) + if (vi->rq[i].last_dma) virtnet_rq_unmap(&vi->rq[i], vi->rq[i].last_dma, 0); put_page(vi->rq[i].alloc_frag.page); } @@ -4470,6 +5396,7 @@ static int virtnet_alloc_queues(struct virtnet_info *vi) u64_stats_init(&vi->rq[i].stats.syncp); u64_stats_init(&vi->sq[i].stats.syncp); + mutex_init(&vi->rq[i].dim_lock); } return 0; @@ -4637,6 +5564,48 @@ static void virtnet_set_big_packets(struct virtnet_info *vi, const int mtu) } } +#define VIRTIO_NET_HASH_REPORT_MAX_TABLE 10 +static enum xdp_rss_hash_type +virtnet_xdp_rss_type[VIRTIO_NET_HASH_REPORT_MAX_TABLE] = { + [VIRTIO_NET_HASH_REPORT_NONE] = XDP_RSS_TYPE_NONE, + [VIRTIO_NET_HASH_REPORT_IPv4] = XDP_RSS_TYPE_L3_IPV4, + [VIRTIO_NET_HASH_REPORT_TCPv4] = XDP_RSS_TYPE_L4_IPV4_TCP, + [VIRTIO_NET_HASH_REPORT_UDPv4] = XDP_RSS_TYPE_L4_IPV4_UDP, + [VIRTIO_NET_HASH_REPORT_IPv6] = XDP_RSS_TYPE_L3_IPV6, + [VIRTIO_NET_HASH_REPORT_TCPv6] = XDP_RSS_TYPE_L4_IPV6_TCP, + [VIRTIO_NET_HASH_REPORT_UDPv6] = XDP_RSS_TYPE_L4_IPV6_UDP, + [VIRTIO_NET_HASH_REPORT_IPv6_EX] = XDP_RSS_TYPE_L3_IPV6_EX, + [VIRTIO_NET_HASH_REPORT_TCPv6_EX] = XDP_RSS_TYPE_L4_IPV6_TCP_EX, + [VIRTIO_NET_HASH_REPORT_UDPv6_EX] = XDP_RSS_TYPE_L4_IPV6_UDP_EX +}; + +static int virtnet_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash, + enum xdp_rss_hash_type *rss_type) +{ + const struct xdp_buff *xdp = (void *)_ctx; + struct virtio_net_hdr_v1_hash *hdr_hash; + struct virtnet_info *vi; + u16 hash_report; + + if (!(xdp->rxq->dev->features & NETIF_F_RXHASH)) + return -ENODATA; + + vi = netdev_priv(xdp->rxq->dev); + hdr_hash = (struct virtio_net_hdr_v1_hash *)(xdp->data - vi->hdr_len); + hash_report = __le16_to_cpu(hdr_hash->hash_report); + + if (hash_report >= VIRTIO_NET_HASH_REPORT_MAX_TABLE) + hash_report = VIRTIO_NET_HASH_REPORT_NONE; + + *rss_type = virtnet_xdp_rss_type[hash_report]; + *hash = __le32_to_cpu(hdr_hash->hash_value); + return 0; +} + +static const struct xdp_metadata_ops virtnet_xdp_metadata_ops = { + .xmo_rx_hash = virtnet_xdp_rx_hash, +}; + static int virtnet_probe(struct virtio_device *vdev) { int i, err = -ENOMEM; @@ -4666,6 +5635,7 @@ static int virtnet_probe(struct virtio_device *vdev) dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE | IFF_TX_SKB_NO_LINEAR; dev->netdev_ops = &virtnet_netdev; + dev->stat_ops = &virtnet_stat_ops; dev->features = NETIF_F_HIGHDMA; dev->ethtool_ops = &virtnet_ethtool_ops; @@ -4765,6 +5735,7 @@ static int virtnet_probe(struct virtio_device *vdev) VIRTIO_NET_RSS_HASH_TYPE_UDP_EX); dev->hw_features |= NETIF_F_RXHASH; + dev->xdp_metadata_ops = &virtnet_xdp_metadata_ops; } if (vi->has_rss_hash_report) @@ -4782,6 +5753,8 @@ static int virtnet_probe(struct virtio_device *vdev) if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ)) vi->has_cvq = true; + mutex_init(&vi->cvq_lock); + if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) { mtu = virtio_cread16(vdev, offsetof(struct virtio_net_config, @@ -4873,7 +5846,7 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_device_ready(vdev); - _virtnet_set_queues(vi, vi->curr_queue_pairs); + virtnet_set_queues(vi, vi->curr_queue_pairs); /* a random MAC address has been assigned, notify the device. * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there @@ -4893,6 +5866,33 @@ static int virtnet_probe(struct virtio_device *vdev) } } + if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_DEVICE_STATS)) { + struct virtio_net_stats_capabilities *stats_cap __free(kfree) = NULL; + struct scatterlist sg; + __le64 v; + + stats_cap = kzalloc(sizeof(*stats_cap), GFP_KERNEL); + if (!stats_cap) { + rtnl_unlock(); + err = -ENOMEM; + goto free_unregister_netdev; + } + + sg_init_one(&sg, stats_cap, sizeof(*stats_cap)); + + if (!virtnet_send_command_reply(vi, VIRTIO_NET_CTRL_STATS, + VIRTIO_NET_CTRL_STATS_QUERY, + NULL, &sg)) { + pr_debug("virtio_net: fail to get stats capability\n"); + rtnl_unlock(); + err = -EINVAL; + goto free_unregister_netdev; + } + + v = stats_cap->supported_stats_types[0]; + vi->device_stats_cap = le64_to_cpu(v); + } + rtnl_unlock(); err = virtnet_cpu_notif_add(vi); @@ -5021,7 +6021,7 @@ static struct virtio_device_id id_table[] = { VIRTIO_NET_F_SPEED_DUPLEX, VIRTIO_NET_F_STANDBY, \ VIRTIO_NET_F_RSS, VIRTIO_NET_F_HASH_REPORT, VIRTIO_NET_F_NOTF_COAL, \ VIRTIO_NET_F_VQ_NOTF_COAL, \ - VIRTIO_NET_F_GUEST_HDRLEN + VIRTIO_NET_F_GUEST_HDRLEN, VIRTIO_NET_F_DEVICE_STATS static unsigned int features[] = { VIRTNET_FEATURES, |