author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-06-30 15:51:09 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-06-30 15:51:09 -0700 |
commit | dbe69e43372212527abf48609aba7fc39a6daa27 (patch) | |
tree | 96cfafdf70f5325ceeac1054daf7deca339c9730 /Documentation/networking/dsa | |
parent | a6eaf3850cb171c328a8b0db6d3c79286a1eba9d (diff) | |
parent | b6df00789e2831fff7a2c65aa7164b2a4dcbe599 (diff) |
Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improve performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsockopt to retrieve netns cookie

   - ip: treat lowest address of an IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: Packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
Diffstat (limited to 'Documentation/networking/dsa')
-rw-r--r-- | Documentation/networking/dsa/configuration.rst | 68 |
-rw-r--r-- | Documentation/networking/dsa/dsa.rst | 21 |
-rw-r--r-- | Documentation/networking/dsa/sja1105.rst | 61 |
3 files changed, 136 insertions, 14 deletions
diff --git a/Documentation/networking/dsa/configuration.rst b/Documentation/networking/dsa/configuration.rst
index 774f0e76c746..2b08f1a772d3 100644
--- a/Documentation/networking/dsa/configuration.rst
+++ b/Documentation/networking/dsa/configuration.rst
@@ -292,3 +292,71 @@ configuration.
 
     # bring up the bridge devices
     ip link set br0 up
+
+Forwarding database (FDB) management
+------------------------------------
+
+The existing DSA switches do not have the necessary hardware support to keep
+the software FDB of the bridge in sync with the hardware tables, so the two
+tables are managed separately (``bridge fdb show`` queries both, and depending
+on whether the ``self`` or ``master`` flags are being used, a ``bridge fdb
+add`` or ``bridge fdb del`` command acts upon entries from one or both tables).
+
+Up until kernel v4.14, DSA only supported user space management of bridge FDB
+entries using the bridge bypass operations (which do not update the software
+FDB, just the hardware one) using the ``self`` flag (which is optional and can
+be omitted).
+
+  .. code-block:: sh
+
+    bridge fdb add dev swp0 00:01:02:03:04:05 self static
+    # or shorthand
+    bridge fdb add dev swp0 00:01:02:03:04:05 static
+
+Due to a bug, the bridge bypass FDB implementation provided by DSA did not
+distinguish between ``static`` and ``local`` FDB entries (``static`` are meant
+to be forwarded, while ``local`` are meant to be locally terminated, i.e. sent
+to the host port). Instead, all FDB entries with the ``self`` flag (implicit or
+explicit) are treated by DSA as ``static`` even if they are ``local``.
+
+  .. code-block:: sh
+
+    # This command:
+    bridge fdb add dev swp0 00:01:02:03:04:05 static
+    # behaves the same for DSA as this command:
+    bridge fdb add dev swp0 00:01:02:03:04:05 local
+    # or shorthand, because the 'local' flag is implicit if 'static' is not
+    # specified, it also behaves the same as:
+    bridge fdb add dev swp0 00:01:02:03:04:05
+
+The last command is an incorrect way of adding a static bridge FDB entry to a
+DSA switch using the bridge bypass operations, and works by mistake. Other
+drivers will treat an FDB entry added by the same command as ``local`` and as
+such, will not forward it, as opposed to DSA.
+
+Between kernel v4.14 and v5.14, DSA has supported in parallel two modes of
+adding a bridge FDB entry to the switch: the bridge bypass discussed above, as
+well as a new mode using the ``master`` flag which installs FDB entries in the
+software bridge too.
+
+  .. code-block:: sh
+
+    bridge fdb add dev swp0 00:01:02:03:04:05 master static
+
+Since kernel v5.14, DSA has gained stronger integration with the bridge's
+software FDB, and the support for its bridge bypass FDB implementation (using
+the ``self`` flag) has been removed. This results in the following changes:
+
+  .. code-block:: sh
+
+    # This is the only valid way of adding an FDB entry that is supported,
+    # compatible with v4.14 kernels and later:
+    bridge fdb add dev swp0 00:01:02:03:04:05 master static
+    # This command is no longer buggy and the entry is properly treated as
+    # 'local' instead of being forwarded:
+    bridge fdb add dev swp0 00:01:02:03:04:05
+    # This command no longer installs a static FDB entry to hardware:
+    bridge fdb add dev swp0 00:01:02:03:04:05 static
+
+Script writers are therefore encouraged to use the ``master static`` set of
+flags when working with bridge FDB entries on DSA switch interfaces.
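The recommendation that closes the configuration.rst change can be sketched as a small shell helper. This is a minimal sketch and not part of the patch: the function name `fdb_add_cmd` is hypothetical, and it only prints the command instead of running it, so it can be tried on a machine that has neither a DSA switch nor the `bridge` utility.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the 'bridge fdb add'
# command that the documentation recommends for DSA user ports.
# $1 = DSA user port name (e.g. swp0), $2 = MAC address
fdb_add_cmd() {
    port=$1
    mac=$2
    # 'master static' goes through the bridge's software FDB; per the
    # documentation it is the form supported on v4.14 kernels and later.
    printf 'bridge fdb add dev %s %s master static\n' "$port" "$mac"
}

fdb_add_cmd swp0 00:01:02:03:04:05
```

To apply the entry instead of printing it, the output could be passed to `sh` on a system where `swp0` is a bridged DSA user port; that step is left out here because it needs real hardware.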
diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 8688009514cc..20baacf2bc5c 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -93,14 +93,15 @@ A tagging protocol may tag all packets with switch tags of the same length, or
 the tag length might vary (for example packets with PTP timestamps might
 require an extended switch tag, or there might be one tag length on TX and a
 different one on RX). Either way, the tagging protocol driver must populate the
-``struct dsa_device_ops::overhead`` with the length in octets of the longest
-switch frame header. The DSA framework will automatically adjust the MTU of the
-master interface to accomodate for this extra size in order for DSA user ports
-to support the standard MTU (L2 payload length) of 1500 octets. The ``overhead``
-is also used to request from the network stack, on a best-effort basis, the
-allocation of packets with a ``needed_headroom`` or ``needed_tailroom``
-sufficient such that the act of pushing the switch tag on transmission of a
-packet does not cause it to reallocate due to lack of memory.
+``struct dsa_device_ops::needed_headroom`` and/or ``struct dsa_device_ops::needed_tailroom``
+with the length in octets of the longest switch frame header/trailer. The DSA
+framework will automatically adjust the MTU of the master interface to
+accommodate for this extra size in order for DSA user ports to support the
+standard MTU (L2 payload length) of 1500 octets. The ``needed_headroom`` and
+``needed_tailroom`` properties are also used to request from the network stack,
+on a best-effort basis, the allocation of packets with enough extra space such
+that the act of pushing the switch tag on transmission of a packet does not
+cause it to reallocate due to lack of memory.
 
 Even though applications are not expected to parse DSA-specific frame headers,
 the format on the wire of the tagging protocol represents an Application Binary
@@ -169,8 +170,8 @@ The job of this method is to prepare the skb in a way that the switch will
 understand what egress port the packet is for (and not deliver it towards other
 ports). Typically this is fulfilled by pushing a frame header. Checking for
 insufficient size in the skb headroom or tailroom is unnecessary provided that
-the ``overhead`` and ``tail_tag`` properties were filled out properly, because
-DSA ensures there is enough space before calling this method.
+the ``needed_headroom`` and ``needed_tailroom`` properties were filled out
+properly, because DSA ensures there is enough space before calling this method.
 
 The reception of a packet goes through the tagger's ``rcv`` function. The
 passed ``struct sk_buff *skb`` has ``skb->data`` pointing at
diff --git a/Documentation/networking/dsa/sja1105.rst b/Documentation/networking/dsa/sja1105.rst
index 7395a33baaf9..da4057ba37f1 100644
--- a/Documentation/networking/dsa/sja1105.rst
+++ b/Documentation/networking/dsa/sja1105.rst
@@ -5,7 +5,7 @@ NXP SJA1105 switch driver
 Overview
 ========
 
-The NXP SJA1105 is a family of 6 devices:
+The NXP SJA1105 is a family of 10 SPI-managed automotive switches:
 
 - SJA1105E: First generation, no TTEthernet
 - SJA1105T: First generation, TTEthernet
@@ -13,9 +13,11 @@ The NXP SJA1105 is a family of 6 devices:
 - SJA1105Q: Second generation, TTEthernet, no SGMII
 - SJA1105R: Second generation, no TTEthernet, SGMII
 - SJA1105S: Second generation, TTEthernet, SGMII
-
-These are SPI-managed automotive switches, with all ports being gigabit
-capable, and supporting MII/RMII/RGMII and optionally SGMII on one port.
+- SJA1110A: Third generation, TTEthernet, SGMII, integrated 100base-T1 and
+  100base-TX PHYs
+- SJA1110B: Third generation, TTEthernet, SGMII, 100base-T1, 100base-TX
+- SJA1110C: Third generation, TTEthernet, SGMII, 100base-T1, 100base-TX
+- SJA1110D: Third generation, TTEthernet, SGMII, 100base-T1
 
 Being automotive parts, their configuration interface is geared towards
 set-and-forget use, with minimal dynamic interaction at runtime. They
@@ -579,3 +581,54 @@ A board would need to hook up the PHYs connected to the switch to any other MDIO
 bus available to Linux within the system (e.g. to the DSA master's MDIO bus).
 Link state management then works by the driver manually keeping in sync (over
 SPI commands) the MAC link speed with the settings negotiated by the PHY.
+
+By comparison, the SJA1110 supports an MDIO slave access point over which its
+internal 100base-T1 PHYs can be accessed from the host. This is, however, not
+used by the driver, instead the internal 100base-T1 and 100base-TX PHYs are
+accessed through SPI commands, modeled in Linux as virtual MDIO buses.
+
+The microcontroller attached to the SJA1110 port 0 also has an MDIO controller
+operating in master mode, however the driver does not support this either,
+since the microcontroller gets disabled when the Linux driver operates.
+Discrete PHYs connected to the switch ports should have their MDIO interface
+attached to an MDIO controller from the host system and not to the switch,
+similar to SJA1105.
+
+Port compatibility matrix
+-------------------------
+
+The SJA1105 port compatibility matrix is:
+
+=====  ==============  ==============  ==============
+Port   SJA1105E/T      SJA1105P/Q      SJA1105R/S
+=====  ==============  ==============  ==============
+0      xMII            xMII            xMII
+1      xMII            xMII            xMII
+2      xMII            xMII            xMII
+3      xMII            xMII            xMII
+4      xMII            xMII            SGMII
+=====  ==============  ==============  ==============
+
+
+The SJA1110 port compatibility matrix is:
+
+=====  ==============  ==============  ==============  ==============
+Port   SJA1110A        SJA1110B        SJA1110C        SJA1110D
+=====  ==============  ==============  ==============  ==============
+0      RevMII (uC)     RevMII (uC)     RevMII (uC)     RevMII (uC)
+1      100base-TX      100base-TX      100base-TX
+       or SGMII                                        SGMII
+2      xMII            xMII            xMII            xMII
+       or SGMII                                        or SGMII
+3      xMII            xMII            xMII
+       or SGMII        or SGMII                        SGMII
+       or 2500base-X   or 2500base-X                   or 2500base-X
+4      SGMII           SGMII           SGMII           SGMII
+       or 2500base-X   or 2500base-X   or 2500base-X   or 2500base-X
+5      100base-T1      100base-T1      100base-T1      100base-T1
+6      100base-T1      100base-T1      100base-T1      100base-T1
+7      100base-T1      100base-T1      100base-T1      100base-T1
+8      100base-T1      100base-T1      n/a             n/a
+9      100base-T1      100base-T1      n/a             n/a
+10     100base-T1      n/a             n/a             n/a
+=====  ==============  ==============  ==============  ==============
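
The master MTU adjustment described in the dsa.rst change can be illustrated with simple arithmetic. This is a sketch under stated assumptions: it takes the extra size to be the sum of `needed_headroom` and `needed_tailroom`, the 4-octet header and 0-octet trailer are hypothetical values not taken from any particular tagger, and `master_mtu` is a made-up helper name. The kernel performs this adjustment itself; nothing here configures a device.

```shell
#!/bin/sh
# Hypothetical illustration: compute the master interface MTU needed so that
# DSA user ports keep the standard 1500-octet L2 payload.
# $1 = needed_headroom in octets, $2 = needed_tailroom in octets
master_mtu() {
    user_mtu=1500
    # Assumption: the tag overhead is headroom plus tailroom.
    echo $((user_mtu + $1 + $2))
}

master_mtu 4 0   # assumed 4-octet header tag, no trailer
```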