Diffstat (limited to 'Documentation/networking')
-rw-r--r--  Documentation/networking/filter.txt        7
-rw-r--r--  Documentation/networking/i40e.txt         72
-rw-r--r--  Documentation/networking/ip-sysctl.txt    45
-rw-r--r--  Documentation/networking/ipvs-sysctl.txt  68
-rw-r--r--  Documentation/networking/mpls-sysctl.txt  19
-rw-r--r--  Documentation/networking/switchdev.txt    70
6 files changed, 226 insertions, 55 deletions
diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 683ada5ad81d..b69b205501de 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -595,10 +595,9 @@ got from bpf_prog_create(), and 'ctx' the given context (e.g.
skb pointer). All constraints and restrictions from bpf_check_classic() apply
before a conversion to the new layout is being done behind the scenes!
-Currently, the classic BPF format is being used for JITing on most of the
-architectures. x86-64, aarch64 and s390x perform JIT compilation from eBPF
-instruction set, however, future work will migrate other JIT compilers as well,
-so that they will profit from the very same benefits.
+Currently, the classic BPF format is being used for JITing on most 32-bit
+architectures, whereas x86-64, aarch64, s390x, powerpc64 and sparc64 perform
+JIT compilation from the eBPF instruction set.
Some core changes of the new internal format:
diff --git a/Documentation/networking/i40e.txt b/Documentation/networking/i40e.txt
index a251bf4fe9c9..57e616ed10b0 100644
--- a/Documentation/networking/i40e.txt
+++ b/Documentation/networking/i40e.txt
@@ -63,6 +63,78 @@ Additional Configurations
The latest release of ethtool can be found from
https://www.kernel.org/pub/software/network/ethtool
+
+ Flow Director n-tuple traffic filters (FDir)
+ --------------------------------------------
+ The driver utilizes the ethtool interface for configuring ntuple filters,
+ via "ethtool -N <device> <filter>".
+
+ The sctp4, ip4, udp4, and tcp4 flow types are supported with the standard
+ fields including src-ip, dst-ip, src-port and dst-port. The driver only
+ supports fully enabling or fully masking the fields, so use of the mask
+ fields for partial matches is not supported.
+
+ Additionally, the driver supports using the action to specify filters for a
+ Virtual Function. You can specify the action as a 64-bit value, where the
+ lower 32 bits represent the queue number, while the next 8 bits represent
+ which VF. Note that 0 is the PF, so the VF identifier is offset by 1. For
+ example:
+
+ ... action 0x800000002 ...
+
+ would direct traffic to Virtual Function 7 (8 minus 1), on queue 2 of that
+ VF.
+
+ The driver also supports using the user-defined field to specify 2 bytes of
+ arbitrary data to match within the packet payload in addition to the regular
+ fields. The data is specified in the lower 32 bits of the user-def field in
+ the following way:
+
+ +----------------------------+---------------------------+
+ | 31 28 24 20 16 | 15 12 8 4 0|
+ +----------------------------+---------------------------+
+ | offset into packet payload | 2 bytes of flexible data |
+ +----------------------------+---------------------------+
+
+ As an example,
+
+ ... user-def 0x4FFFF ....
+
+ means to match the value 0xFFFF 4 bytes into the packet payload. Note that
+ the offset is based on the beginning of the payload, and not the beginning
+ of the packet. Thus
+
+ flow-type tcp4 ... user-def 0x8BEAF ....
+
+ would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the
+ TCP/IPv4 payload.
+
+ For ICMP, the hardware parses the ICMP header as 4 bytes of header and 4
+ bytes of payload, so if you want to match an ICMP frame's payload you may
+ need to add 4 to the offset in order to match the data.
+
+ Furthermore, the offset can only be up to a value of 64, as the hardware
+ will only read up to 64 bytes of data from the payload. It must also be even
+ as the flexible data is 2 bytes long and must be aligned to byte 0 of the
+ packet payload.
+
+ When programming filters, the hardware is limited to using a single input
+ set for each flow type. This means that it is an error to program two
+ different filters with the same type that don't match on the same fields.
+ Thus the second of the following two commands will fail:
+
+ ethtool -N <device> flow-type tcp4 src-ip 192.168.0.7 action 5
+ ethtool -N <device> flow-type tcp4 dst-ip 192.168.15.18 action 1
+
+ This is because the first filter will be accepted and reprogram the input
+ set for TCPv4 filters, but the second filter will be unable to reprogram the
+ input set until all the conflicting TCPv4 filters are first removed.
+
+ Note that the user-defined flexible offset is also considered part of the
+ input set and cannot be programmed separately for multiple filters of the
+ same type. However, the flexible data is not part of the input set and
+ multiple filters may use the same offset but match against different data.
+
Data Center Bridging (DCB)
--------------------------
DCB configuration is not currently supported.
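
Reviewer note on the i40e hunk above: the two bit layouts it describes (the 64-bit action and the 32-bit user-def value) can be sketched as below. This is only an illustration of the arithmetic in the added text, not driver code; the helper names encode_action, decode_action and encode_user_def are hypothetical and do not exist in ethtool or the driver.

```python
from typing import Optional, Tuple


def encode_action(queue: int, vf: Optional[int] = None) -> int:
    """Pack a queue number and an optional VF index into an action value.

    Lower 32 bits: queue. Next 8 bits: 0 for the PF, otherwise VF index + 1.
    """
    if not 0 <= queue < 2**32:
        raise ValueError("queue must fit in 32 bits")
    vf_field = 0 if vf is None else vf + 1
    if not 0 <= vf_field < 2**8:
        raise ValueError("VF identifier must fit in 8 bits")
    return (vf_field << 32) | queue


def decode_action(action: int) -> Tuple[int, Optional[int]]:
    """Return (queue, vf), where vf is None when the action targets the PF."""
    queue = action & 0xFFFFFFFF
    vf_field = (action >> 32) & 0xFF
    return queue, (None if vf_field == 0 else vf_field - 1)


def encode_user_def(offset: int, data: int) -> int:
    """Pack a payload offset (upper 16 bits) and 2 bytes of data (lower 16)."""
    if not 0 <= offset <= 64 or offset % 2:
        raise ValueError("offset must be even and at most 64")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("flexible data is 2 bytes")
    return (offset << 16) | data


if __name__ == "__main__":
    print(hex(encode_action(queue=2, vf=7)))
    print(hex(encode_user_def(offset=8, data=0xBEAF)))
```

The values used in the documentation's own examples (0x800000002, 0x4FFFF, 0x8BEAF) round-trip through these helpers.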
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index ab0230461377..974ab47ae53a 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -73,6 +73,14 @@ fib_multipath_use_neigh - BOOLEAN
0 - disabled
1 - enabled
+fib_multipath_hash_policy - INTEGER
+ Controls which hash policy to use for multipath routes. Only valid
+ for kernels built with CONFIG_IP_ROUTE_MULTIPATH enabled.
+ Default: 0 (Layer 3)
+ Possible values:
+ 0 - Layer 3
+ 1 - Layer 4
+
route/max_size - INTEGER
Maximum number of routes allowed in the kernel. Increase
this when using large numbers of interfaces and/or routes.
@@ -594,6 +602,14 @@ tcp_fastopen - INTEGER
Note that that additional client or server features are only
effective if the basic support (0x1 and 0x2) are enabled respectively.
+tcp_fastopen_blackhole_timeout_sec - INTEGER
+ Initial time period in seconds to disable Fastopen on active TCP sockets
+ when a TFO firewall blackhole issue happens.
+ This time period will grow exponentially when more blackhole issues
+ get detected right after Fastopen is re-enabled and will reset to
+ initial value when the blackhole issue goes away.
+ By default, it is set to 1hr.
+
tcp_syn_retries - INTEGER
Number of times initial SYNs for an active TCP connection attempt
will be retransmitted. Should not be higher than 127. Default value
@@ -640,11 +656,6 @@ tcp_tso_win_divisor - INTEGER
building larger TSO frames.
Default: 3
-tcp_tw_recycle - BOOLEAN
- Enable fast recycling TIME-WAIT sockets. Default value is 0.
- It should not be changed without advice/request of technical
- experts.
-
tcp_tw_reuse - BOOLEAN
Allow to reuse TIME-WAIT sockets for new connections when it is
safe from protocol viewpoint. Default value is 0.
@@ -853,12 +864,21 @@ ip_dynaddr - BOOLEAN
ip_early_demux - BOOLEAN
Optimize input packet processing down to one demux for
certain kinds of local sockets. Currently we only do this
- for established TCP sockets.
+ for established TCP and connected UDP sockets.
It may add an additional cost for pure routing workloads that
reduces overall throughput, in such case you should disable it.
Default: 1
+tcp_early_demux - BOOLEAN
+ Enable early demux for established TCP sockets.
+ Default: 1
+
+udp_early_demux - BOOLEAN
+ Enable early demux for connected UDP sockets. Disable this if
+ your system could experience more unconnected load.
+ Default: 1
+
icmp_echo_ignore_all - BOOLEAN
If set non-zero, then the kernel will ignore all ICMP ECHO
requests sent to it.
@@ -1458,11 +1478,20 @@ accept_ra_pinfo - BOOLEAN
Functional default: enabled if accept_ra is enabled.
disabled if accept_ra is disabled.
+accept_ra_rt_info_min_plen - INTEGER
+ Minimum prefix length of Route Information in RA.
+
+ Route Information w/ prefix smaller than this variable shall
+ be ignored.
+
+ Functional default: 0 if accept_ra_rtr_pref is enabled.
+ -1 if accept_ra_rtr_pref is disabled.
+
accept_ra_rt_info_max_plen - INTEGER
Maximum prefix length of Route Information in RA.
- Route Information w/ prefix larger than or equal to this
- variable shall be ignored.
+ Route Information w/ prefix larger than this variable shall
+ be ignored.
Functional default: 0 if accept_ra_rtr_pref is enabled.
-1 if accept_ra_rtr_pref is disabled.
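
Reviewer note on the accept_ra_rt_info_{min,max}_plen hunk above: with the corrected wording, both bounds are inclusive. A minimal sketch of the resulting check follows; the function name rt_info_accepted is hypothetical and the code only restates the documented rule, it is not taken from the kernel source.

```python
def rt_info_accepted(plen: int, min_plen: int, max_plen: int) -> bool:
    """Return True if a Route Information option with this prefix length is kept.

    Prefixes shorter than min_plen or longer than max_plen are ignored,
    so both bounds are inclusive.
    """
    return min_plen <= plen <= max_plen


if __name__ == "__main__":
    # A /64 passes when the window is [48, 64]; a /65 does not.
    print(rt_info_accepted(64, 48, 64), rt_info_accepted(65, 48, 64))
```

Note that with the old wording ("larger than or equal to"), a prefix length equal to the maximum would have been described as ignored.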
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index e6b1c025fdd8..056898685d40 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -175,6 +175,14 @@ nat_icmp_send - BOOLEAN
for VS/NAT when the load balancer receives packets from real
servers but the connection entries don't exist.
+pmtu_disc - BOOLEAN
+ 0 - disabled
+ not 0 - enabled (default)
+
+ By default, reject with FRAG_NEEDED all DF packets that exceed
+ the PMTU, irrespective of the forwarding method. For TUN method
+ the flag can be disabled to fragment such packets.
+
secure_tcp - INTEGER
0 - disabled (default)
@@ -185,15 +193,59 @@ secure_tcp - INTEGER
The value definition is the same as that of drop_entry and
drop_packet.
-sync_threshold - INTEGER
- default 3
+sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
+ default 3 50
+
+ It sets synchronization threshold, which is the minimum number
+ of incoming packets that a connection needs to receive before
+ the connection will be synchronized. A connection will be
+ synchronized, every time the number of its incoming packets
+ modulus sync_period equals the threshold. The range of the
+ threshold is from 0 to sync_period.
+
+ When sync_period and sync_refresh_period are 0, send sync only
+ for state changes or only once when pkts matches sync_threshold.
+
+sync_refresh_period - UNSIGNED INTEGER
+ default 0
+
+ In seconds, difference in reported connection timer that triggers
+ new sync message. It can be used to avoid sync messages for the
+ specified period (or half of the connection timeout if it is lower)
+ if connection state is not changed since last sync.
+
+ This is useful for normal connections with high traffic to reduce
+ sync rate. Additionally, retry sync_retries times with period of
+ sync_refresh_period/8.
+
+sync_retries - INTEGER
+ default 0
+
+ Defines sync retries with period of sync_refresh_period/8. Useful
+ to protect against loss of sync messages. The range of the
+ sync_retries is from 0 to 3.
+
+sync_qlen_max - UNSIGNED LONG
+
+ Hard limit for queued sync messages that are not sent yet. It
+ defaults to 1/32 of the memory pages but actually represents
+ number of messages. It will protect us from allocating large
+ parts of memory when the sending rate is lower than the queuing
+ rate.
+
+sync_sock_size - INTEGER
+ default 0
+
+ Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
+ Default value is 0 (preserve system defaults).
+
+sync_ports - INTEGER
+ default 1
- It sets synchronization threshold, which is the minimum number
- of incoming packets that a connection needs to receive before
- the connection will be synchronized. A connection will be
- synchronized, every time the number of its incoming packets
- modulus 50 equals the threshold. The range of the threshold is
- from 0 to 49.
+ The number of threads that master and backup servers can use for
+ sync traffic. Every thread will use single UDP port, thread 0 will
+ use the default port 8848 while last thread will use port
+ 8848+sync_ports-1.
snat_reroute - BOOLEAN
0 - disabled
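
Reviewer note on the ipvs sync hunks above: the threshold/period rule and the per-thread port assignment are both simple arithmetic, sketched below. The function names are hypothetical, the case where sync_period is 0 is deliberately not modeled, and nothing here is taken from the IPVS implementation.

```python
from typing import List

DEFAULT_SYNC_PORT = 8848  # default port named in the sync_ports text


def should_sync(in_pkts: int, threshold: int = 3, period: int = 50) -> bool:
    """Apply the documented rule: sync when in_pkts modulus period equals threshold.

    Only the periodic case (period > 0) is covered in this sketch.
    """
    if period <= 0:
        raise ValueError("this sketch only covers sync_period > 0")
    return in_pkts % period == threshold


def sync_thread_ports(sync_ports: int) -> List[int]:
    """List the UDP port per sync thread: thread 0 uses 8848, the last uses 8848+sync_ports-1."""
    if sync_ports < 1:
        raise ValueError("sync_ports must be at least 1")
    return [DEFAULT_SYNC_PORT + thread for thread in range(sync_ports)]


if __name__ == "__main__":
    print([n for n in range(1, 120) if should_sync(n)])
    print(sync_thread_ports(4))
```

With the defaults "3 50", a connection would be synchronized at its 3rd, 53rd, 103rd, ... incoming packet.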
diff --git a/Documentation/networking/mpls-sysctl.txt b/Documentation/networking/mpls-sysctl.txt
index 15d8d16934fd..2f24a1912a48 100644
--- a/Documentation/networking/mpls-sysctl.txt
+++ b/Documentation/networking/mpls-sysctl.txt
@@ -19,6 +19,25 @@ platform_labels - INTEGER
Possible values: 0 - 1048575
Default: 0
+ip_ttl_propagate - BOOL
+ Control whether TTL is propagated from the IPv4/IPv6 header to
+ the MPLS header on imposing labels and propagated from the
+ MPLS header to the IPv4/IPv6 header on popping the last label.
+
+ If disabled, the MPLS transport network will appear as a
+ single hop to transit traffic.
+
+ 0 - disabled / RFC 3443 [Short] Pipe Model
+ 1 - enabled / RFC 3443 Uniform Model (default)
+
+default_ttl - INTEGER
+ Default TTL value to use for MPLS packets where it cannot be
+ propagated from an IP header, either because one isn't present
+ or ip_ttl_propagate has been disabled.
+
+ Possible values: 1 - 255
+ Default: 255
+
conf/<interface>/input - BOOL
Control whether packets can be input on this interface.
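
Reviewer note on the mpls-sysctl hunk above: the interaction between ip_ttl_propagate and default_ttl when a label is imposed can be sketched as follows. The function name imposed_mpls_ttl is hypothetical, and the sketch only selects the source of the TTL; any per-hop decrement is intentionally left out because the added text does not describe it.

```python
from typing import Optional


def imposed_mpls_ttl(ip_ttl: Optional[int],
                     ip_ttl_propagate: bool = True,
                     default_ttl: int = 255) -> int:
    """Pick the TTL source for a newly imposed MPLS label.

    ip_ttl is None when the payload has no IP header. The IP TTL is used
    only when propagation is enabled and an IP header is present; otherwise
    default_ttl applies.
    """
    if not 1 <= default_ttl <= 255:
        raise ValueError("default_ttl must be in 1..255")
    if ip_ttl_propagate and ip_ttl is not None:
        return ip_ttl
    return default_ttl


if __name__ == "__main__":
    print(imposed_mpls_ttl(64))                          # uniform model
    print(imposed_mpls_ttl(64, ip_ttl_propagate=False))  # pipe model
    print(imposed_mpls_ttl(None))                        # no IP header
```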
diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
index 2bbac05ab9e2..3e7b946dea27 100644
--- a/Documentation/networking/switchdev.txt
+++ b/Documentation/networking/switchdev.txt
@@ -13,43 +13,43 @@ an example setup using a data-center-class switch ASIC chip. Other setups
with SR-IOV or soft switches, such as OVS, are possible.
-                             User-space tools                                 
-                                                                              
-       user space                   |                                         
-      +-------------------------------------------------------------------+   
-       kernel                       | Netlink                                 
-                                    |                                         
-                     +--------------+-------------------------------+         
-                     |         Network stack                        |         
-                     |           (Linux)                            |         
-                     |                                              |         
-                     +----------------------------------------------+         
-                                                                              
+                             User-space tools
+
+       user space                   |
+      +-------------------------------------------------------------------+
+       kernel                       | Netlink
+                                    |
+                     +--------------+-------------------------------+
+                     |         Network stack                        |
+                     |           (Linux)                            |
+                     |                                              |
+                     +----------------------------------------------+
+
sw1p2 sw1p4 sw1p6
-                      sw1p1  + sw1p3 +  sw1p5 +         eth1             
-                        +    |    +    |    +    |            +               
-                        |    |    |    |    |    |            |               
-                     +--+----+----+----+-+--+----+---+  +-----+-----+         
-                     |         Switch driver         |  |    mgmt   |         
-                     |        (this document)        |  |   driver  |         
-                     |                               |  |           |         
-                     +--------------+----------------+  +-----------+         
-                                    |                                         
-       kernel                       | HW bus (eg PCI)                         
-      +-------------------------------------------------------------------+   
-       hardware                     |                                         
-                     +--------------+---+------------+                        
-                     |         Switch device (sw1)   |                        
-                     |  +----+                       +--------+               
-                     |  |    v offloaded data path   | mgmt port              
-                     |  |    |                       |                        
-                     +--|----|----+----+----+----+---+                        
-                        |    |    |    |    |    |                            
-                        +    +    +    +    +    +                            
+                      sw1p1  + sw1p3 +  sw1p5 +         eth1
+                        +    |    +    |    +    |            +
+                        |    |    |    |    |    |            |
+                     +--+----+----+----+-+--+----+---+  +-----+-----+
+                     |         Switch driver         |  |    mgmt   |
+                     |        (this document)        |  |   driver  |
+                     |                               |  |           |
+                     +--------------+----------------+  +-----------+
+                                    |
+       kernel                       | HW bus (eg PCI)
+      +-------------------------------------------------------------------+
+       hardware                     |
+                     +--------------+---+------------+
+                     |         Switch device (sw1)   |
+                     |  +----+                       +--------+
+                     |  |    v offloaded data path   | mgmt port
+                     |  |    |                       |
+                     +--|----|----+----+----+----+---+
+                        |    |    |    |    |    |
+                        +    +    +    +    +    +
                       p1   p2   p3   p4   p5   p6
-                                       
-                             front-panel ports                                
-                                                                              
+
+                             front-panel ports
+
Fig 1.