summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/checksum-offloads.txt11
-rw-r--r--Documentation/networking/conf.py10
-rw-r--r--Documentation/networking/dns_resolver.txt2
-rw-r--r--Documentation/networking/i40evf.txt23
-rw-r--r--Documentation/networking/index.rst18
-rw-r--r--Documentation/networking/ipvlan.txt4
-rw-r--r--Documentation/networking/kapi.rst147
-rw-r--r--Documentation/networking/phy.txt1
-rw-r--r--Documentation/networking/policy-routing.txt150
-rw-r--r--Documentation/networking/rxrpc.txt111
-rw-r--r--Documentation/networking/segmentation-offloads.txt2
-rw-r--r--Documentation/networking/timestamping.txt32
-rw-r--r--Documentation/networking/tls.txt135
-rw-r--r--Documentation/networking/z8530book.rst256
14 files changed, 722 insertions, 180 deletions
diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
index 56e36861245f..d52d191bbb0c 100644
--- a/Documentation/networking/checksum-offloads.txt
+++ b/Documentation/networking/checksum-offloads.txt
@@ -35,6 +35,9 @@ This interface only allows a single checksum to be offloaded. Where
encapsulation is used, the packet may have multiple checksum fields in
different header layers, and the rest will have to be handled by another
mechanism such as LCO or RCO.
+CRC32c can also be offloaded using this interface, by means of filling
+ skb->csum_start and skb->csum_offset as described above, and setting
+ skb->csum_not_inet: see skbuff.h comment (section 'D') for more details.
No offloading of the IP header checksum is performed; it is always done in
software. This is OK because when we build the IP header, we obviously
have it in cache, so summing it isn't expensive. It's also rather short.
@@ -49,9 +52,9 @@ A driver declares its offload capabilities in netdev->hw_features; see
and csum_offset given in the SKB; if it tries to deduce these itself in
hardware (as some NICs do) the driver should check that the values in the
SKB match those which the hardware will deduce, and if not, fall back to
- checksumming in software instead (with skb_checksum_help or one of the
- skb_csum_off_chk* functions as mentioned in include/linux/skbuff.h). This
- is a pain, but that's what you get when hardware tries to be clever.
+ checksumming in software instead (with skb_csum_hwoffload_help() or one of
+ the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in
+ include/linux/skbuff.h).
The stack should, for the most part, assume that checksum offload is
supported by the underlying device. The only place that should check is
@@ -60,7 +63,7 @@ The stack should, for the most part, assume that checksum offload is
may include other offloads besides TX Checksum Offload) and, if they are
not supported or enabled on the device (determined by netdev->features),
performs the corresponding offload in software. In the case of TX
- Checksum Offload, that means calling skb_checksum_help(skb).
+ Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features).
LCO: Local Checksum Offload
diff --git a/Documentation/networking/conf.py b/Documentation/networking/conf.py
new file mode 100644
index 000000000000..40f69e67a883
--- /dev/null
+++ b/Documentation/networking/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "Linux Networking Documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'networking.tex', project,
+ 'The kernel development community', 'manual'),
+]
diff --git a/Documentation/networking/dns_resolver.txt b/Documentation/networking/dns_resolver.txt
index d86adcdae420..eaa8f9a6fd5d 100644
--- a/Documentation/networking/dns_resolver.txt
+++ b/Documentation/networking/dns_resolver.txt
@@ -143,7 +143,7 @@ the key will be discarded and recreated when the data it holds has expired.
dns_query() returns a copy of the value attached to the key, or an error if
that is indicated instead.
-See <file:Documentation/security/keys-request-key.txt> for further
+See <file:Documentation/security/keys/request-key.rst> for further
information about request-key function.
diff --git a/Documentation/networking/i40evf.txt b/Documentation/networking/i40evf.txt
index 21e41271af79..e9b3035b95d0 100644
--- a/Documentation/networking/i40evf.txt
+++ b/Documentation/networking/i40evf.txt
@@ -1,8 +1,8 @@
Linux* Base Driver for Intel(R) Network Connection
==================================================
-Intel XL710 X710 Virtual Function Linux driver.
-Copyright(c) 2013 Intel Corporation.
+Intel Ethernet Adaptive Virtual Function Linux driver.
+Copyright(c) 2013-2017 Intel Corporation.
Contents
========
@@ -11,19 +11,26 @@ Contents
- Known Issues/Troubleshooting
- Support
-This file describes the i40evf Linux* Base Driver for the Intel(R) XL710
-X710 Virtual Function.
+This file describes the i40evf Linux* Base Driver.
-The i40evf driver supports XL710 and X710 virtual function devices that
-can only be activated on kernels with CONFIG_PCI_IOV enabled.
+The i40evf driver supports the below mentioned virtual function
+devices and can only be activated on kernels running the i40e or
+newer Physical Function (PF) driver compiled with CONFIG_PCI_IOV.
+The i40evf driver requires CONFIG_PCI_MSI to be enabled.
The guest OS loading the i40evf driver must support MSI-X interrupts.
+Supported Hardware
+==================
+Intel XL710 X710 Virtual Function
+Intel Ethernet Adaptive Virtual Function
+Intel X722 Virtual Function
+
Identifying Your Adapter
========================
-For more information on how to identify your adapter, go to the Adapter &
-Driver ID Guide at:
+For more information on how to identify your adapter, go to the
+Adapter & Driver ID Guide at:
http://support.intel.com/support/go/network/adapter/idguide.htm
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
new file mode 100644
index 000000000000..b5bd87e01f52
--- /dev/null
+++ b/Documentation/networking/index.rst
@@ -0,0 +1,18 @@
+Linux Networking Documentation
+==============================
+
+Contents:
+
+.. toctree::
+ :maxdepth: 2
+
+ kapi
+ z8530book
+
+.. only:: subproject
+
+ Indices
+ =======
+
+ * :ref:`genindex`
+
diff --git a/Documentation/networking/ipvlan.txt b/Documentation/networking/ipvlan.txt
index 24196cef7c91..1fe42a874aae 100644
--- a/Documentation/networking/ipvlan.txt
+++ b/Documentation/networking/ipvlan.txt
@@ -22,9 +22,9 @@ The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
There are no module parameters for this driver and it can be configured
using IProute2/ip utility.
- ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | l3 | l3s }
+ ip link add link <master-dev> name <slave-dev> type ipvlan mode { l2 | l3 | l3s }
- e.g. ip link add link ipvl0 eth0 type ipvlan mode l2
+ e.g. ip link add link eth0 name ipvl0 type ipvlan mode l2
4. Operating modes:
diff --git a/Documentation/networking/kapi.rst b/Documentation/networking/kapi.rst
new file mode 100644
index 000000000000..580289f345da
--- /dev/null
+++ b/Documentation/networking/kapi.rst
@@ -0,0 +1,147 @@
+=========================================
+Linux Networking and Network Devices APIs
+=========================================
+
+Linux Networking
+================
+
+Networking Base Types
+---------------------
+
+.. kernel-doc:: include/linux/net.h
+ :internal:
+
+Socket Buffer Functions
+-----------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+ :internal:
+
+.. kernel-doc:: include/net/sock.h
+ :internal:
+
+.. kernel-doc:: net/socket.c
+ :export:
+
+.. kernel-doc:: net/core/skbuff.c
+ :export:
+
+.. kernel-doc:: net/core/sock.c
+ :export:
+
+.. kernel-doc:: net/core/datagram.c
+ :export:
+
+.. kernel-doc:: net/core/stream.c
+ :export:
+
+Socket Filter
+-------------
+
+.. kernel-doc:: net/core/filter.c
+ :export:
+
+Generic Network Statistics
+--------------------------
+
+.. kernel-doc:: include/uapi/linux/gen_stats.h
+ :internal:
+
+.. kernel-doc:: net/core/gen_stats.c
+ :export:
+
+.. kernel-doc:: net/core/gen_estimator.c
+ :export:
+
+SUN RPC subsystem
+-----------------
+
+.. kernel-doc:: net/sunrpc/xdr.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/svc_xprt.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/xprt.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/sched.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/socklib.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/stats.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/rpc_pipe.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/rpcb_clnt.c
+ :export:
+
+.. kernel-doc:: net/sunrpc/clnt.c
+ :export:
+
+WiMAX
+-----
+
+.. kernel-doc:: net/wimax/op-msg.c
+ :export:
+
+.. kernel-doc:: net/wimax/op-reset.c
+ :export:
+
+.. kernel-doc:: net/wimax/op-rfkill.c
+ :export:
+
+.. kernel-doc:: net/wimax/stack.c
+ :export:
+
+.. kernel-doc:: include/net/wimax.h
+ :internal:
+
+.. kernel-doc:: include/uapi/linux/wimax.h
+ :internal:
+
+Network device support
+======================
+
+Driver Support
+--------------
+
+.. kernel-doc:: net/core/dev.c
+ :export:
+
+.. kernel-doc:: net/ethernet/eth.c
+ :export:
+
+.. kernel-doc:: net/sched/sch_generic.c
+ :export:
+
+.. kernel-doc:: include/linux/etherdevice.h
+ :internal:
+
+.. kernel-doc:: include/linux/netdevice.h
+ :internal:
+
+PHY Support
+-----------
+
+.. kernel-doc:: drivers/net/phy/phy.c
+ :export:
+
+.. kernel-doc:: drivers/net/phy/phy.c
+ :internal:
+
+.. kernel-doc:: drivers/net/phy/phy_device.c
+ :export:
+
+.. kernel-doc:: drivers/net/phy/phy_device.c
+ :internal:
+
+.. kernel-doc:: drivers/net/phy/mdio_bus.c
+ :export:
+
+.. kernel-doc:: drivers/net/phy/mdio_bus.c
+ :internal:
diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt
index 16f90d817224..bdec0f700bc1 100644
--- a/Documentation/networking/phy.txt
+++ b/Documentation/networking/phy.txt
@@ -295,7 +295,6 @@ Doing it all yourself
settings in the PHY.
int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
- int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
Ethtool convenience functions.
diff --git a/Documentation/networking/policy-routing.txt b/Documentation/networking/policy-routing.txt
deleted file mode 100644
index 36f6936d7f21..000000000000
--- a/Documentation/networking/policy-routing.txt
+++ /dev/null
@@ -1,150 +0,0 @@
-Classes
--------
-
- "Class" is a complete routing table in common sense.
- I.e. it is tree of nodes (destination prefix, tos, metric)
- with attached information: gateway, device etc.
- This tree is looked up as specified in RFC1812 5.2.4.3
- 1. Basic match
- 2. Longest match
- 3. Weak TOS.
- 4. Metric. (should not be in kernel space, but they are)
- 5. Additional pruning rules. (not in kernel space).
-
- We have two special type of nodes:
- REJECT - abort route lookup and return an error value.
- THROW - abort route lookup in this class.
-
-
- Currently the number of classes is limited to 255
- (0 is reserved for "not specified class")
-
- Three classes are builtin:
-
- RT_CLASS_LOCAL=255 - local interface addresses,
- broadcasts, nat addresses.
-
- RT_CLASS_MAIN=254 - all normal routes are put there
- by default.
-
- RT_CLASS_DEFAULT=253 - if ip_fib_model==1, then
- normal default routes are put there, if ip_fib_model==2
- all gateway routes are put there.
-
-
-Rules
------
- Rule is a record of (src prefix, src interface, tos, dst prefix)
- with attached information.
-
- Rule types:
- RTP_ROUTE - lookup in attached class
- RTP_NAT - lookup in attached class and if a match is found,
- translate packet source address.
- RTP_MASQUERADE - lookup in attached class and if a match is found,
- masquerade packet as sourced by us.
- RTP_DROP - silently drop the packet.
- RTP_REJECT - drop the packet and send ICMP NET UNREACHABLE.
- RTP_PROHIBIT - drop the packet and send ICMP COMM. ADM. PROHIBITED.
-
- Rule flags:
- RTRF_LOG - log route creations.
- RTRF_VALVE - One way route (used with masquerading)
-
-Default setup:
-
-root@amber:/pub/ip-routing # iproute -r
-Kernel routing policy rules
-Pref Source Destination TOS Iface Cl
- 0 default default 00 * 255
- 254 default default 00 * 254
- 255 default default 00 * 253
-
-
-Lookup algorithm
-----------------
-
- We scan rules list, and if a rule is matched, apply it.
- If a route is found, return it.
- If it is not found or a THROW node was matched, continue
- to scan rules.
-
-Applications
-------------
-
-1. Just ignore classes. All the routes are put into MAIN class
- (and/or into DEFAULT class).
-
- HOWTO: iproute add PREFIX [ tos TOS ] [ gw GW ] [ dev DEV ]
- [ metric METRIC ] [ reject ] ... (look at iproute utility)
-
- or use route utility from current net-tools.
-
-2. Opposite case. Just forget all that you know about routing
- tables. Every rule is supplied with its own gateway, device
- info. record. This approach is not appropriate for automated
- route maintenance, but it is ideal for manual configuration.
-
- HOWTO: iproute addrule [ from PREFIX ] [ to PREFIX ] [ tos TOS ]
- [ dev INPUTDEV] [ pref PREFERENCE ] route [ gw GATEWAY ]
- [ dev OUTDEV ] .....
-
- Warning: As of now the size of the routing table in this
- approach is limited to 256. If someone likes this model, I'll
- relax this limitation.
-
-3. OSPF classes (see RFC1583, RFC1812 E.3.3)
- Very clean, stable and robust algorithm for OSPF routing
- domains. Unfortunately, it is not widely used in the Internet.
-
- Proposed setup:
- 255 local addresses
- 254 interface routes
- 253 ASE routes with external metric
- 252 ASE routes with internal metric
- 251 inter-area routes
- 250 intra-area routes for 1st area
- 249 intra-area routes for 2nd area
- etc.
-
- Rules:
- iproute addrule class 253
- iproute addrule class 252
- iproute addrule class 251
- iproute addrule to a-prefix-for-1st-area class 250
- iproute addrule to another-prefix-for-1st-area class 250
- ...
- iproute addrule to a-prefix-for-2nd-area class 249
- ...
-
- Area classes must be terminated with reject record.
- iproute add default reject class 250
- iproute add default reject class 249
- ...
-
-4. The Variant Router Requirements Algorithm (RFC1812 E.3.2)
- Create 16 classes for different TOS values.
- It is a funny, but pretty useless algorithm.
- I listed it just to show the power of new routing code.
-
-5. All the variety of combinations......
-
-
-GATED
------
-
- Gated does not understand classes, but it will work
- happily in MAIN+DEFAULT. All policy routes can be set
- and maintained manually.
-
-IMPORTANT NOTE
---------------
- route.c has a compilation time switch CONFIG_IP_LOCAL_RT_POLICY.
- If it is set, locally originated packets are routed
- using all the policy list. This is not very convenient and
- pretty ambiguous when used with NAT and masquerading.
- I set it to FALSE by default.
-
-
-Alexey Kuznetov
-kuznet@ms2.inr.ac.ru
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
index 1b63bbc6b94f..8c70ba5dee4d 100644
--- a/Documentation/networking/rxrpc.txt
+++ b/Documentation/networking/rxrpc.txt
@@ -325,6 +325,9 @@ calls, to invoke certain actions and to report certain conditions. These are:
RXRPC_LOCAL_ERROR -rt error num Local error encountered
RXRPC_NEW_CALL -r- n/a New call received
RXRPC_ACCEPT s-- n/a Accept new call
+ RXRPC_EXCLUSIVE_CALL s-- n/a Make an exclusive client call
+ RXRPC_UPGRADE_SERVICE s-- n/a Client call can be upgraded
+ RXRPC_TX_LENGTH s-- data len Total length of Tx data
(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
@@ -387,6 +390,40 @@ calls, to invoke certain actions and to report certain conditions. These are:
return error ENODATA. If the user ID is already in use by another call,
then error EBADSLT will be returned.
+ (*) RXRPC_EXCLUSIVE_CALL
+
+ This is used to indicate that a client call should be made on a one-off
+ connection. The connection is discarded once the call has terminated.
+
+ (*) RXRPC_UPGRADE_SERVICE
+
+ This is used to make a client call to probe if the specified service ID
+ may be upgraded by the server. The caller must check msg_name returned to
+ recvmsg() for the service ID actually in use. The operation probed must
+ be one that takes the same arguments in both services.
+
+ Once this has been used to establish the upgrade capability (or lack
+ thereof) of the server, the service ID returned should be used for all
+ future communication to that server and RXRPC_UPGRADE_SERVICE should no
+ longer be set.
+
+ (*) RXRPC_TX_LENGTH
+
+ This is used to inform the kernel of the total amount of data that is
+ going to be transmitted by a call (whether in a client request or a
+ service response). If given, it allows the kernel to encrypt from the
+ userspace buffer directly to the packet buffers, rather than copying into
+ the buffer and then encrypting in place. This may only be given with the
+ first sendmsg() providing data for a call. EMSGSIZE will be generated if
+ the amount of data actually given is different.
+
+ This takes a parameter of __s64 type that indicates how much will be
+ transmitted. This may not be less than zero.
+
+The symbol RXRPC__SUPPORTED is defined as one more than the highest control
+message type supported. At run time this can be queried by means of the
+RXRPC_SUPPORTED_CMSG socket option (see below).
+
==============
SOCKET OPTIONS
@@ -433,6 +470,18 @@ AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
Encrypted checksum plus entire packet padded and encrypted, including
actual packet length.
+ (*) RXRPC_UPGRADEABLE_SERVICE
+
+ This is used to indicate that a service socket with two bindings may
+ upgrade one bound service to the other if requested by the client. optval
+ must point to an array of two unsigned short ints. The first is the
+ service ID to upgrade from and the second the service ID to upgrade to.
+
+ (*) RXRPC_SUPPORTED_CMSG
+
+ This is a read-only option that writes an int into the buffer indicating
+ the highest control message type supported.
+
========
SECURITY
@@ -542,6 +591,9 @@ A client would issue an operation by:
MSG_MORE should be set in msghdr::msg_flags on all but the last part of
the request. Multiple requests may be made simultaneously.
+ An RXRPC_TX_LENGTH control message can also be specified on the first
+ sendmsg() call.
+
If a call is intended to go to a destination other than the default
specified through connect(), then msghdr::msg_name should be set on the
first request message of that call.
@@ -559,6 +611,17 @@ A client would issue an operation by:
buffer instead, and MSG_EOR will be flagged to indicate the end of that
call.
+A client may ask for a service ID it knows and ask that this be upgraded to a
+better service if one is available by supplying RXRPC_UPGRADE_SERVICE on the
+first sendmsg() of a call. The client should then check srx_service in the
+msg_name filled in by recvmsg() when collecting the result. srx_service will
+hold the same value as given to sendmsg() if the upgrade request was ignored by
+the service - otherwise it will be altered to indicate the service ID the
+server upgraded to. Note that the upgraded service ID is chosen by the server.
+The caller has to wait until it sees the service ID in the reply before sending
+any more calls (further calls to the same destination will be blocked until the
+probe is concluded).
+
====================
EXAMPLE SERVER USAGE
@@ -588,7 +651,7 @@ A server would be set up to accept operations in the following manner:
The keyring can be manipulated after it has been given to the socket. This
permits the server to add more keys, replace keys, etc. whilst it is live.
- (2) A local address must then be bound:
+ (3) A local address must then be bound:
struct sockaddr_rxrpc srx = {
.srx_family = AF_RXRPC,
@@ -600,11 +663,26 @@ A server would be set up to accept operations in the following manner:
};
bind(server, &srx, sizeof(srx));
- (3) The server is then set to listen out for incoming calls:
+ More than one service ID may be bound to a socket, provided the transport
+ parameters are the same. The limit is currently two. To do this, bind()
+ should be called twice.
+
+ (4) If service upgrading is required, first two service IDs must have been
+ bound and then the following option must be set:
+
+ unsigned short service_ids[2] = { from_ID, to_ID };
+ setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
+ service_ids, sizeof(service_ids));
+
+ This will automatically upgrade connections on service from_ID to service
+ to_ID if they request it. This will be reflected in msg_name obtained
+ through recvmsg() when the request data is delivered to userspace.
+
+ (5) The server is then set to listen out for incoming calls:
listen(server, 100);
- (4) The kernel notifies the server of pending incoming connections by sending
+ (6) The kernel notifies the server of pending incoming connections by sending
it a message for each. This is received with recvmsg() on the server
socket. It has no data, and has a single dataless control message
attached:
@@ -616,13 +694,13 @@ A server would be set up to accept operations in the following manner:
the time it is accepted - in which case the first call still on the queue
will be accepted.
- (5) The server then accepts the new call by issuing a sendmsg() with two
+ (7) The server then accepts the new call by issuing a sendmsg() with two
pieces of control data and no actual data:
RXRPC_ACCEPT - indicate connection acceptance
RXRPC_USER_CALL_ID - specify user ID for this call
- (6) The first request data packet will then be posted to the server socket for
+ (8) The first request data packet will then be posted to the server socket for
recvmsg() to pick up. At that point, the RxRPC address for the call can
be read from the address fields in the msghdr struct.
@@ -634,7 +712,7 @@ A server would be set up to accept operations in the following manner:
RXRPC_USER_CALL_ID - specifies the user ID for this call
- (8) The reply data should then be posted to the server socket using a series
+ (9) The reply data should then be posted to the server socket using a series
of sendmsg() calls, each with the following control messages attached:
RXRPC_USER_CALL_ID - specifies the user ID for this call
@@ -642,7 +720,7 @@ A server would be set up to accept operations in the following manner:
MSG_MORE should be set in msghdr::msg_flags on all but the last message
for a particular call.
- (9) The final ACK from the client will be posted for retrieval by recvmsg()
+(10) The final ACK from the client will be posted for retrieval by recvmsg()
when it is received. It will take the form of a dataless message with two
control messages attached:
@@ -652,7 +730,7 @@ A server would be set up to accept operations in the following manner:
MSG_EOR will be flagged to indicate that this is the final message for
this call.
-(10) Up to the point the final packet of reply data is sent, the call can be
+(11) Up to the point the final packet of reply data is sent, the call can be
aborted by calling sendmsg() with a dataless message with the following
control messages attached:
@@ -703,6 +781,7 @@ The kernel interface functions are as follows:
struct sockaddr_rxrpc *srx,
struct key *key,
unsigned long user_call_ID,
+ s64 tx_total_len,
gfp_t gfp);
This allocates the infrastructure to make a new RxRPC call and assigns
@@ -719,6 +798,11 @@ The kernel interface functions are as follows:
control data buffer. It is entirely feasible to use this to point to a
kernel data structure.
+ tx_total_len is the amount of data the caller is intending to transmit
+ with this call (or -1 if unknown at this point). Setting the data size
+ allows the kernel to encrypt directly to the packet buffers, thereby
+ saving a copy. The value may not be less than -1.
+
If this function is successful, an opaque reference to the RxRPC call is
returned. The caller now holds a reference on this and it must be
properly ended.
@@ -870,6 +954,17 @@ The kernel interface functions are as follows:
This is used to find the remote peer address of a call.
+ (*) Set the total transmit data size on a call.
+
+ void rxrpc_kernel_set_tx_length(struct socket *sock,
+ struct rxrpc_call *call,
+ s64 tx_total_len);
+
+ This sets the amount of data that the caller is intending to transmit on a
+ call. It's intended to be used for setting the reply size as the request
+ size should be set when the call is begun. tx_total_len may not be less
+ than zero.
+
=======================
CONFIGURABLE PARAMETERS
diff --git a/Documentation/networking/segmentation-offloads.txt b/Documentation/networking/segmentation-offloads.txt
index f200467ade38..2f09455a993a 100644
--- a/Documentation/networking/segmentation-offloads.txt
+++ b/Documentation/networking/segmentation-offloads.txt
@@ -55,7 +55,7 @@ IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel. In order to account
for such instances an additional set of segmentation offload types were
-introduced including SKB_GSO_IPIP, SKB_GSO_SIT, SKB_GSO_GRE, and
+introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
cases where there are more than just 1 set of headers. For example in the
case of IPIP and SIT we should have the network and transport headers moved
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 96f50694a748..1be0b6f9e0cb 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -44,8 +44,7 @@ timeval of SO_TIMESTAMP (ms).
Supports multiple types of timestamp requests. As a result, this
socket option takes a bitmap of flags, not a boolean. In
- err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
- sizeof(val));
+ err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
val is an integer with any of the following bits set. Setting other
bit returns EINVAL and does not change the current state.
@@ -193,6 +192,24 @@ SOF_TIMESTAMPING_OPT_STATS:
the transmit timestamps, such as how long a certain block of
data was limited by peer's receiver window.
+SOF_TIMESTAMPING_OPT_PKTINFO:
+
+ Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
+ packets with hardware timestamps. The message contains struct
+ scm_ts_pktinfo, which supplies the index of the real interface which
+ received the packet and its length at layer 2. A valid (non-zero)
+ interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
+ enabled and the driver is using NAPI. The struct contains also two
+ other fields, but they are reserved and undefined.
+
+SOF_TIMESTAMPING_OPT_TX_SWHW:
+
+ Request both hardware and software timestamps for outgoing packets
+ when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
+ are enabled at the same time. If both timestamps are generated,
+ two separate messages will be looped to the socket's error queue,
+ each containing just one timestamp.
+
New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.
@@ -231,8 +248,7 @@ setsockopt to receive timestamps:
__u32 val = SOF_TIMESTAMPING_SOFTWARE |
SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
- err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
- sizeof(val));
+ err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
1.4 Bytestream Timestamps
@@ -312,7 +328,7 @@ struct scm_timestamping {
};
The structure can return up to three timestamps. This is a legacy
-feature. Only one field is non-zero at any time. Most timestamps
+feature. At least one field is non-zero at any time. Most timestamps
are passed in ts[0]. Hardware timestamps are passed in ts[2].
ts[1] used to hold hardware timestamps converted to system time.
@@ -321,6 +337,12 @@ a HW PTP clock source, to allow time conversion in userspace and
optionally synchronize system time with a userspace PTP stack such
as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt.
+Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled
+together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false
+software timestamp will be generated in the recvmsg() call and passed
+in ts[0] when a real software timestamp is missing. This happens also
+on hardware transmit timestamps.
+
2.1.1 Transmit timestamps with MSG_ERRQUEUE
For transmit timestamps the outgoing packet is looped back to the
diff --git a/Documentation/networking/tls.txt b/Documentation/networking/tls.txt
new file mode 100644
index 000000000000..77ed00631c12
--- /dev/null
+++ b/Documentation/networking/tls.txt
@@ -0,0 +1,135 @@
+Overview
+========
+
+Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs over
+TCP. TLS provides end-to-end data integrity and confidentiality.
+
+User interface
+==============
+
+Creating a TLS connection
+-------------------------
+
+First create a new TCP socket and set the TLS ULP.
+
+ sock = socket(AF_INET, SOCK_STREAM, 0);
+ setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
+
+Setting the TLS ULP allows us to set/get TLS socket options. Currently
+only the symmetric encryption is handled in the kernel. After the TLS
+handshake is complete, we have all the parameters required to move the
+data-path to the kernel. There is a separate socket option for moving
+the transmit and the receive into the kernel.
+
+ /* From linux/tls.h */
+ struct tls_crypto_info {
+ unsigned short version;
+ unsigned short cipher_type;
+ };
+
+ struct tls12_crypto_info_aes_gcm_128 {
+ struct tls_crypto_info info;
+ unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
+ unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
+ unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE];
+ unsigned char rec_seq[TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE];
+ };
+
+
+ struct tls12_crypto_info_aes_gcm_128 crypto_info;
+
+ crypto_info.info.version = TLS_1_2_VERSION;
+ crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128;
+ memcpy(crypto_info.iv, iv_write, TLS_CIPHER_AES_GCM_128_IV_SIZE);
+ memcpy(crypto_info.rec_seq, seq_number_write,
+ TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
+ memcpy(crypto_info.key, cipher_key_write, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+ memcpy(crypto_info.salt, implicit_iv_write, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+
+ setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
+
+Sending TLS application data
+----------------------------
+
+After setting the TLS_TX socket option all application data sent over this
+socket is encrypted using TLS and the parameters provided in the socket option.
+For example, we can send an encrypted hello world record as follows:
+
+ const char *msg = "hello world\n";
+ send(sock, msg, strlen(msg));
+
+send() data is directly encrypted from the userspace buffer provided
+to the encrypted kernel send buffer if possible.
+
+The sendfile system call will send the file's data over TLS records of maximum
+length (2^14).
+
+ file = open(filename, O_RDONLY);
+ fstat(file, &stat);
+ sendfile(sock, file, &offset, stat.st_size);
+
+TLS records are created and sent after each send() call, unless
+MSG_MORE is passed. MSG_MORE will delay creation of a record until
+MSG_MORE is not passed, or the maximum record size is reached.
+
+The kernel will need to allocate a buffer for the encrypted data.
+This buffer is allocated at the time send() is called, such that
+either the entire send() call will return -ENOMEM (or block waiting
+for memory), or the encryption will always succeed. If send() returns
+-ENOMEM and some data was left on the socket buffer from a previous
+call using MSG_MORE, the MSG_MORE data is left on the socket buffer.
+
+Send TLS control messages
+-------------------------
+
+Other than application data, TLS has control messages such as alert
+messages (record type 21) and handshake messages (record type 22), etc.
+These messages can be sent over the socket by providing the TLS record type
+via a CMSG. For example the following function sends @data of @length bytes
+using a record of type @record_type.
+
+/* send TLS control message using record_type */
+ static int klts_send_ctrl_message(int sock, unsigned char record_type,
+ void *data, size_t length)
+ {
+ struct msghdr msg = {0};
+ int cmsg_len = sizeof(record_type);
+ struct cmsghdr *cmsg;
+ char buf[CMSG_SPACE(cmsg_len)];
+ struct iovec msg_iov; /* Vector of data to send/receive into. */
+
+ msg.msg_control = buf;
+ msg.msg_controllen = sizeof(buf);
+ cmsg = CMSG_FIRSTHDR(&msg);
+ cmsg->cmsg_level = SOL_TLS;
+ cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
+ cmsg->cmsg_len = CMSG_LEN(cmsg_len);
+ *CMSG_DATA(cmsg) = record_type;
+ msg.msg_controllen = cmsg->cmsg_len;
+
+ msg_iov.iov_base = data;
+ msg_iov.iov_len = length;
+ msg.msg_iov = &msg_iov;
+ msg.msg_iovlen = 1;
+
+ return sendmsg(sock, &msg, 0);
+ }
+
+Control message data should be provided unencrypted, and will be
+encrypted by the kernel.
+
+Integrating in to userspace TLS library
+---------------------------------------
+
+At a high level, the kernel TLS ULP is a replacement for the record
+layer of a userspace TLS library.
+
+A patchset to OpenSSL to use ktls as the record layer is here:
+
+https://github.com/Mellanox/tls-openssl
+
+An example of calling send directly after a handshake using
+gnutls. Since it doesn't implement a full record layer, control
+messages are not supported:
+
+https://github.com/Mellanox/tls-af_ktls_tool
diff --git a/Documentation/networking/z8530book.rst b/Documentation/networking/z8530book.rst
new file mode 100644
index 000000000000..fea2c40e7973
--- /dev/null
+++ b/Documentation/networking/z8530book.rst
@@ -0,0 +1,256 @@
+=======================
+Z8530 Programming Guide
+=======================
+
+:Author: Alan Cox
+
+Introduction
+============
+
+The Z85x30 family synchronous/asynchronous controller chips are used on
+a large number of cheap network interface cards. The kernel provides a
+core interface layer that is designed to make it easy to provide WAN
+services using this chip.
+
+The current driver only support synchronous operation. Merging the
+asynchronous driver support into this code to allow any Z85x30 device to
+be used as both a tty interface and as a synchronous controller is a
+project for Linux post the 2.4 release
+
+Driver Modes
+============
+
+The Z85230 driver layer can drive Z8530, Z85C30 and Z85230 devices in
+three different modes. Each mode can be applied to an individual channel
+on the chip (each chip has two channels).
+
+The PIO synchronous mode supports the most common Z8530 wiring. Here the
+chip is interface to the I/O and interrupt facilities of the host
+machine but not to the DMA subsystem. When running PIO the Z8530 has
+extremely tight timing requirements. Doing high speeds, even with a
+Z85230 will be tricky. Typically you should expect to achieve at best
+9600 baud with a Z8C530 and 64Kbits with a Z85230.
+
+The DMA mode supports the chip when it is configured to use dual DMA
+channels on an ISA bus. The better cards tend to support this mode of
+operation for a single channel. With DMA running the Z85230 tops out
+when it starts to hit ISA DMA constraints at about 512Kbits. It is worth
+noting here that many PC machines hang or crash when the chip is driven
+fast enough to hold the ISA bus solid.
+
+Transmit DMA mode uses a single DMA channel. The DMA channel is used for
+transmission as the transmit FIFO is smaller than the receive FIFO. it
+gives better performance than pure PIO mode but is nowhere near as ideal
+as pure DMA mode.
+
+Using the Z85230 driver
+=======================
+
+The Z85230 driver provides the back end interface to your board. To
+configure a Z8530 interface you need to detect the board and to identify
+its ports and interrupt resources. It is also your problem to verify the
+resources are available.
+
+Having identified the chip you need to fill in a struct z8530_dev,
+which describes each chip. This object must exist until you finally
+shutdown the board. Firstly zero the active field. This ensures nothing
+goes off without you intending it. The irq field should be set to the
+interrupt number of the chip. (Each chip has a single interrupt source
+rather than each channel). You are responsible for allocating the
+interrupt line. The interrupt handler should be set to
+:c:func:`z8530_interrupt()`. The device id should be set to the
+z8530_dev structure pointer. Whether the interrupt can be shared or not
+is board dependent, and up to you to initialise.
+
+The structure holds two channel structures. Initialise chanA.ctrlio and
+chanA.dataio with the address of the control and data ports. You can or
+this with Z8530_PORT_SLEEP to indicate your interface needs the 5uS
+delay for chip settling done in software. The PORT_SLEEP option is
+architecture specific. Other flags may become available on future
+platforms, eg for MMIO. Initialise the chanA.irqs to &z8530_nop to
+start the chip up as disabled and discarding interrupt events. This
+ensures that stray interrupts will be mopped up and not hang the bus.
+Set chanA.dev to point to the device structure itself. The private and
+name field you may use as you wish. The private field is unused by the
+Z85230 layer. The name is used for error reporting and it may thus make
+sense to make it match the network name.
+
+Repeat the same operation with the B channel if your chip has both
+channels wired to something useful. This isn't always the case. If it is
+not wired then the I/O values do not matter, but you must initialise
+chanB.dev.
+
+If your board has DMA facilities then initialise the txdma and rxdma
+fields for the relevant channels. You must also allocate the ISA DMA
+channels and do any necessary board level initialisation to configure
+them. The low level driver will do the Z8530 and DMA controller
+programming but not board specific magic.
+
+Having initialised the device you can then call
+:c:func:`z8530_init()`. This will probe the chip and reset it into
+a known state. An identification sequence is then run to identify the
+chip type. If the checks fail to pass the function returns a non zero
+error code. Typically this indicates that the port given is not valid.
+After this call the type field of the z8530_dev structure is
+initialised to either Z8530, Z85C30 or Z85230 according to the chip
+found.
+
+Once you have called z8530_init you can also make use of the utility
+function :c:func:`z8530_describe()`. This provides a consistent
+reporting format for the Z8530 devices, and allows all the drivers to
+provide consistent reporting.
+
+Attaching Network Interfaces
+============================
+
+If you wish to use the network interface facilities of the driver, then
+you need to attach a network device to each channel that is present and
+in use. In addition to use the generic HDLC you need to follow some
+additional plumbing rules. They may seem complex but a look at the
+example hostess_sv11 driver should reassure you.
+
+The network device used for each channel should be pointed to by the
+netdevice field of each channel. The hdlc-> priv field of the network
+device points to your private data - you will need to be able to find
+your private data from this.
+
+The way most drivers approach this particular problem is to create a
+structure holding the Z8530 device definition and put that into the
+private field of the network device. The network device fields of the
+channels then point back to the network devices.
+
+If you wish to use the generic HDLC then you need to register the HDLC
+device.
+
+Before you register your network device you will also need to provide
+suitable handlers for most of the network device callbacks. See the
+network device documentation for more details on this.
+
+Configuring And Activating The Port
+===================================
+
+The Z85230 driver provides helper functions and tables to load the port
+registers on the Z8530 chips. When programming the register settings for
+a channel be aware that the documentation recommends initialisation
+orders. Strange things happen when these are not followed.
+
+:c:func:`z8530_channel_load()` takes an array of pairs of
+initialisation values in an array of u8 type. The first value is the
+Z8530 register number. Add 16 to indicate the alternate register bank on
+the later chips. The array is terminated by a 255.
+
+The driver provides a pair of public tables. The z8530_hdlc_kilostream
+table is for the UK 'Kilostream' service and also happens to cover most
+other end host configurations. The z8530_hdlc_kilostream_85230 table
+is the same configuration using the enhancements of the 85230 chip. The
+configuration loaded is standard NRZ encoded synchronous data with HDLC
+bitstuffing. All of the timing is taken from the other end of the link.
+
+When writing your own tables be aware that the driver internally tracks
+register values. It may need to reload values. You should therefore be
+sure to set registers 1-7, 9-11, 14 and 15 in all configurations. Where
+the register settings depend on DMA selection the driver will update the
+bits itself when you open or close. Loading a new table with the
+interface open is not recommended.
+
+There are three standard configurations supported by the core code. In
+PIO mode the interface is programmed up to use interrupt driven PIO.
+This places high demands on the host processor to avoid latency. The
+driver is written to take account of latency issues but it cannot avoid
+latencies caused by other drivers, notably IDE in PIO mode. Because the
+drivers allocate buffers you must also prevent MTU changes while the
+port is open.
+
+Once the port is open it will call the rx_function of each channel
+whenever a completed packet arrived. This is invoked from interrupt
+context and passes you the channel and a network buffer (struct
+sk_buff) holding the data. The data includes the CRC bytes so most
+users will want to trim the last two bytes before processing the data.
+This function is very timing critical. When you wish to simply discard
+data the support code provides the function
+:c:func:`z8530_null_rx()` to discard the data.
+
+To active PIO mode sending and receiving the ``z8530_sync_open`` is called.
+This expects to be passed the network device and the channel. Typically
+this is called from your network device open callback. On a failure a
+non zero error status is returned.
+The :c:func:`z8530_sync_close()` function shuts down a PIO
+channel. This must be done before the channel is opened again and before
+the driver shuts down and unloads.
+
+The ideal mode of operation is dual channel DMA mode. Here the kernel
+driver will configure the board for DMA in both directions. The driver
+also handles ISA DMA issues such as controller programming and the
+memory range limit for you. This mode is activated by calling the
+:c:func:`z8530_sync_dma_open()` function. On failure a non zero
+error value is returned. Once this mode is activated it can be shut down
+by calling the :c:func:`z8530_sync_dma_close()`. You must call
+the close function matching the open mode you used.
+
+The final supported mode uses a single DMA channel to drive the transmit
+side. As the Z85C30 has a larger FIFO on the receive channel this tends
+to increase the maximum speed a little. This is activated by calling the
+``z8530_sync_txdma_open``. This returns a non zero error code on failure. The
+:c:func:`z8530_sync_txdma_close()` function closes down the Z8530
+interface from this mode.
+
+Network Layer Functions
+=======================
+
+The Z8530 layer provides functions to queue packets for transmission.
+The driver internally buffers the frame currently being transmitted and
+one further frame (in order to keep back to back transmission running).
+Any further buffering is up to the caller.
+
+The function :c:func:`z8530_queue_xmit()` takes a network buffer
+in sk_buff format and queues it for transmission. The caller must
+provide the entire packet with the exception of the bitstuffing and CRC.
+This is normally done by the caller via the generic HDLC interface
+layer. It returns 0 if the buffer has been queued and non zero values
+for queue full. If the function accepts the buffer it becomes property
+of the Z8530 layer and the caller should not free it.
+
+The function :c:func:`z8530_get_stats()` returns a pointer to an
+internally maintained per interface statistics block. This provides most
+of the interface code needed to implement the network layer get_stats
+callback.
+
+Porting The Z8530 Driver
+========================
+
+The Z8530 driver is written to be portable. In DMA mode it makes
+assumptions about the use of ISA DMA. These are probably warranted in
+most cases as the Z85230 in particular was designed to glue to PC type
+machines. The PIO mode makes no real assumptions.
+
+Should you need to retarget the Z8530 driver to another architecture the
+only code that should need changing are the port I/O functions. At the
+moment these assume PC I/O port accesses. This may not be appropriate
+for all platforms. Replacing :c:func:`z8530_read_port()` and
+``z8530_write_port`` is intended to be all that is required to port
+this driver layer.
+
+Known Bugs And Assumptions
+==========================
+
+Interrupt Locking
+ The locking in the driver is done via the global cli/sti lock. This
+ makes for relatively poor SMP performance. Switching this to use a
+ per device spin lock would probably materially improve performance.
+
+Occasional Failures
+ We have reports of occasional failures when run for very long
+ periods of time and the driver starts to receive junk frames. At the
+ moment the cause of this is not clear.
+
+Public Functions Provided
+=========================
+
+.. kernel-doc:: drivers/net/wan/z85230.c
+ :export:
+
+Internal Functions
+==================
+
+.. kernel-doc:: drivers/net/wan/z85230.c
+ :internal: