summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-bus-siox22
-rw-r--r--Documentation/ABI/testing/sysfs-class-net-qmi4
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst9
-rw-r--r--Documentation/arm64/sve.txt16
-rw-r--r--Documentation/block/switching-sched.txt18
-rw-r--r--Documentation/cgroup-v1/blkio-controller.txt96
-rw-r--r--Documentation/cgroup-v1/hugetlb.txt22
-rw-r--r--Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt1
-rw-r--r--Documentation/devicetree/bindings/net/dsa/ksz.txt2
-rw-r--r--Documentation/devicetree/bindings/net/keystone-netcp.txt44
-rw-r--r--Documentation/devicetree/bindings/net/macb.txt3
-rw-r--r--Documentation/devicetree/bindings/net/socfpga-dwmac.txt10
-rw-r--r--Documentation/devicetree/bindings/net/wiznet,w5x00.txt50
-rw-r--r--Documentation/devicetree/bindings/net/xilinx_axienet.txt29
-rw-r--r--Documentation/devicetree/bindings/ptp/ptp-qoriq.txt3
-rw-r--r--Documentation/devicetree/bindings/riscv/cpus.yaml168
-rw-r--r--Documentation/devicetree/bindings/riscv/sifive.yaml25
-rw-r--r--Documentation/driver-api/80211/mac80211-advanced.rst3
-rw-r--r--Documentation/driver-api/uio-howto.rst4
-rw-r--r--Documentation/fb/fbcon.txt2
-rw-r--r--Documentation/filesystems/overlayfs.txt16
-rw-r--r--Documentation/networking/af_xdp.rst8
-rw-r--r--Documentation/networking/device_drivers/index.rst1
-rw-r--r--Documentation/networking/device_drivers/mellanox/mlx5.rst173
-rw-r--r--Documentation/networking/ip-sysctl.txt37
-rw-r--r--Documentation/networking/phy.rst45
-rw-r--r--Documentation/networking/rds.txt2
-rw-r--r--Documentation/networking/tls-offload.rst54
-rw-r--r--Documentation/usb/rio.txt66
-rw-r--r--Documentation/virtual/kvm/api.txt48
-rw-r--r--Documentation/vm/hmm.rst8
31 files changed, 769 insertions, 220 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-siox b/Documentation/ABI/testing/sysfs-bus-siox
index fed7c3765a4e..c2a403f20b90 100644
--- a/Documentation/ABI/testing/sysfs-bus-siox
+++ b/Documentation/ABI/testing/sysfs-bus-siox
@@ -1,6 +1,6 @@
What: /sys/bus/siox/devices/siox-X/active
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
On reading represents the current state of the bus. If it
contains a "0" the bus is stopped and connected devices are
@@ -12,7 +12,7 @@ Description:
What: /sys/bus/siox/devices/siox-X/device_add
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Write-only file. Write
@@ -27,13 +27,13 @@ Description:
What: /sys/bus/siox/devices/siox-X/device_remove
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Write-only file. A single write removes the last device in the siox chain.
What: /sys/bus/siox/devices/siox-X/poll_interval_ns
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Defines the interval between two poll cycles in nano seconds.
Note this is rounded to jiffies on writing. On reading the current value
@@ -41,33 +41,33 @@ Description:
What: /sys/bus/siox/devices/siox-X-Y/connected
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value. "0" means the Yth device on siox bus X isn't "connected" i.e.
communication with it is not ensured. "1" signals a working connection.
What: /sys/bus/siox/devices/siox-X-Y/inbytes
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value reporting the inbytes value provided to siox-X/device_add
What: /sys/bus/siox/devices/siox-X-Y/status_errors
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Counts the number of time intervals when the read status byte doesn't yield the
expected value.
What: /sys/bus/siox/devices/siox-X-Y/type
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value reporting the type value provided to siox-X/device_add.
What: /sys/bus/siox/devices/siox-X-Y/watchdog
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value reporting if the watchdog of the siox device is
active. "0" means the watchdog is not active and the device is expected to
@@ -75,13 +75,13 @@ Description:
What: /sys/bus/siox/devices/siox-X-Y/watchdog_errors
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value reporting the number to time intervals when the
watchdog was active.
What: /sys/bus/siox/devices/siox-X-Y/outbytes
KernelVersion: 4.16
-Contact: Gavin Schenk <g.schenk@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
+Contact: Thorsten Scherer <t.scherer@eckelmann.de>, Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Description:
Read-only value reporting the outbytes value provided to siox-X/device_add.
diff --git a/Documentation/ABI/testing/sysfs-class-net-qmi b/Documentation/ABI/testing/sysfs-class-net-qmi
index 7122d6264c49..c310db4ccbc2 100644
--- a/Documentation/ABI/testing/sysfs-class-net-qmi
+++ b/Documentation/ABI/testing/sysfs-class-net-qmi
@@ -29,7 +29,7 @@ Contact: Bjørn Mork <bjorn@mork.no>
Description:
Unsigned integer.
- Write a number ranging from 1 to 127 to add a qmap mux
+ Write a number ranging from 1 to 254 to add a qmap mux
based network device, supported by recent Qualcomm based
modems.
@@ -46,5 +46,5 @@ Contact: Bjørn Mork <bjorn@mork.no>
Description:
Unsigned integer.
- Write a number ranging from 1 to 127 to delete a previously
+ Write a number ranging from 1 to 254 to delete a previously
created qmap mux based network device.
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 88e746074252..cf88c1f98270 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -177,6 +177,15 @@ cgroup v2 currently supports the following mount options.
ignored on non-init namespace mounts. Please refer to the
Delegation section for details.
+ memory_localevents
+
+ Only populate memory.events with data for the current cgroup,
+ and not any subtrees. This is legacy behaviour, the default
+ behaviour without this option is to include subtree counts.
+ This option is system wide and can only be set on mount or
+ modified through remount from the init namespace. The mount
+ option is ignored on non-init namespace mounts.
+
Organizing Processes and Threads
--------------------------------
diff --git a/Documentation/arm64/sve.txt b/Documentation/arm64/sve.txt
index 9940e924a47e..5689fc9a976a 100644
--- a/Documentation/arm64/sve.txt
+++ b/Documentation/arm64/sve.txt
@@ -56,6 +56,18 @@ model features for SVE is included in Appendix A.
is to connect to a target process first and then attempt a
ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov).
+* Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory
+ between userspace and the kernel, the register value is encoded in memory in
+ an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] encoded at
+ byte offset i from the start of the memory representation. This affects for
+ example the signal frame (struct sve_context) and ptrace interface
+ (struct user_sve_header) and associated data.
+
+ Beware that on big-endian systems this results in a different byte order than
+ for the FPSIMD V-registers, which are stored as single host-endian 128-bit
+ values, with bits [(127 - 8 * i) : (120 - 8 * i)] of the register encoded at
+ byte offset i. (struct fpsimd_context, struct user_fpsimd_state).
+
2. Vector length terminology
-----------------------------
@@ -124,6 +136,10 @@ the SVE instruction set architecture.
size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to
the members.
+* Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant
+ layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from the
+ start of the register's representation in memory.
+
* If the SVE context is too big to fit in sigcontext.__reserved[], then extra
space is allocated on the stack, an extra_context record is written in
__reserved[] referencing this space. sve_context is then written in the
diff --git a/Documentation/block/switching-sched.txt b/Documentation/block/switching-sched.txt
index 3b2612e342f1..7977f6fb8b20 100644
--- a/Documentation/block/switching-sched.txt
+++ b/Documentation/block/switching-sched.txt
@@ -13,11 +13,9 @@ you can do so by typing:
# mount none /sys -t sysfs
-As of the Linux 2.6.10 kernel, it is now possible to change the
-IO scheduler for a given block device on the fly (thus making it possible,
-for instance, to set the CFQ scheduler for the system default, but
-set a specific device to use the deadline or noop schedulers - which
-can improve that device's throughput).
+It is possible to change the IO scheduler for a given block device on
+the fly to select one of mq-deadline, none, bfq, or kyber schedulers -
+which can improve that device's throughput.
To set a specific scheduler, simply do this:
@@ -30,8 +28,8 @@ The list of defined schedulers can be found by simply doing
a "cat /sys/block/DEV/queue/scheduler" - the list of valid names
will be displayed, with the currently selected scheduler in brackets:
-# cat /sys/block/hda/queue/scheduler
-noop deadline [cfq]
-# echo deadline > /sys/block/hda/queue/scheduler
-# cat /sys/block/hda/queue/scheduler
-noop [deadline] cfq
+# cat /sys/block/sda/queue/scheduler
+[mq-deadline] kyber bfq none
+# echo none >/sys/block/sda/queue/scheduler
+# cat /sys/block/sda/queue/scheduler
+[none] mq-deadline kyber bfq
diff --git a/Documentation/cgroup-v1/blkio-controller.txt b/Documentation/cgroup-v1/blkio-controller.txt
index 673dc34d3f78..d1a1b7bdd03a 100644
--- a/Documentation/cgroup-v1/blkio-controller.txt
+++ b/Documentation/cgroup-v1/blkio-controller.txt
@@ -8,61 +8,13 @@ both at leaf nodes as well as at intermediate nodes in a storage hierarchy.
Plan is to use the same cgroup based management interface for blkio controller
and based on user options switch IO policies in the background.
-Currently two IO control policies are implemented. First one is proportional
-weight time based division of disk policy. It is implemented in CFQ. Hence
-this policy takes effect only on leaf nodes when CFQ is being used. The second
-one is throttling policy which can be used to specify upper IO rate limits
-on devices. This policy is implemented in generic block layer and can be
-used on leaf nodes as well as higher level logical devices like device mapper.
+One IO control policy is throttling policy which can be used to
+specify upper IO rate limits on devices. This policy is implemented in
+generic block layer and can be used on leaf nodes as well as higher
+level logical devices like device mapper.
HOWTO
=====
-Proportional Weight division of bandwidth
------------------------------------------
-You can do a very simple testing of running two dd threads in two different
-cgroups. Here is what you can do.
-
-- Enable Block IO controller
- CONFIG_BLK_CGROUP=y
-
-- Enable group scheduling in CFQ
- CONFIG_CFQ_GROUP_IOSCHED=y
-
-- Compile and boot into kernel and mount IO controller (blkio); see
- cgroups.txt, Why are cgroups needed?.
-
- mount -t tmpfs cgroup_root /sys/fs/cgroup
- mkdir /sys/fs/cgroup/blkio
- mount -t cgroup -o blkio none /sys/fs/cgroup/blkio
-
-- Create two cgroups
- mkdir -p /sys/fs/cgroup/blkio/test1/ /sys/fs/cgroup/blkio/test2
-
-- Set weights of group test1 and test2
- echo 1000 > /sys/fs/cgroup/blkio/test1/blkio.weight
- echo 500 > /sys/fs/cgroup/blkio/test2/blkio.weight
-
-- Create two same size files (say 512MB each) on same disk (file1, file2) and
- launch two dd threads in different cgroup to read those files.
-
- sync
- echo 3 > /proc/sys/vm/drop_caches
-
- dd if=/mnt/sdb/zerofile1 of=/dev/null &
- echo $! > /sys/fs/cgroup/blkio/test1/tasks
- cat /sys/fs/cgroup/blkio/test1/tasks
-
- dd if=/mnt/sdb/zerofile2 of=/dev/null &
- echo $! > /sys/fs/cgroup/blkio/test2/tasks
- cat /sys/fs/cgroup/blkio/test2/tasks
-
-- At macro level, first dd should finish first. To get more precise data, keep
- on looking at (with the help of script), at blkio.disk_time and
- blkio.disk_sectors files of both test1 and test2 groups. This will tell how
- much disk time (in milliseconds), each group got and how many sectors each
- group dispatched to the disk. We provide fairness in terms of disk time, so
- ideally io.disk_time of cgroups should be in proportion to the weight.
-
Throttling/Upper Limit policy
-----------------------------
- Enable Block IO controller
@@ -94,7 +46,7 @@ Throttling/Upper Limit policy
Hierarchical Cgroups
====================
-Both CFQ and throttling implement hierarchy support; however,
+Throttling implements hierarchy support; however,
throttling's hierarchy support is enabled iff "sane_behavior" is
enabled from cgroup side, which currently is a development option and
not publicly available.
@@ -107,9 +59,8 @@ If somebody created a hierarchy like as follows.
|
test3
-CFQ by default and throttling with "sane_behavior" will handle the
-hierarchy correctly. For details on CFQ hierarchy support, refer to
-Documentation/block/cfq-iosched.txt. For throttling, all limits apply
+Throttling with "sane_behavior" will handle the
+hierarchy correctly. For throttling, all limits apply
to the whole subtree while all statistics are local to the IOs
directly generated by tasks in that cgroup.
@@ -130,10 +81,6 @@ CONFIG_DEBUG_BLK_CGROUP
- Debug help. Right now some additional stats file show up in cgroup
if this option is enabled.
-CONFIG_CFQ_GROUP_IOSCHED
- - Enables group scheduling in CFQ. Currently only 1 level of group
- creation is allowed.
-
CONFIG_BLK_DEV_THROTTLING
- Enable block device throttling support in block layer.
@@ -344,32 +291,3 @@ Common files among various policies
- blkio.reset_stats
- Writing an int to this file will result in resetting all the stats
for that cgroup.
-
-CFQ sysfs tunable
-=================
-/sys/block/<disk>/queue/iosched/slice_idle
-------------------------------------------
-On a faster hardware CFQ can be slow, especially with sequential workload.
-This happens because CFQ idles on a single queue and single queue might not
-drive deeper request queue depths to keep the storage busy. In such scenarios
-one can try setting slice_idle=0 and that would switch CFQ to IOPS
-(IO operations per second) mode on NCQ supporting hardware.
-
-That means CFQ will not idle between cfq queues of a cfq group and hence be
-able to driver higher queue depth and achieve better throughput. That also
-means that cfq provides fairness among groups in terms of IOPS and not in
-terms of disk time.
-
-/sys/block/<disk>/queue/iosched/group_idle
-------------------------------------------
-If one disables idling on individual cfq queues and cfq service trees by
-setting slice_idle=0, group_idle kicks in. That means CFQ will still idle
-on the group in an attempt to provide fairness among groups.
-
-By default group_idle is same as slice_idle and does not do anything if
-slice_idle is enabled.
-
-One can experience an overall throughput drop if you have created multiple
-groups and put applications in that group which are not driving enough
-IO to keep disk busy. In that case set group_idle=0, and CFQ will not idle
-on individual groups and throughput should improve.
diff --git a/Documentation/cgroup-v1/hugetlb.txt b/Documentation/cgroup-v1/hugetlb.txt
index 106245c3aecc..1260e5369b9b 100644
--- a/Documentation/cgroup-v1/hugetlb.txt
+++ b/Documentation/cgroup-v1/hugetlb.txt
@@ -32,14 +32,18 @@ Brief summary of control files
hugetlb.<hugepagesize>.usage_in_bytes # show current usage for "hugepagesize" hugetlb
hugetlb.<hugepagesize>.failcnt # show the number of allocation failure due to HugeTLB limit
-For a system supporting two hugepage size (16M and 16G) the control
+For a system supporting three hugepage sizes (64k, 32M and 1G), the control
files include:
-hugetlb.16GB.limit_in_bytes
-hugetlb.16GB.max_usage_in_bytes
-hugetlb.16GB.usage_in_bytes
-hugetlb.16GB.failcnt
-hugetlb.16MB.limit_in_bytes
-hugetlb.16MB.max_usage_in_bytes
-hugetlb.16MB.usage_in_bytes
-hugetlb.16MB.failcnt
+hugetlb.1GB.limit_in_bytes
+hugetlb.1GB.max_usage_in_bytes
+hugetlb.1GB.usage_in_bytes
+hugetlb.1GB.failcnt
+hugetlb.64KB.limit_in_bytes
+hugetlb.64KB.max_usage_in_bytes
+hugetlb.64KB.usage_in_bytes
+hugetlb.64KB.failcnt
+hugetlb.32MB.limit_in_bytes
+hugetlb.32MB.max_usage_in_bytes
+hugetlb.32MB.usage_in_bytes
+hugetlb.32MB.failcnt
diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt b/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt
index 188c8bd4eb67..5a0111d4de58 100644
--- a/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt
+++ b/Documentation/devicetree/bindings/net/can/microchip,mcp251x.txt
@@ -4,6 +4,7 @@ Required properties:
- compatible: Should be one of the following:
- "microchip,mcp2510" for MCP2510.
- "microchip,mcp2515" for MCP2515.
+ - "microchip,mcp25625" for MCP25625.
- reg: SPI chip select.
- clocks: The clock feeding the CAN controller.
- interrupts: Should contain IRQ line for the CAN controller.
diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt b/Documentation/devicetree/bindings/net/dsa/ksz.txt
index e7db7268fd0f..4ac21cef370e 100644
--- a/Documentation/devicetree/bindings/net/dsa/ksz.txt
+++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt
@@ -16,6 +16,8 @@ Required properties:
Optional properties:
- reset-gpios : Should be a gpio specifier for a reset line
+- microchip,synclko-125 : Set if the output SYNCLKO frequency should be set to
+ 125MHz instead of 25MHz.
See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
required and optional properties.
diff --git a/Documentation/devicetree/bindings/net/keystone-netcp.txt b/Documentation/devicetree/bindings/net/keystone-netcp.txt
index 6262c2f293b0..24f11e042f8d 100644
--- a/Documentation/devicetree/bindings/net/keystone-netcp.txt
+++ b/Documentation/devicetree/bindings/net/keystone-netcp.txt
@@ -104,6 +104,23 @@ Required properties:
- 10Gb mac<->mac forced mode : 11
----phy-handle: phandle to PHY device
+- cpts: sub-node time synchronization (CPTS) submodule configuration
+-- clocks: CPTS reference clock. Should point on cpts-refclk-mux clock.
+-- clock-names: should be "cpts"
+-- cpts-refclk-mux: multiplexer clock definition sub-node for CPTS reference (RFTCLK) clock
+--- #clock-cells: should be 0
+--- clocks: list of CPTS reference (RFTCLK) clock's parents as defined in Data manual
+--- ti,mux-tbl: array of multiplexer indexes as defined in Data manual
+--- assigned-clocks: should point on cpts-refclk-mux clock
+--- assigned-clock-parents: should point on required RFTCLK clock parent to be selected
+-- cpts_clock_mult: (optional) Numerator to convert input clock ticks
+ into nanoseconds
+-- cpts_clock_shift: (optional) Denominator to convert input clock ticks into
+ nanoseconds.
+ Mult and shift will be calculated basing on CPTS
+ rftclk frequency if both cpts_clock_shift and
+ cpts_clock_mult properties are not provided.
+
Optional properties:
- enable-ale: NetCP driver keeps the address learning feature in the ethernet
switch module disabled. This attribute is to enable the address
@@ -168,6 +185,23 @@ netcp: netcp@2000000 {
tx-queue = <648>;
tx-channel = <8>;
+ cpts {
+ clocks = <&cpts_refclk_mux>;
+ clock-names = "cpts";
+
+ cpts_refclk_mux: cpts-refclk-mux {
+ #clock-cells = <0>;
+ clocks = <&chipclk12>, <&chipclk13>,
+ <&timi0>, <&timi1>,
+ <&tsipclka>, <&tsrefclk>,
+ <&tsipclkb>;
+ ti,mux-tbl = <0x0>, <0x1>, <0x2>,
+ <0x3>, <0x4>, <0x8>, <0xC>;
+ assigned-clocks = <&cpts_refclk_mux>;
+ assigned-clock-parents = <&chipclk12>;
+ };
+ };
+
interfaces {
gbe0: interface-0 {
slave-port = <0>;
@@ -219,3 +253,13 @@ netcp: netcp@2000000 {
};
};
};
+
+CPTS board configuration - select external CPTS RFTCLK:
+
+&tsrefclk{
+ clock-frequency = <500000000>;
+};
+
+&cpts_refclk_mux {
+ assigned-clock-parents = <&tsrefclk>;
+};
diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt
index 9c5e94482b5f..63c73fafe26d 100644
--- a/Documentation/devicetree/bindings/net/macb.txt
+++ b/Documentation/devicetree/bindings/net/macb.txt
@@ -15,8 +15,11 @@ Required properties:
Use "atmel,sama5d4-gem" for the GEM IP (10/100) available on Atmel sama5d4 SoCs.
Use "cdns,zynq-gem" Xilinx Zynq-7xxx SoC.
Use "cdns,zynqmp-gem" for Zynq Ultrascale+ MPSoC.
+ Use "sifive,fu540-macb" for SiFive FU540-C000 SoC.
Or the generic form: "cdns,emac".
- reg: Address and length of the register set for the device
+ For "sifive,fu540-macb", second range is required to specify the
+ address and length of the registers for GEMGXL Management block.
- interrupts: Should contain macb interrupt
- phy-mode: See ethernet.txt file in the same directory.
- clock-names: Tuple listing input clock names.
diff --git a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
index 17d6819669c8..612a8e8abc88 100644
--- a/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
+++ b/Documentation/devicetree/bindings/net/socfpga-dwmac.txt
@@ -6,11 +6,17 @@ present in Documentation/devicetree/bindings/net/stmmac.txt.
The device node has additional properties:
Required properties:
- - compatible : Should contain "altr,socfpga-stmmac" along with
- "snps,dwmac" and any applicable more detailed
+ - compatible : For Cyclone5/Arria5 SoCs it should contain
+ "altr,socfpga-stmmac". For Arria10/Agilex/Stratix10 SoCs
+ "altr,socfpga-stmmac-a10-s10".
+ Along with "snps,dwmac" and any applicable more detailed
designware version numbers documented in stmmac.txt
- altr,sysmgr-syscon : Should be the phandle to the system manager node that
encompasses the glue register, the register offset, and the register shift.
+ On Cyclone5/Arria5, the register shift represents the PHY mode bits, while
+ on the Arria10/Stratix10/Agilex platforms, the register shift represents
+ bit for each emac to enable/disable signals from the FPGA fabric to the
+ EMAC modules.
- altr,f2h_ptp_ref_clk use f2h_ptp_ref_clk instead of default eosc1 clock
for ptp ref clk. This affects all emacs as the clock is common.
diff --git a/Documentation/devicetree/bindings/net/wiznet,w5x00.txt b/Documentation/devicetree/bindings/net/wiznet,w5x00.txt
new file mode 100644
index 000000000000..e9665798c4be
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/wiznet,w5x00.txt
@@ -0,0 +1,50 @@
+* Wiznet w5x00
+
+This is a standalone 10/100 MBit Ethernet controller with SPI interface.
+
+For each device connected to a SPI bus, define a child node within
+the SPI master node.
+
+Required properties:
+- compatible: Should be one of the following strings:
+ "wiznet,w5100"
+ "wiznet,w5200"
+ "wiznet,w5500"
+- reg: Specify the SPI chip select the chip is wired to.
+- interrupts: Specify the interrupt index within the interrupt controller (referred
+ to above in interrupt-parent) and interrupt type. w5x00 natively
+ generates falling edge interrupts, however, additional board logic
+ might invert the signal.
+- pinctrl-names: List of assigned state names, see pinctrl binding documentation.
+- pinctrl-0: List of phandles to configure the GPIO pin used as interrupt line,
+ see also generic and your platform specific pinctrl binding
+ documentation.
+
+Optional properties:
+- spi-max-frequency: Maximum frequency of the SPI bus when accessing the w5500.
+ According to the w5500 datasheet, the chip allows a maximum of 80 MHz, however,
+ board designs may need to limit this value.
+- local-mac-address: See ethernet.txt in the same directory.
+
+
+Example (for Raspberry Pi with pin control stuff for GPIO irq):
+
+&spi {
+ ethernet@0: w5500@0 {
+ compatible = "wiznet,w5500";
+ reg = <0>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&eth1_pins>;
+ interrupt-parent = <&gpio>;
+ interrupts = <25 IRQ_TYPE_EDGE_FALLING>;
+ spi-max-frequency = <30000000>;
+ };
+};
+
+&gpio {
+ eth1_pins: eth1_pins {
+ brcm,pins = <25>;
+ brcm,function = <0>; /* in */
+ brcm,pull = <0>; /* none */
+ };
+};
diff --git a/Documentation/devicetree/bindings/net/xilinx_axienet.txt b/Documentation/devicetree/bindings/net/xilinx_axienet.txt
index 38f9ec076743..7360617cdedb 100644
--- a/Documentation/devicetree/bindings/net/xilinx_axienet.txt
+++ b/Documentation/devicetree/bindings/net/xilinx_axienet.txt
@@ -17,8 +17,15 @@ For more details about mdio please refer phy.txt file in the same directory.
Required properties:
- compatible : Must be one of "xlnx,axi-ethernet-1.00.a",
"xlnx,axi-ethernet-1.01.a", "xlnx,axi-ethernet-2.01.a"
-- reg : Address and length of the IO space.
-- interrupts : Should be a list of two interrupt, TX and RX.
+- reg : Address and length of the IO space, as well as the address
+ and length of the AXI DMA controller IO space, unless
+ axistream-connected is specified, in which case the reg
+ attribute of the node referenced by it is used.
+- interrupts : Should be a list of 2 or 3 interrupts: TX DMA, RX DMA,
+ and optionally Ethernet core. If axistream-connected is
+ specified, the TX/RX DMA interrupts should be on that node
+ instead, and only the Ethernet core interrupt is optionally
+ specified here.
- phy-handle : Should point to the external phy device.
See ethernet.txt file in the same directory.
- xlnx,rxmem : Set to allocated memory buffer for Rx/Tx in the hardware
@@ -31,15 +38,29 @@ Optional properties:
1 to enable partial TX checksum offload,
2 to enable full TX checksum offload
- xlnx,rxcsum : Same values as xlnx,txcsum but for RX checksum offload
+- clocks : AXI bus clock for the device. Refer to common clock bindings.
+ Used to calculate MDIO clock divisor. If not specified, it is
+ auto-detected from the CPU clock (but only on platforms where
+ this is possible). New device trees should specify this - the
+ auto detection is only for backward compatibility.
+- axistream-connected: Reference to another node which contains the resources
+ for the AXI DMA controller used by this device.
+ If this is specified, the DMA-related resources from that
+ device (DMA registers and DMA TX/RX interrupts) rather
+ than this one will be used.
+ - mdio : Child node for MDIO bus. Must be defined if PHY access is
+ required through the core's MDIO interface (i.e. always,
+ unless the PHY is accessed through a different bus).
Example:
axi_ethernet_eth: ethernet@40c00000 {
compatible = "xlnx,axi-ethernet-1.00.a";
device_type = "network";
interrupt-parent = <&microblaze_0_axi_intc>;
- interrupts = <2 0>;
+ interrupts = <2 0 1>;
+ clocks = <&axi_clk>;
phy-mode = "mii";
- reg = <0x40c00000 0x40000>;
+ reg = <0x40c00000 0x40000 0x50c00000 0x40000>;
xlnx,rxcsum = <0x2>;
xlnx,rxmem = <0x800>;
xlnx,txcsum = <0x2>;
diff --git a/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt b/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
index 6ec053449278..d48f9eb3636e 100644
--- a/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
+++ b/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
@@ -4,7 +4,8 @@ General Properties:
- compatible Should be "fsl,etsec-ptp" for eTSEC
Should be "fsl,fman-ptp-timer" for DPAA FMan
- Should be "fsl,enetc-ptp" for ENETC
+ Should be "fsl,dpaa2-ptp" for DPAA2
+ Should be "fsl,enetc-ptp" for ENETC
- reg Offset and length of the register set for the device
- interrupts There should be at least two interrupts. Some devices
have as many as four PTP related interrupts.
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
new file mode 100644
index 000000000000..27f02ec4bb45
--- /dev/null
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -0,0 +1,168 @@
+# SPDX-License-Identifier: (GPL-2.0 OR MIT)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/riscv/cpus.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V bindings for 'cpus' DT nodes
+
+maintainers:
+ - Paul Walmsley <paul.walmsley@sifive.com>
+ - Palmer Dabbelt <palmer@sifive.com>
+
+allOf:
+ - $ref: /schemas/cpus.yaml#
+
+properties:
+ $nodename:
+ const: cpus
+ description: Container of cpu nodes
+
+ '#address-cells':
+ const: 1
+ description: |
+ A single unsigned 32-bit integer uniquely identifies each RISC-V
+ hart in a system. (See the "reg" node under the "cpu" node,
+ below).
+
+ '#size-cells':
+ const: 0
+
+patternProperties:
+ '^cpu@[0-9a-f]+$':
+ properties:
+ compatible:
+ type: array
+ items:
+ - enum:
+ - sifive,rocket0
+ - sifive,e5
+ - sifive,e51
+ - sifive,u54-mc
+ - sifive,u54
+ - sifive,u5
+ - const: riscv
+ description:
+ Identifies that the hart uses the RISC-V instruction set
+ and identifies the type of the hart.
+
+ mmu-type:
+ allOf:
+ - $ref: "/schemas/types.yaml#/definitions/string"
+ - enum:
+ - riscv,sv32
+ - riscv,sv39
+ - riscv,sv48
+ description:
+ Identifies the MMU address translation mode used on this
+ hart. These values originate from the RISC-V Privileged
+ Specification document, available from
+ https://riscv.org/specifications/
+
+ riscv,isa:
+ allOf:
+ - $ref: "/schemas/types.yaml#/definitions/string"
+ - enum:
+ - rv64imac
+ - rv64imafdc
+ description:
+ Identifies the specific RISC-V instruction set architecture
+ supported by the hart. These are documented in the RISC-V
+ User-Level ISA document, available from
+ https://riscv.org/specifications/
+
+ timebase-frequency:
+ type: integer
+ minimum: 1
+ description:
+ Specifies the clock frequency of the system timer in Hz.
+ This value is common to all harts on a single system image.
+
+ interrupt-controller:
+ type: object
+ description: Describes the CPU's local interrupt controller
+
+ properties:
+ '#interrupt-cells':
+ const: 1
+
+ compatible:
+ const: riscv,cpu-intc
+
+ interrupt-controller: true
+
+ required:
+ - '#interrupt-cells'
+ - compatible
+ - interrupt-controller
+
+ required:
+ - riscv,isa
+ - timebase-frequency
+ - interrupt-controller
+
+examples:
+ - |
+ // Example 1: SiFive Freedom U540G Development Kit
+ cpus {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ timebase-frequency = <1000000>;
+ cpu@0 {
+ clock-frequency = <0>;
+ compatible = "sifive,rocket0", "riscv";
+ device_type = "cpu";
+ i-cache-block-size = <64>;
+ i-cache-sets = <128>;
+ i-cache-size = <16384>;
+ reg = <0>;
+ riscv,isa = "rv64imac";
+ cpu_intc0: interrupt-controller {
+ #interrupt-cells = <1>;
+ compatible = "riscv,cpu-intc";
+ interrupt-controller;
+ };
+ };
+ cpu@1 {
+ clock-frequency = <0>;
+ compatible = "sifive,rocket0", "riscv";
+ d-cache-block-size = <64>;
+ d-cache-sets = <64>;
+ d-cache-size = <32768>;
+ d-tlb-sets = <1>;
+ d-tlb-size = <32>;
+ device_type = "cpu";
+ i-cache-block-size = <64>;
+ i-cache-sets = <64>;
+ i-cache-size = <32768>;
+ i-tlb-sets = <1>;
+ i-tlb-size = <32>;
+ mmu-type = "riscv,sv39";
+ reg = <1>;
+ riscv,isa = "rv64imafdc";
+ tlb-split;
+ cpu_intc1: interrupt-controller {
+ #interrupt-cells = <1>;
+ compatible = "riscv,cpu-intc";
+ interrupt-controller;
+ };
+ };
+ };
+
+ - |
+ // Example 2: Spike ISA Simulator with 1 Hart
+ cpus {
+ cpu@0 {
+ device_type = "cpu";
+ reg = <0>;
+ compatible = "riscv";
+ riscv,isa = "rv64imafdc";
+ mmu-type = "riscv,sv48";
+ interrupt-controller {
+ #interrupt-cells = <1>;
+ interrupt-controller;
+ compatible = "riscv,cpu-intc";
+ };
+ };
+ };
+...
diff --git a/Documentation/devicetree/bindings/riscv/sifive.yaml b/Documentation/devicetree/bindings/riscv/sifive.yaml
new file mode 100644
index 000000000000..9d17dc2f3f84
--- /dev/null
+++ b/Documentation/devicetree/bindings/riscv/sifive.yaml
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: (GPL-2.0 OR MIT)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/riscv/sifive.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SiFive SoC-based boards
+
+maintainers:
+ - Paul Walmsley <paul.walmsley@sifive.com>
+ - Palmer Dabbelt <palmer@sifive.com>
+
+description:
+ SiFive SoC-based boards
+
+properties:
+ $nodename:
+ const: '/'
+ compatible:
+ items:
+ - enum:
+ - sifive,freedom-unleashed-a00
+ - const: sifive,fu540-c000
+ - const: sifive,fu540
+...
diff --git a/Documentation/driver-api/80211/mac80211-advanced.rst b/Documentation/driver-api/80211/mac80211-advanced.rst
index 70a89b2163c2..9f1c5bb7ac35 100644
--- a/Documentation/driver-api/80211/mac80211-advanced.rst
+++ b/Documentation/driver-api/80211/mac80211-advanced.rst
@@ -226,9 +226,6 @@ TBD
.. kernel-doc:: include/net/mac80211.h
:functions: ieee80211_tx_rate_control
-.. kernel-doc:: include/net/mac80211.h
- :functions: rate_control_send_low
-
TBD
This part of the book describes mac80211 internals.
diff --git a/Documentation/driver-api/uio-howto.rst b/Documentation/driver-api/uio-howto.rst
index 25f50eace28b..8fecfa11d4ff 100644
--- a/Documentation/driver-api/uio-howto.rst
+++ b/Documentation/driver-api/uio-howto.rst
@@ -276,8 +276,8 @@ fields of ``struct uio_mem``:
- ``int memtype``: Required if the mapping is used. Set this to
``UIO_MEM_PHYS`` if you you have physical memory on your card to be
mapped. Use ``UIO_MEM_LOGICAL`` for logical memory (e.g. allocated
- with :c:func:`kmalloc()`). There's also ``UIO_MEM_VIRTUAL`` for
- virtual memory.
+ with :c:func:`__get_free_pages()` but not kmalloc()). There's also
+ ``UIO_MEM_VIRTUAL`` for virtual memory.
- ``phys_addr_t addr``: Required if the mapping is used. Fill in the
address of your memory block. This address is the one that appears in
diff --git a/Documentation/fb/fbcon.txt b/Documentation/fb/fbcon.txt
index 60a5ec04e8f0..5a865437b33f 100644
--- a/Documentation/fb/fbcon.txt
+++ b/Documentation/fb/fbcon.txt
@@ -79,7 +79,7 @@ C. Boot options
Select the initial font to use. The value 'name' can be any of the
compiled-in fonts: 10x18, 6x10, 7x14, Acorn8x8, MINI4x6,
- PEARL8x8, ProFont6x11, SUN12x22, SUN8x16, VGA8x16, VGA8x8.
+ PEARL8x8, ProFont6x11, SUN12x22, SUN8x16, TER16x32, VGA8x16, VGA8x8.
Note, not all drivers can handle font with widths not divisible by 8,
such as vga16fb.
diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index eef7d9d259e8..1da2f1668f08 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -336,8 +336,20 @@ the copied layers will fail the verification of the lower root file handle.
Non-standard behavior
---------------------
-Overlayfs can now act as a POSIX compliant filesystem with the following
-features turned on:
+Current version of overlayfs can act as a mostly POSIX compliant
+filesystem.
+
+This is the list of cases that overlayfs doesn't currently handle:
+
+a) POSIX mandates updating st_atime for reads. This is currently not
+done in the case when the file resides on a lower layer.
+
+b) If a file residing on a lower layer is opened for read-only and then
+memory mapped with MAP_SHARED, then subsequent changes to the file are not
+reflected in the memory mapping.
+
+The following options allow overlayfs to act more like a standards
+compliant filesystem:
1) "redirect_dir"
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index e14d7d40fc75..50bccbf68308 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -316,16 +316,16 @@ A: When a netdev of a physical NIC is initialized, Linux usually
all the traffic, you can force the netdev to only have 1 queue, queue
id 0, and then bind to queue 0. You can use ethtool to do this::
- sudo ethtool -L <interface> combined 1
+ sudo ethtool -L <interface> combined 1
If you want to only see part of the traffic, you can program the
NIC through ethtool to filter out your traffic to a single queue id
that you can bind your XDP socket to. Here is one example in which
UDP traffic to and from port 4242 are sent to queue 2::
- sudo ethtool -N <interface> rx-flow-hash udp4 fn
- sudo ethtool -N <interface> flow-type udp4 src-port 4242 dst-port \
- 4242 action 2
+ sudo ethtool -N <interface> rx-flow-hash udp4 fn
+ sudo ethtool -N <interface> flow-type udp4 src-port 4242 dst-port \
+ 4242 action 2
A number of other ways are possible all up to the capabilitites of
the NIC you have.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index 75fa537763a4..24598d5f8ffa 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -21,6 +21,7 @@ Contents:
intel/i40e
intel/iavf
intel/ice
+ mellanox/mlx5
.. only:: subproject
diff --git a/Documentation/networking/device_drivers/mellanox/mlx5.rst b/Documentation/networking/device_drivers/mellanox/mlx5.rst
new file mode 100644
index 000000000000..4eeef2df912f
--- /dev/null
+++ b/Documentation/networking/device_drivers/mellanox/mlx5.rst
@@ -0,0 +1,173 @@
+.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+
+=================================================
+Mellanox ConnectX(R) mlx5 core VPI Network Driver
+=================================================
+
+Copyright (c) 2019, Mellanox Technologies LTD.
+
+Contents
+========
+
+- `Enabling the driver and kconfig options`_
+- `Devlink health reporters`_
+
+Enabling the driver and kconfig options
+================================================
+
+| mlx5 core is modular and most of the major mlx5 core driver features can be selected (compiled in/out)
+| at build time via kernel Kconfig flags.
+| Basic features, ethernet net device rx/tx offloads and XDP, are available with the most basic flags
+| CONFIG_MLX5_CORE=y/m and CONFIG_MLX5_CORE_EN=y.
+| For the list of advanced features please see below.
+
+**CONFIG_MLX5_CORE=(y/m/n)** (module mlx5_core.ko)
+
+| The driver can be enabled by choosing CONFIG_MLX5_CORE=y/m in kernel config.
+| This will provide mlx5 core driver for mlx5 ulps to interface with (mlx5e, mlx5_ib).
+
+
+**CONFIG_MLX5_CORE_EN=(y/n)**
+
+| Choosing this option will allow basic ethernet netdevice support with all of the standard rx/tx offloads.
+| mlx5e is the mlx5 ulp driver which provides netdevice kernel interface, when chosen, mlx5e will be
+| built-in into mlx5_core.ko.
+
+
+**CONFIG_MLX5_EN_ARFS=(y/n)**
+
+| Enables Hardware-accelerated receive flow steering (arfs) support, and ntuple filtering.
+| https://community.mellanox.com/s/article/howto-configure-arfs-on-connectx-4
+
+
+**CONFIG_MLX5_EN_RXNFC=(y/n)**
+
+| Enables ethtool receive network flow classification, which allows user defined
+| flow rules to direct traffic into arbitrary rx queue via ethtool set/get_rxnfc API.
+
+
+**CONFIG_MLX5_CORE_EN_DCB=(y/n)**:
+
+| Enables `Data Center Bridging (DCB) Support <https://community.mellanox.com/s/article/howto-auto-config-pfc-and-ets-on-connectx-4-via-lldp-dcbx>`_.
+
+
+**CONFIG_MLX5_MPFS=(y/n)**
+
+| Ethernet Multi-Physical Function Switch (MPFS) support in ConnectX NIC.
+| MPFs is required for when `Multi-Host <http://www.mellanox.com/page/multihost>`_ configuration is enabled to allow passing
+| user configured unicast MAC addresses to the requesting PF.
+
+
+**CONFIG_MLX5_ESWITCH=(y/n)**
+
+| Ethernet SRIOV E-Switch support in ConnectX NIC. E-Switch provides internal SRIOV packet steering
+| and switching for the enabled VFs and PF in two available modes:
+| 1) `Legacy SRIOV mode (L2 mac vlan steering based) <https://community.mellanox.com/s/article/howto-configure-sr-iov-for-connectx-4-connectx-5-with-kvm--ethernet-x>`_.
+| 2) `Switchdev mode (eswitch offloads) <https://www.mellanox.com/related-docs/prod_software/ASAP2_Hardware_Offloading_for_vSwitches_User_Manual_v4.4.pdf>`_.
+
+
+**CONFIG_MLX5_CORE_IPOIB=(y/n)**
+
+| IPoIB offloads & acceleration support.
+| Requires CONFIG_MLX5_CORE_EN to provide an accelerated interface for the rdma
+| IPoIB ulp netdevice.
+
+
+**CONFIG_MLX5_FPGA=(y/n)**
+
+| Build support for the Innova family of network cards by Mellanox Technologies.
+| Innova network cards are comprised of a ConnectX chip and an FPGA chip on one board.
+| If you select this option, the mlx5_core driver will include the Innova FPGA core and allow
+| building sandbox-specific client drivers.
+
+
+**CONFIG_MLX5_EN_IPSEC=(y/n)**
+
+| Enables `IPSec XFRM cryptography-offload accelaration <http://www.mellanox.com/related-docs/prod_software/Mellanox_Innova_IPsec_Ethernet_Adapter_Card_User_Manual.pdf>`_.
+
+**CONFIG_MLX5_EN_TLS=(y/n)**
+
+| TLS cryptography-offload accelaration.
+
+
+**CONFIG_MLX5_INFINIBAND=(y/n/m)** (module mlx5_ib.ko)
+
+| Provides low-level InfiniBand/RDMA and `RoCE <https://community.mellanox.com/s/article/recommended-network-configuration-examples-for-roce-deployment>`_ support.
+
+
+**External options** ( Choose if the corresponding mlx5 feature is required )
+
+- CONFIG_PTP_1588_CLOCK: When chosen, mlx5 ptp support will be enabled
+- CONFIG_VXLAN: When chosen, mlx5 vxaln support will be enabled.
+- CONFIG_MLXFW: When chosen, mlx5 firmware flashing support will be enabled (via devlink and ethtool).
+
+
+Devlink health reporters
+========================
+
+tx reporter
+-----------
+The tx reporter is responsible of two error scenarios:
+
+- TX timeout
+ Report on kernel tx timeout detection.
+ Recover by searching lost interrupts.
+- TX error completion
+ Report on error tx completion.
+ Recover by flushing the TX queue and reset it.
+
+TX reporter also support Diagnose callback, on which it provides
+real time information of its send queues status.
+
+User commands examples:
+
+- Diagnose send queues status::
+
+ $ devlink health diagnose pci/0000:82:00.0 reporter tx
+
+- Show number of tx errors indicated, number of recover flows ended successfully,
+ is autorecover enabled and graceful period from last recover::
+
+ $ devlink health show pci/0000:82:00.0 reporter tx
+
+fw reporter
+-----------
+The fw reporter implements diagnose and dump callbacks.
+It follows symptoms of fw error such as fw syndrome by triggering
+fw core dump and storing it into the dump buffer.
+The fw reporter diagnose command can be triggered any time by the user to check
+current fw status.
+
+User commands examples:
+
+- Check fw heath status::
+
+ $ devlink health diagnose pci/0000:82:00.0 reporter fw
+
+- Read FW core dump if already stored or trigger new one::
+
+ $ devlink health dump show pci/0000:82:00.0 reporter fw
+
+NOTE: This command can run only on the PF which has fw tracer ownership,
+running it on other PF or any VF will return "Operation not permitted".
+
+fw fatal reporter
+-----------------
+The fw fatal reporter implements dump and recover callbacks.
+It follows fatal errors indications by CR-space dump and recover flow.
+The CR-space dump uses vsc interface which is valid even if the FW command
+interface is not functional, which is the case in most FW fatal errors.
+The recover function runs recover flow which reloads the driver and triggers fw
+reset if needed.
+
+User commands examples:
+
+- Run fw recover flow manually::
+
+ $ devlink health recover pci/0000:82:00.0 reporter fw_fatal
+
+- Read FW CR-space dump if already strored or trigger new one::
+
+ $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
+
+NOTE: This command can run only on PF.
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index a73b3a02e49a..e0d8a96e2c67 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -80,6 +80,7 @@ fib_multipath_hash_policy - INTEGER
Possible values:
0 - Layer 3
1 - Layer 4
+ 2 - Layer 3 or inner Layer 3 if present
fib_sync_mem - UNSIGNED INTEGER
Amount of dirty memory from fib entries that can be backlogged before
@@ -255,6 +256,14 @@ tcp_base_mss - INTEGER
Path MTU discovery (MTU probing). If MTU probing is enabled,
this is the initial MSS used by the connection.
+tcp_min_snd_mss - INTEGER
+ TCP SYN and SYNACK messages usually advertise an ADVMSS option,
+ as described in RFC 1122 and RFC 6691.
+ If this ADVMSS option is smaller than tcp_min_snd_mss,
+ it is silently capped to tcp_min_snd_mss.
+
+ Default : 48 (at least 8 bytes of payload per segment)
+
tcp_congestion_control - STRING
Set the congestion control algorithm to be used for new
connections. The algorithm "reno" is always available, but
@@ -792,6 +801,14 @@ tcp_challenge_ack_limit - INTEGER
in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks)
Default: 100
+tcp_rx_skb_cache - BOOLEAN
+ Controls a per TCP socket cache of one skb, that might help
+ performance of some workloads. This might be dangerous
+ on systems with a lot of TCP sockets, since it increases
+ memory usage.
+
+ Default: 0 (disabled)
+
UDP variables:
udp_l3mdev_accept - BOOLEAN
@@ -1429,14 +1446,24 @@ flowlabel_state_ranges - BOOLEAN
FALSE: disabled
Default: true
-flowlabel_reflect - BOOLEAN
- Automatically reflect the flow label. Needed for Path MTU
+flowlabel_reflect - INTEGER
+ Control flow label reflection. Needed for Path MTU
Discovery to work with Equal Cost Multipath Routing in anycast
environments. See RFC 7690 and:
https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01
- TRUE: enabled
- FALSE: disabled
- Default: FALSE
+
+ This is a mask of two bits.
+ 1: enabled for established flows
+
+ Note that this prevents automatic flowlabel changes, as done
+ in "tcp: change IPv6 flow-label upon receiving spurious retransmission"
+ and "tcp: Change txhash on every SYN and RTO retransmit"
+
+ 2: enabled for TCP RESET packets (no active listener)
+ If set, a RST packet sent in response to a SYN packet on a closed
+ port will reflect the incoming flow label.
+
+ Default: 0
fib_multipath_hash_policy - INTEGER
Controls which hash policy to use for multipath routes.
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index 0dd90d7df5ec..a689966bc4be 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -202,7 +202,8 @@ the PHY/controller, of which the PHY needs to be aware.
*interface* is a u32 which specifies the connection type used
between the controller and the PHY. Examples are GMII, MII,
-RGMII, and SGMII. For a full list, see include/linux/phy.h
+RGMII, and SGMII. See "PHY interface mode" below. For a full
+list, see include/linux/phy.h
Now just make sure that phydev->supported and phydev->advertising have any
values pruned from them which don't make sense for your controller (a 10/100
@@ -225,6 +226,48 @@ When you want to disconnect from the network (even if just briefly), you call
phy_stop(phydev). This function also stops the phylib state machine and
disables PHY interrupts.
+PHY interface modes
+===================
+
+The PHY interface mode supplied in the phy_connect() family of functions
+defines the initial operating mode of the PHY interface. This is not
+guaranteed to remain constant; there are PHYs which dynamically change
+their interface mode without software interaction depending on the
+negotiation results.
+
+Some of the interface modes are described below:
+
+``PHY_INTERFACE_MODE_1000BASEX``
+ This defines the 1000BASE-X single-lane serdes link as defined by the
+ 802.3 standard section 36. The link operates at a fixed bit rate of
+ 1.25Gbaud using a 10B/8B encoding scheme, resulting in an underlying
+ data rate of 1Gbps. Embedded in the data stream is a 16-bit control
+ word which is used to negotiate the duplex and pause modes with the
+ remote end. This does not include "up-clocked" variants such as 2.5Gbps
+ speeds (see below.)
+
+``PHY_INTERFACE_MODE_2500BASEX``
+ This defines a variant of 1000BASE-X which is clocked 2.5 times faster,
+ than the 802.3 standard giving a fixed bit rate of 3.125Gbaud.
+
+``PHY_INTERFACE_MODE_SGMII``
+ This is used for Cisco SGMII, which is a modification of 1000BASE-X
+ as defined by the 802.3 standard. The SGMII link consists of a single
+ serdes lane running at a fixed bit rate of 1.25Gbaud with 10B/8B
+ encoding. The underlying data rate is 1Gbps, with the slower speeds of
+ 100Mbps and 10Mbps being achieved through replication of each data symbol.
+ The 802.3 control word is re-purposed to send the negotiated speed and
+ duplex information from to the MAC, and for the MAC to acknowledge
+ receipt. This does not include "up-clocked" variants such as 2.5Gbps
+ speeds.
+
+ Note: mismatched SGMII vs 1000BASE-X configuration on a link can
+ successfully pass data in some circumstances, but the 16-bit control
+ word will not be correctly interpreted, which may cause mismatches in
+ duplex, pause or other settings. This is dependent on the MAC and/or
+ PHY behaviour.
+
+
Pause frames / flow control
===========================
diff --git a/Documentation/networking/rds.txt b/Documentation/networking/rds.txt
index 0235ae69af2a..f2a0147c933d 100644
--- a/Documentation/networking/rds.txt
+++ b/Documentation/networking/rds.txt
@@ -389,7 +389,7 @@ Multipath RDS (mprds)
a common (to all paths) part, and a per-path struct rds_conn_path. All
I/O workqs and reconnect threads are driven from the rds_conn_path.
Transports such as TCP that are multipath capable may then set up a
- TPC socket per rds_conn_path, and this is managed by the transport via
+ TCP socket per rds_conn_path, and this is managed by the transport via
the transport privatee cp_transport_data pointer.
Transports announce themselves as multipath capable by setting the
diff --git a/Documentation/networking/tls-offload.rst b/Documentation/networking/tls-offload.rst
index eb7c9b81ccf5..048e5ca44824 100644
--- a/Documentation/networking/tls-offload.rst
+++ b/Documentation/networking/tls-offload.rst
@@ -206,7 +206,11 @@ TX
Segments transmitted from an offloaded socket can get out of sync
in similar ways to the receive side-retransmissions - local drops
-are possible, though network reorders are not.
+are possible, though network reorders are not. There are currently
+two mechanisms for dealing with out of order segments.
+
+Crypto state rebuilding
+~~~~~~~~~~~~~~~~~~~~~~~
Whenever an out of order segment is transmitted the driver provides
the device with enough information to perform cryptographic operations.
@@ -225,6 +229,35 @@ was just a retransmission. The former is simpler, and does not require
retransmission detection therefore it is the recommended method until
such time it is proven inefficient.
+Next record sync
+~~~~~~~~~~~~~~~~
+
+Whenever an out of order segment is detected the driver requests
+that the ``ktls`` software fallback code encrypt it. If the segment's
+sequence number is lower than expected the driver assumes retransmission
+and doesn't change device state. If the segment is in the future, it
+may imply a local drop, the driver asks the stack to sync the device
+to the next record state and falls back to software.
+
+Resync request is indicated with:
+
+.. code-block:: c
+
+ void tls_offload_tx_resync_request(struct sock *sk, u32 got_seq, u32 exp_seq)
+
+Until resync is complete driver should not access its expected TCP
+sequence number (as it will be updated from a different context).
+Following helper should be used to test if resync is complete:
+
+.. code-block:: c
+
+ bool tls_offload_tx_resync_pending(struct sock *sk)
+
+Next time ``ktls`` pushes a record it will first send its TCP sequence number
+and TLS record number to the driver. Stack will also make sure that
+the new record will start on a segment boundary (like it does when
+the connection is initially added).
+
RX
--
@@ -268,6 +301,9 @@ Device can only detect that segment 4 also contains a TLS header
if it knows the length of the previous record from segment 2. In this case
the device will lose synchronization with the stream.
+Stream scan resynchronization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
When the device gets out of sync and the stream reaches TCP sequence
numbers more than a max size record past the expected TCP sequence number,
the device starts scanning for a known header pattern. For example
@@ -298,6 +334,22 @@ Special care has to be taken if the confirmation request is passed
asynchronously to the packet stream and record may get processed
by the kernel before the confirmation request.
+Stack-driven resynchronization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The driver may also request the stack to perform resynchronization
+whenever it sees the records are no longer getting decrypted.
+If the connection is configured in this mode the stack automatically
+schedules resynchronization after it has received two completely encrypted
+records.
+
+The stack waits for the socket to drain and informs the device about
+the next expected record number and its TCP sequence number. If the
+records continue to be received fully encrypted stack retries the
+synchronization with an exponential back off (first after 2 encrypted
+records, then after 4 records, after 8, after 16... up until every
+128 records).
+
Error handling
==============
diff --git a/Documentation/usb/rio.txt b/Documentation/usb/rio.txt
index ca9adcf56355..ea73475471db 100644
--- a/Documentation/usb/rio.txt
+++ b/Documentation/usb/rio.txt
@@ -76,70 +76,30 @@ Additional Information and userspace tools
Requirements
============
-A host with a USB port. Ideally, either a UHCI (Intel) or OHCI
-(Compaq and others) hardware port should work.
+A host with a USB port running a Linux kernel with RIO 500 support enabled.
-A Linux development kernel (2.3.x) with USB support enabled or a
-backported version to linux-2.2.x. See http://www.linux-usb.org for
-more information on accomplishing this.
+The driver is a module called rio500, which should be automatically loaded
+as you plug in your device. If that fails you can manually load it with
-A Linux kernel with RIO 500 support enabled.
+ modprobe rio500
-'lspci' which is only needed to determine the type of USB hardware
-available in your machine.
-
-Configuration
-
-Using `lspci -v`, determine the type of USB hardware available.
-
- If you see something like::
-
- USB Controller: ......
- Flags: .....
- I/O ports at ....
-
- Then you have a UHCI based controller.
-
- If you see something like::
-
- USB Controller: .....
- Flags: ....
- Memory at .....
-
- Then you have a OHCI based controller.
-
-Using `make menuconfig` or your preferred method for configuring the
-kernel, select 'Support for USB', 'OHCI/UHCI' depending on your
-hardware (determined from the steps above), 'USB Diamond Rio500 support', and
-'Preliminary USB device filesystem'. Compile and install the modules
-(you may need to execute `depmod -a` to update the module
-dependencies).
-
-Add a device for the USB rio500::
+Udev should automatically create a device node as soon as plug in your device.
+If that fails, you can manually add a device for the USB rio500::
mknod /dev/usb/rio500 c 180 64
-Set appropriate permissions for /dev/usb/rio500 (don't forget about
-group and world permissions). Both read and write permissions are
+In that case, set appropriate permissions for /dev/usb/rio500 (don't forget
+about group and world permissions). Both read and write permissions are
required for proper operation.
-Load the appropriate modules (if compiled as modules):
-
- OHCI::
-
- modprobe usbcore
- modprobe usb-ohci
- modprobe rio500
-
- UHCI::
-
- modprobe usbcore
- modprobe usb-uhci (or uhci)
- modprobe rio500
-
That's it. The Rio500 Utils at: http://rio500.sourceforge.net should
be able to access the rio500.
+Limits
+======
+
+You can use only a single rio500 device at a time with your computer.
+
Bugs
====
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index ba6c42c576dd..2a4531bb06bd 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1079,7 +1079,7 @@ yet and must be cleared on entry.
4.35 KVM_SET_USER_MEMORY_REGION
-Capability: KVM_CAP_USER_MEM
+Capability: KVM_CAP_USER_MEMORY
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
@@ -3857,43 +3857,59 @@ Type: vcpu ioctl
Parameters: struct kvm_nested_state (in/out)
Returns: 0 on success, -1 on error
Errors:
- E2BIG: the total state size (including the fixed-size part of struct
- kvm_nested_state) exceeds the value of 'size' specified by
+ E2BIG: the total state size exceeds the value of 'size' specified by
the user; the size required will be written into size.
struct kvm_nested_state {
__u16 flags;
__u16 format;
__u32 size;
+
union {
- struct kvm_vmx_nested_state vmx;
- struct kvm_svm_nested_state svm;
+ struct kvm_vmx_nested_state_hdr vmx;
+ struct kvm_svm_nested_state_hdr svm;
+
+ /* Pad the header to 128 bytes. */
__u8 pad[120];
- };
- __u8 data[0];
+ } hdr;
+
+ union {
+ struct kvm_vmx_nested_state_data vmx[0];
+ struct kvm_svm_nested_state_data svm[0];
+ } data;
};
#define KVM_STATE_NESTED_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_RUN_PENDING 0x00000002
+#define KVM_STATE_NESTED_EVMCS 0x00000004
-#define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
-#define KVM_STATE_NESTED_SMM_VMXON 0x00000002
+#define KVM_STATE_NESTED_FORMAT_VMX 0
+#define KVM_STATE_NESTED_FORMAT_SVM 1
-struct kvm_vmx_nested_state {
+#define KVM_STATE_NESTED_VMX_VMCS_SIZE 0x1000
+
+#define KVM_STATE_NESTED_VMX_SMM_GUEST_MODE 0x00000001
+#define KVM_STATE_NESTED_VMX_SMM_VMXON 0x00000002
+
+struct kvm_vmx_nested_state_hdr {
__u64 vmxon_pa;
- __u64 vmcs_pa;
+ __u64 vmcs12_pa;
struct {
__u16 flags;
} smm;
};
+struct kvm_vmx_nested_state_data {
+ __u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
+ __u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
+};
+
This ioctl copies the vcpu's nested virtualization state from the kernel to
userspace.
-The maximum size of the state, including the fixed-size part of struct
-kvm_nested_state, can be retrieved by passing KVM_CAP_NESTED_STATE to
-the KVM_CHECK_EXTENSION ioctl().
+The maximum size of the state can be retrieved by passing KVM_CAP_NESTED_STATE
+to the KVM_CHECK_EXTENSION ioctl().
4.115 KVM_SET_NESTED_STATE
@@ -3903,8 +3919,8 @@ Type: vcpu ioctl
Parameters: struct kvm_nested_state (in)
Returns: 0 on success, -1 on error
-This copies the vcpu's kvm_nested_state struct from userspace to the kernel. For
-the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
+This copies the vcpu's kvm_nested_state struct from userspace to the kernel.
+For the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
4.116 KVM_(UN)REGISTER_COALESCED_MMIO
diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index ec1efa32af3c..7cdf7282e022 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -288,15 +288,17 @@ For instance if the device flags for device entries are:
WRITE (1 << 62)
Now let say that device driver wants to fault with at least read a range then
-it does set:
- range->default_flags = (1 << 63)
+it does set::
+
+ range->default_flags = (1 << 63);
range->pfn_flags_mask = 0;
and calls hmm_range_fault() as described above. This will fill fault all page
in the range with at least read permission.
Now let say driver wants to do the same except for one page in the range for
-which its want to have write. Now driver set:
+which its want to have write. Now driver set::
+
range->default_flags = (1 << 63);
range->pfn_flags_mask = (1 << 62);
range->pfns[index_of_write] = (1 << 62);