summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-block12
-rw-r--r--Documentation/ABI/testing/sysfs-block-device43
-rw-r--r--Documentation/ABI/testing/sysfs-bus-event_source-devices-uncore13
-rw-r--r--Documentation/ABI/testing/sysfs-bus-platform14
-rw-r--r--Documentation/ABI/testing/sysfs-driver-ge-achc15
-rw-r--r--Documentation/ABI/testing/sysfs-platform-dptf40
-rw-r--r--Documentation/ABI/testing/sysfs-platform_profile7
-rw-r--r--Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst29
-rw-r--r--Documentation/RCU/Design/Requirements/Requirements.rst8
-rw-r--r--Documentation/RCU/checklist.rst24
-rw-r--r--Documentation/RCU/rcu_dereference.rst6
-rw-r--r--Documentation/RCU/stallwarn.rst31
-rw-r--r--Documentation/admin-guide/binderfs.rst13
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst11
-rw-r--r--Documentation/admin-guide/device-mapper/dm-ima.rst715
-rw-r--r--Documentation/admin-guide/device-mapper/index.rst1
-rw-r--r--Documentation/admin-guide/device-mapper/writecache.rst16
-rw-r--r--Documentation/admin-guide/hw-vuln/index.rst1
-rw-r--r--Documentation/admin-guide/hw-vuln/l1d_flush.rst69
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt21
-rw-r--r--Documentation/atomic_t.txt94
-rw-r--r--Documentation/bpf/index.rst10
-rw-r--r--Documentation/bpf/libbpf/index.rst (renamed from Documentation/bpf/libbpf/libbpf.rst)8
-rw-r--r--Documentation/bpf/libbpf/libbpf_api.rst27
-rw-r--r--Documentation/bpf/libbpf/libbpf_naming_convention.rst6
-rw-r--r--Documentation/core-api/cpu_hotplug.rst2
-rw-r--r--Documentation/core-api/irq/irq-domain.rst28
-rw-r--r--Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.txt44
-rw-r--r--Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.yaml89
-rw-r--r--Documentation/devicetree/bindings/fpga/xlnx,versal-fpga.yaml33
-rw-r--r--Documentation/devicetree/bindings/fsi/ibm,fsi2spi.yaml1
-rw-r--r--Documentation/devicetree/bindings/gpio/rockchip,gpio-bank.yaml5
-rw-r--r--Documentation/devicetree/bindings/hwmon/amd,sbrmi.yaml53
-rw-r--r--Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml41
-rw-r--r--Documentation/devicetree/bindings/iio/st,st-sensors.yaml41
-rw-r--r--Documentation/devicetree/bindings/interconnect/qcom,osm-l3.yaml1
-rw-r--r--Documentation/devicetree/bindings/interconnect/qcom,rpmh.yaml11
-rw-r--r--Documentation/devicetree/bindings/leds/common.yaml6
-rw-r--r--Documentation/devicetree/bindings/misc/ge-achc.txt26
-rw-r--r--Documentation/devicetree/bindings/misc/ge-achc.yaml65
-rw-r--r--Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml18
-rw-r--r--Documentation/devicetree/bindings/mmc/mmc-pwrseq-sd8787.yaml4
-rw-r--r--Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml133
-rw-r--r--Documentation/devicetree/bindings/mmc/sdhci-msm.txt1
-rw-r--r--Documentation/devicetree/bindings/net/brcm,unimac-mdio.txt43
-rw-r--r--Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml84
-rw-r--r--Documentation/devicetree/bindings/net/can/bosch,c_can.yaml119
-rw-r--r--Documentation/devicetree/bindings/net/can/bosch,m_can.yaml9
-rw-r--r--Documentation/devicetree/bindings/net/can/c_can.txt65
-rw-r--r--Documentation/devicetree/bindings/net/can/can-controller.yaml9
-rw-r--r--Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml17
-rw-r--r--Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml69
-rw-r--r--Documentation/devicetree/bindings/net/fsl,fec.yaml244
-rw-r--r--Documentation/devicetree/bindings/net/fsl-fec.txt95
-rw-r--r--Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml54
-rw-r--r--Documentation/devicetree/bindings/net/litex,liteeth.yaml98
-rw-r--r--Documentation/devicetree/bindings/net/macb.txt1
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ipa.yaml24
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml15
-rw-r--r--Documentation/devicetree/bindings/nvmem/nintendo-otp.yaml44
-rw-r--r--Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml3
-rw-r--r--Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.txt20
-rw-r--r--Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml53
-rw-r--r--Documentation/devicetree/bindings/phy/intel,keembay-phy-usb.yaml (renamed from Documentation/devicetree/bindings/phy/intel,phy-keembay-usb.yaml)2
-rw-r--r--Documentation/devicetree/bindings/phy/mediatek,tphy.yaml30
-rw-r--r--Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml4
-rw-r--r--Documentation/devicetree/bindings/phy/qcom,qmp-usb3-dp-phy.yaml1
-rw-r--r--Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml15
-rw-r--r--Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml1
-rw-r--r--Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt82
-rw-r--r--Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.yaml103
-rw-r--r--Documentation/devicetree/bindings/power/supply/battery.yaml14
-rw-r--r--Documentation/devicetree/bindings/power/supply/maxim,max17042.yaml3
-rw-r--r--Documentation/devicetree/bindings/power/supply/mt6360_charger.yaml48
-rw-r--r--Documentation/devicetree/bindings/power/supply/summit,smb347-charger.yaml30
-rw-r--r--Documentation/devicetree/bindings/power/supply/x-powers,axp20x-ac-power-supply.yaml11
-rw-r--r--Documentation/devicetree/bindings/power/supply/x-powers,axp20x-battery-power-supply.yaml12
-rw-r--r--Documentation/devicetree/bindings/power/supply/x-powers,axp20x-usb-power-supply.yaml14
-rw-r--r--Documentation/devicetree/bindings/regulator/richtek,rtq2134-regulator.yaml106
-rw-r--r--Documentation/devicetree/bindings/regulator/richtek,rtq6752-regulator.yaml76
-rw-r--r--Documentation/devicetree/bindings/regulator/socionext,uniphier-regulator.yaml85
-rw-r--r--Documentation/devicetree/bindings/regulator/uniphier-regulator.txt58
-rw-r--r--Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml8
-rw-r--r--Documentation/devicetree/bindings/spi/omap-spi.txt48
-rw-r--r--Documentation/devicetree/bindings/spi/omap-spi.yaml117
-rw-r--r--Documentation/devicetree/bindings/spi/rockchip-sfc.yaml91
-rw-r--r--Documentation/devicetree/bindings/spi/spi-mt65xx.txt1
-rw-r--r--Documentation/devicetree/bindings/spi/spi-sprd-adi.txt63
-rw-r--r--Documentation/devicetree/bindings/spi/sprd,spi-adi.yaml104
-rw-r--r--Documentation/devicetree/bindings/timer/rockchip,rk-timer.txt27
-rw-r--r--Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml64
-rw-r--r--Documentation/driver-api/fpga/fpga-bridge.rst10
-rw-r--r--Documentation/driver-api/fpga/fpga-mgr.rst12
-rw-r--r--Documentation/driver-api/fpga/fpga-programming.rst8
-rw-r--r--Documentation/driver-api/fpga/fpga-region.rst20
-rw-r--r--Documentation/driver-api/index.rst1
-rw-r--r--Documentation/driver-api/lightnvm-pblk.rst21
-rw-r--r--Documentation/driver-api/nfc/nfc-hci.rst2
-rw-r--r--Documentation/fault-injection/fault-injection.rst18
-rw-r--r--Documentation/fault-injection/provoke-crashes.rst3
-rw-r--r--Documentation/filesystems/cifs/index.rst10
-rw-r--r--Documentation/filesystems/cifs/ksmbd.rst165
-rw-r--r--Documentation/filesystems/fscrypt.rst15
-rw-r--r--Documentation/filesystems/idmappings.rst1026
-rw-r--r--Documentation/filesystems/index.rst3
-rw-r--r--Documentation/filesystems/locking.rst79
-rw-r--r--Documentation/filesystems/mandatory-locking.rst188
-rw-r--r--Documentation/fpga/dfl.rst4
-rw-r--r--Documentation/gpu/rfc/i915_gem_lmem.rst109
-rw-r--r--Documentation/hwmon/aquacomputer_d5next.rst61
-rw-r--r--Documentation/hwmon/index.rst2
-rw-r--r--Documentation/hwmon/sbrmi.rst79
-rw-r--r--Documentation/hwmon/scpi-hwmon.rst2
-rw-r--r--Documentation/hwmon/sht4x.rst2
-rw-r--r--Documentation/i2c/index.rst1
-rw-r--r--Documentation/leds/well-known-leds.txt58
-rw-r--r--Documentation/networking/batman-adv.rst2
-rw-r--r--Documentation/networking/bonding.rst12
-rw-r--r--Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst1
-rw-r--r--Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst217
-rw-r--r--Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst44
-rw-r--r--Documentation/networking/devlink/devlink-params.rst12
-rw-r--r--Documentation/networking/devlink/hns3.rst25
-rw-r--r--Documentation/networking/devlink/index.rst2
-rw-r--r--Documentation/networking/devlink/sja1105.rst49
-rw-r--r--Documentation/networking/dsa/dsa.rst29
-rw-r--r--Documentation/networking/dsa/sja1105.rst218
-rw-r--r--Documentation/networking/ethtool-netlink.rst23
-rw-r--r--Documentation/networking/filter.rst27
-rw-r--r--Documentation/networking/index.rst2
-rw-r--r--Documentation/networking/ioam6-sysctl.rst26
-rw-r--r--Documentation/networking/ip-sysctl.rst17
-rw-r--r--Documentation/networking/mctp.rst213
-rw-r--r--Documentation/networking/mptcp-sysctl.rst12
-rw-r--r--Documentation/networking/netdevices.rst29
-rw-r--r--Documentation/networking/nf_conntrack-sysctl.rst17
-rw-r--r--Documentation/networking/pktgen.rst18
-rw-r--r--Documentation/networking/timestamping.rst6
-rw-r--r--Documentation/networking/vrf.rst13
-rw-r--r--Documentation/trace/coresight/coresight-config.rst244
-rw-r--r--Documentation/trace/coresight/coresight.rst15
-rw-r--r--Documentation/trace/ftrace.rst2
-rw-r--r--Documentation/userspace-api/ioctl/ioctl-number.rst1
-rw-r--r--Documentation/userspace-api/seccomp_filter.rst2
-rw-r--r--Documentation/userspace-api/spec_ctrl.rst8
-rw-r--r--Documentation/virt/kvm/locking.rst8
-rw-r--r--Documentation/x86/x86_64/boot-options.rst11
147 files changed, 5832 insertions, 1492 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index e34cdeeeb9d4..a0ed87386639 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -28,6 +28,18 @@ Description:
For more details refer Documentation/admin-guide/iostats.rst
+What: /sys/block/<disk>/diskseq
+Date: February 2021
+Contact: Matteo Croce <mcroce@microsoft.com>
+Description:
+ The /sys/block/<disk>/diskseq files reports the disk
+ sequence number, which is a monotonically increasing
+ number assigned to every drive.
+ Some devices, like the loop device, refresh such number
+ every time the backing file is changed.
+ The value type is 64 bit unsigned.
+
+
What: /sys/block/<disk>/<part>/stat
Date: February 2008
Contact: Jerome Marchand <jmarchan@redhat.com>
diff --git a/Documentation/ABI/testing/sysfs-block-device b/Documentation/ABI/testing/sysfs-block-device
index aa0fb500e3c9..7ac7b19b2f72 100644
--- a/Documentation/ABI/testing/sysfs-block-device
+++ b/Documentation/ABI/testing/sysfs-block-device
@@ -55,6 +55,43 @@ Date: Oct, 2016
KernelVersion: v4.10
Contact: linux-ide@vger.kernel.org
Description:
- (RW) Write to the file to turn on or off the SATA ncq (native
- command queueing) support. By default this feature is turned
- off.
+ (RW) Write to the file to turn on or off the SATA NCQ (native
+ command queueing) priority support. By default this feature is
+ turned off. If the device does not support the SATA NCQ
+ priority feature, writing "1" to this file results in an error
+ (see ncq_prio_supported).
+
+
+What: /sys/block/*/device/sas_ncq_prio_enable
+Date: Oct, 2016
+KernelVersion: v4.10
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RW) This is the equivalent of the ncq_prio_enable attribute
+ file for SATA devices connected to a SAS host-bus-adapter
+ (HBA) implementing support for the SATA NCQ priority feature.
+ This file does not exist if the HBA driver does not implement
+ support for the SATA NCQ priority feature, regardless of the
+ device support for this feature (see sas_ncq_prio_supported).
+
+
+What: /sys/block/*/device/ncq_prio_supported
+Date: Aug, 2021
+KernelVersion: v5.15
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RO) Indicates if the device supports the SATA NCQ (native
+ command queueing) priority feature.
+
+
+What: /sys/block/*/device/sas_ncq_prio_supported
+Date: Aug, 2021
+KernelVersion: v5.15
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RO) This is the equivalent of the ncq_prio_supported attribute
+ file for SATA devices connected to a SAS host-bus-adapter
+ (HBA) implementing support for the SATA NCQ priority feature.
+ This file does not exist if the HBA driver does not implement
+ support for the SATA NCQ priority feature, regardless of the
+ device support for this feature.
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-uncore b/Documentation/ABI/testing/sysfs-bus-event_source-devices-uncore
new file mode 100644
index 000000000000..b56e8f019fd4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-uncore
@@ -0,0 +1,13 @@
+What: /sys/bus/event_source/devices/uncore_*/alias
+Date: June 2021
+KernelVersion: 5.15
+Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description: Read-only. An attribute to describe the alias name of
+ the uncore PMU if an alias exists on some platforms.
+ The 'perf(1)' tool should treat both names the same.
+ They both can be used to access the uncore PMU.
+
+ Example:
+
+ $ cat /sys/devices/uncore_cha_2/alias
+ uncore_type_0_2
diff --git a/Documentation/ABI/testing/sysfs-bus-platform b/Documentation/ABI/testing/sysfs-bus-platform
index 194ca700e962..ff30728595ef 100644
--- a/Documentation/ABI/testing/sysfs-bus-platform
+++ b/Documentation/ABI/testing/sysfs-bus-platform
@@ -28,3 +28,17 @@ Description:
value comes from an ACPI _PXM method or a similar firmware
source. Initial users for this file would be devices like
arm smmu which are populated by arm64 acpi_iort.
+
+What: /sys/bus/platform/devices/.../msi_irqs/
+Date: August 2021
+Contact: Barry Song <song.bao.hua@hisilicon.com>
+Description:
+ The /sys/devices/.../msi_irqs directory contains a variable set
+ of files, with each file being named after a corresponding msi
+ irq vector allocated to that device.
+
+What: /sys/bus/platform/devices/.../msi_irqs/<N>
+Date: August 2021
+Contact: Barry Song <song.bao.hua@hisilicon.com>
+Description:
+ This attribute will show "msi" if <N> is a valid msi irq
diff --git a/Documentation/ABI/testing/sysfs-driver-ge-achc b/Documentation/ABI/testing/sysfs-driver-ge-achc
new file mode 100644
index 000000000000..a9e7a079190c
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-ge-achc
@@ -0,0 +1,15 @@
+What: /sys/bus/spi/<dev>/update_firmware
+Date: Jul 2021
+Contact: sebastian.reichel@collabora.com
+Description: Write 1 to this file to update the ACHC microcontroller
+ firmware via the EzPort interface. For this the kernel
+ will load "achc.bin" via the firmware API (so usually
+ from /lib/firmware). The write will block until the FW
+ has either been flashed successfully or an error occured.
+
+What: /sys/bus/spi/<dev>/reset
+Date: Jul 2021
+Contact: sebastian.reichel@collabora.com
+Description: This file represents the microcontroller's reset line.
+ 1 means the reset line is asserted, 0 means it's not
+ asserted. The file is read and writable.
diff --git a/Documentation/ABI/testing/sysfs-platform-dptf b/Documentation/ABI/testing/sysfs-platform-dptf
index 141834342a4d..53c6b1000320 100644
--- a/Documentation/ABI/testing/sysfs-platform-dptf
+++ b/Documentation/ABI/testing/sysfs-platform-dptf
@@ -111,3 +111,43 @@ Contact: linux-acpi@vger.kernel.org
Description:
(RW) The PCH FIVR (Fully Integrated Voltage Regulator) switching frequency in MHz,
when FIVR clock is 38.4MHz.
+
+What: /sys/bus/platform/devices/INTC1045:00/pch_fivr_switch_frequency/fivr_switching_freq_mhz
+Date: September, 2021
+KernelVersion: v5.15
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) Get the FIVR switching control frequency in MHz.
+
+What: /sys/bus/platform/devices/INTC1045:00/pch_fivr_switch_frequency/fivr_switching_fault_status
+Date: September, 2021
+KernelVersion: v5.15
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) Read the FIVR switching frequency control fault status.
+
+What: /sys/bus/platform/devices/INTC1045:00/pch_fivr_switch_frequency/ssc_clock_info
+Date: September, 2021
+KernelVersion: v5.15
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) Presents SSC (spread spectrum clock) information for EMI
+ (Electro magnetic interference) control. This is a bit mask.
+ Bits Description
+ [7:0] Sets clock spectrum spread percentage:
+ 0x00=0.2% , 0x3F=10%
+ 1 LSB = 0.1% increase in spread (for
+ settings 0x01 thru 0x1C)
+ 1 LSB = 0.2% increase in spread (for
+ settings 0x1E thru 0x3F)
+ [8] When set to 1, enables spread
+ spectrum clock
+ [9] 0: Triangle mode. FFC frequency
+ walks around the Fcenter in a linear
+ fashion
+ 1: Random walk mode. FFC frequency
+ changes randomly within the SSC
+ (Spread spectrum clock) range
+ [10] 0: No white noise. 1: Add white noise
+ to spread waveform
+ [11] When 1, future writes are ignored.
diff --git a/Documentation/ABI/testing/sysfs-platform_profile b/Documentation/ABI/testing/sysfs-platform_profile
index dae9c8941905..baf1d125f9f8 100644
--- a/Documentation/ABI/testing/sysfs-platform_profile
+++ b/Documentation/ABI/testing/sysfs-platform_profile
@@ -26,3 +26,10 @@ Contact: Hans de Goede <hdegoede@redhat.com>
Description: Reading this file gives the current selected profile for this
device. Writing this file with one of the strings from
platform_profile_choices changes the profile to the new value.
+
+ This file can be monitored for changes by polling for POLLPRI,
+ POLLPRI will be signalled on any changes, independent of those
+ changes coming from a userspace write; or coming from another
+ source such as e.g. a hotkey triggered profile change handled
+ either directly by the embedded-controller or fully handled
+ inside the kernel.
diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
index 11cdab037bff..eeb351296df1 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
@@ -112,6 +112,35 @@ on PowerPC.
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.
++-----------------------------------------------------------------------+
+| **Quick Quiz**: |
++-----------------------------------------------------------------------+
+| But the chain of rcu_node-structure lock acquisitions guarantees |
+| that new readers will see all of the updater's pre-grace-period |
+| accesses and also guarantees that the updater's post-grace-period |
+| accesses will see all of the old reader's accesses. So why do we |
+| need all of those calls to smp_mb__after_unlock_lock()? |
++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| Because we must provide ordering for RCU's polling grace-period |
+| primitives, for example, get_state_synchronize_rcu() and |
+| poll_state_synchronize_rcu(). Consider this code:: |
+| |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
+| |
+| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not |
+| happen, even if CPU 1 is in an RCU extended quiescent state |
+| (idle or offline) and thus won't interact directly with the RCU |
+| core processing at all. |
++-----------------------------------------------------------------------+
+
This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 38a39476fc24..45278e2974c0 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -362,9 +362,8 @@ do_something_gp() uses rcu_dereference() to fetch from ``gp``:
12 }
The rcu_dereference() uses volatile casts and (for DEC Alpha) memory
-barriers in the Linux kernel. Should a `high-quality implementation of
-C11 ``memory_order_consume``
-[PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
+barriers in the Linux kernel. Should a |high-quality implementation of
+C11 memory_order_consume [PDF]|_
ever appear, then rcu_dereference() could be implemented as a
``memory_order_consume`` load. Regardless of the exact implementation, a
pointer fetched by rcu_dereference() may not be used outside of the
@@ -374,6 +373,9 @@ element has been passed from RCU to some other synchronization
mechanism, most commonly locking or `reference
counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
+.. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF]
+.. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf
+
In short, updaters use rcu_assign_pointer() and readers use
rcu_dereference(), and these two RCU API elements work together to
ensure that readers have a consistent view of newly added data elements.
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index 01cc21f17f7b..f4545b7c9a63 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -37,7 +37,7 @@ over a rather long period of time, but improvements are always welcome!
1. Does the update code have proper mutual exclusion?
- RCU does allow -readers- to run (almost) naked, but -writers- must
+ RCU does allow *readers* to run (almost) naked, but *writers* must
still use some sort of mutual exclusion, such as:
a. locking,
@@ -73,7 +73,7 @@ over a rather long period of time, but improvements are always welcome!
critical section is every bit as bad as letting them leak out
from under a lock. Unless, of course, you have arranged some
other means of protection, such as a lock or a reference count
- -before- letting them out of the RCU read-side critical section.
+ *before* letting them out of the RCU read-side critical section.
3. Does the update code tolerate concurrent accesses?
@@ -101,7 +101,7 @@ over a rather long period of time, but improvements are always welcome!
c. Make updates appear atomic to readers. For example,
pointer updates to properly aligned fields will
appear atomic, as will individual atomic primitives.
- Sequences of operations performed under a lock will -not-
+ Sequences of operations performed under a lock will *not*
appear to be atomic to RCU readers, nor will sequences
of multiple atomic primitives.
@@ -333,7 +333,7 @@ over a rather long period of time, but improvements are always welcome!
for example) may be omitted.
10. Conversely, if you are in an RCU read-side critical section,
- and you don't hold the appropriate update-side lock, you -must-
+ and you don't hold the appropriate update-side lock, you *must*
use the "_rcu()" variants of the list macros. Failing to do so
will break Alpha, cause aggressive compilers to generate bad code,
and confuse people trying to read your code.
@@ -359,12 +359,12 @@ over a rather long period of time, but improvements are always welcome!
callback pending, then that RCU callback will execute on some
surviving CPU. (If this was not the case, a self-spawning RCU
callback would prevent the victim CPU from ever going offline.)
- Furthermore, CPUs designated by rcu_nocbs= might well -always-
+ Furthermore, CPUs designated by rcu_nocbs= might well *always*
have their RCU callbacks executed on some other CPUs, in fact,
for some real-time workloads, this is the whole point of using
the rcu_nocbs= kernel boot parameter.
-13. Unlike other forms of RCU, it -is- permissible to block in an
+13. Unlike other forms of RCU, it *is* permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
Please note that if you don't need to sleep in read-side critical
@@ -411,16 +411,16 @@ over a rather long period of time, but improvements are always welcome!
14. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation. It is
- therefore critically important to -first- remove any path
+ therefore critically important to *first* remove any path
that readers can follow that could be affected by the
- destructive operation, and -only- -then- invoke call_rcu(),
+ destructive operation, and *only then* invoke call_rcu(),
synchronize_rcu(), or friends.
Because these primitives only wait for pre-existing readers, it
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
-15. The various RCU read-side primitives do -not- necessarily contain
+15. The various RCU read-side primitives do *not* necessarily contain
memory barriers. You should therefore plan for the CPU
and the compiler to freely reorder code into and out of RCU
read-side critical sections. It is the responsibility of the
@@ -459,8 +459,8 @@ over a rather long period of time, but improvements are always welcome!
pass in a function defined within a loadable module, then it in
necessary to wait for all pending callbacks to be invoked after
the last invocation and before unloading that module. Note that
- it is absolutely -not- sufficient to wait for a grace period!
- The current (say) synchronize_rcu() implementation is -not-
+ it is absolutely *not* sufficient to wait for a grace period!
+ The current (say) synchronize_rcu() implementation is *not*
guaranteed to wait for callbacks registered on other CPUs.
Or even on the current CPU if that CPU recently went offline
and came back online.
@@ -470,7 +470,7 @@ over a rather long period of time, but improvements are always welcome!
- call_rcu() -> rcu_barrier()
- call_srcu() -> srcu_barrier()
- However, these barrier functions are absolutely -not- guaranteed
+ However, these barrier functions are absolutely *not* guaranteed
to wait for a grace period. In fact, if there are no call_rcu()
callbacks waiting anywhere in the system, rcu_barrier() is within
its rights to return immediately.
diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index f3e587acb4de..0b418a5b243c 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -43,7 +43,7 @@ Follow these rules to keep your RCU code working properly:
- Set bits and clear bits down in the must-be-zero low-order
bits of that pointer. This clearly means that the pointer
must have alignment constraints, for example, this does
- -not- work in general for char* pointers.
+ *not* work in general for char* pointers.
- XOR bits to translate pointers, as is done in some
classic buddy-allocator algorithms.
@@ -174,7 +174,7 @@ Follow these rules to keep your RCU code working properly:
Please see the "CONTROL DEPENDENCIES" section of
Documentation/memory-barriers.txt for more details.
- - The pointers are not equal -and- the compiler does
+ - The pointers are not equal *and* the compiler does
not have enough information to deduce the value of the
pointer. Note that the volatile cast in rcu_dereference()
will normally prevent the compiler from knowing too much.
@@ -360,7 +360,7 @@ in turn destroying the ordering between this load and the loads of the
return values. This can result in "p->b" returning pre-initialization
garbage values.
-In short, rcu_dereference() is -not- optional when you are going to
+In short, rcu_dereference() is *not* optional when you are going to
dereference the resulting pointer.
diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index 7148e9be08c3..5036df24ae61 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -32,7 +32,7 @@ warnings:
- Booting Linux using a console connection that is too slow to
keep up with the boot-time console-message rate. For example,
- a 115Kbaud serial console can be -way- too slow to keep up
+ a 115Kbaud serial console can be *way* too slow to keep up
with boot-time message rates, and will frequently result in
RCU CPU stall warning messages. Especially if you have added
debug printk()s.
@@ -105,7 +105,7 @@ warnings:
leading the realization that the CPU had failed.
The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
-Note that SRCU does -not- have CPU stall warnings. Please note that
+Note that SRCU does *not* have CPU stall warnings. Please note that
RCU only detects CPU stalls when there is a grace period in progress.
No grace period, no CPU stall warnings.
@@ -145,7 +145,7 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
this parameter is checked only at the beginning of a cycle.
So if you are 10 seconds into a 40-second stall, setting this
sysfs parameter to (say) five will shorten the timeout for the
- -next- stall, or the following warning for the current stall
+ *next* stall, or the following warning for the current stall
(assuming the stall lasts long enough). It will not affect the
timing of the next warning for the current stall.
@@ -189,8 +189,8 @@ rcupdate.rcu_task_stall_timeout
Interpreting RCU's CPU Stall-Detector "Splats"
==============================================
-For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
-it will print a message similar to the following::
+For non-RCU-tasks flavors of RCU, when a CPU detects that some other
+CPU is stalling, it will print a message similar to the following::
INFO: rcu_sched detected stalls on CPUs/tasks:
2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
@@ -202,8 +202,10 @@ causing stalls, and that the stall was affecting RCU-sched. This message
will normally be followed by stack dumps for each CPU. Please note that
PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, and that
the tasks will be indicated by PID, for example, "P3421". It is even
-possible for an rcu_state stall to be caused by both CPUs -and- tasks,
+possible for an rcu_state stall to be caused by both CPUs *and* tasks,
in which case the offending CPUs and tasks will all be called out in the list.
+In some cases, CPUs will detect themselves stalling, which will result
+in a self-detected stall.
CPU 2's "(3 GPs behind)" indicates that this CPU has not interacted with
the RCU core for the past three grace periods. In contrast, CPU 16's "(0
@@ -224,7 +226,7 @@ is the number that had executed since boot at the time that this CPU
last noted the beginning of a grace period, which might be the current
(stalled) grace period, or it might be some earlier grace period (for
example, if the CPU might have been in dyntick-idle mode for an extended
-time period. The number after the "/" is the number that have executed
+time period). The number after the "/" is the number that have executed
since boot until the current time. If this latter number stays constant
across repeated stall-warning messages, it is possible that RCU's softirq
handlers are no longer able to execute on this CPU. This can happen if
@@ -283,7 +285,8 @@ If the relevant grace-period kthread has been unable to run prior to
the stall warning, as was the case in the "All QSes seen" line above,
the following additional line is printed::
- kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+ rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+ Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
Starving the grace-period kthreads of CPU time can of course result
in RCU CPU stall warnings even when all CPUs and tasks have passed
@@ -313,15 +316,21 @@ is the current ``TIMER_SOFTIRQ`` count on cpu 4. If this value does not
change on successive RCU CPU stall warnings, there is further reason to
suspect a timer problem.
+These messages are usually followed by stack dumps of the CPUs and tasks
+involved in the stall. These stack traces can help you locate the cause
+of the stall, keeping in mind that the CPU detecting the stall will have
+an interrupt frame that is mainly devoted to detecting the stall.
+
Multiple Warnings From One Stall
================================
-If a stall lasts long enough, multiple stall-warning messages will be
-printed for it. The second and subsequent messages are printed at
+If a stall lasts long enough, multiple stall-warning messages will
+be printed for it. The second and subsequent messages are printed at
longer intervals, so that the time between (say) the first and second
message will be about three times the interval between the beginning
-of the stall and the first message.
+of the stall and the first message. It can be helpful to compare the
+stack dumps for the different messages for the same stalled grace period.
Stall Warnings for Expedited Grace Periods
diff --git a/Documentation/admin-guide/binderfs.rst b/Documentation/admin-guide/binderfs.rst
index 199d84314a14..41a4db00df8d 100644
--- a/Documentation/admin-guide/binderfs.rst
+++ b/Documentation/admin-guide/binderfs.rst
@@ -72,3 +72,16 @@ that the `rm() <rm_>`_ tool can be used to delete them. Note that the
``binder-control`` device cannot be deleted since this would make the binderfs
instance unusable. The ``binder-control`` device will be deleted when the
binderfs instance is unmounted and all references to it have been dropped.
+
+Binder features
+---------------
+
+Assuming an instance of binderfs has been mounted at ``/dev/binderfs``, the
+features supported by the binder driver can be located under
+``/dev/binderfs/features/``. The presence of individual files can be tested
+to determine whether a particular feature is supported by the driver.
+
+Example::
+
+ cat /dev/binderfs/features/oneway_spam_detection
+ 1
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5c7377b5bd3e..babbe04c8d37 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2056,6 +2056,17 @@ Cpuset Interface Files
The value of "cpuset.mems" stays constant until the next update
and won't be affected by any memory nodes hotplug events.
+ Setting a non-empty value to "cpuset.mems" causes memory of
+ tasks within the cgroup to be migrated to the designated nodes if
+ they are currently using memory outside of the designated nodes.
+
+ There is a cost for this memory migration. The migration
+ may not be complete and some memory pages may be left behind.
+ So it is recommended that "cpuset.mems" should be set properly
+ before spawning new tasks into the cpuset. Even if there is
+ a need to change "cpuset.mems" with active tasks, it shouldn't
+ be done frequently.
+
cpuset.mems.effective
A read-only multiple values file which exists on all
cpuset-enabled cgroups.
diff --git a/Documentation/admin-guide/device-mapper/dm-ima.rst b/Documentation/admin-guide/device-mapper/dm-ima.rst
new file mode 100644
index 000000000000..a4aa50a828e0
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/dm-ima.rst
@@ -0,0 +1,715 @@
+======
+dm-ima
+======
+
+For a given system, various external services/infrastructure tools
+(including the attestation service) interact with it - both during the
+setup and during rest of the system run-time. They share sensitive data
+and/or execute critical workload on that system. The external services
+may want to verify the current run-time state of the relevant kernel
+subsystems before fully trusting the system with business-critical
+data/workload.
+
+Device mapper plays a critical role on a given system by providing
+various important functionalities to the block devices using various
+target types like crypt, verity, integrity etc. Each of these target
+types’ functionalities can be configured with various attributes.
+The attributes chosen to configure these target types can significantly
+impact the security profile of the block device, and in-turn, of the
+system itself. For instance, the type of encryption algorithm and the
+key size determines the strength of encryption for a given block device.
+
+Therefore, verifying the current state of various block devices as well
+as their various target attributes is crucial for external services before
+fully trusting the system with business-critical data/workload.
+
+IMA kernel subsystem provides the necessary functionality for
+device mapper to measure the state and configuration of
+various block devices -
+
+- by device mapper itself, from within the kernel,
+- in a tamper resistant way,
+- and re-measured - triggered on state/configuration change.
+
+Setting the IMA Policy:
+=======================
+For IMA to measure the data on a given system, the IMA policy on the
+system needs to be updated to have following line, and the system needs
+to be restarted for the measurements to take effect.
+
+::
+
+ /etc/ima/ima-policy
+ measure func=CRITICAL_DATA label=device-mapper template=ima-buf
+
+The measurements will be reflected in the IMA logs, which are located at:
+
+::
+
+ /sys/kernel/security/integrity/ima/ascii_runtime_measurements
+ /sys/kernel/security/integrity/ima/binary_runtime_measurements
+
+Then IMA ASCII measurement log has the following format:
+
+::
+
+ <PCR> <TEMPLATE_DATA_DIGEST> <TEMPLATE_NAME> <TEMPLATE_DATA>
+
+ PCR := Platform Configuration Register, in which the values are registered.
+ This is applicable if TPM chip is in use.
+
+ TEMPLATE_DATA_DIGEST := Template data digest of the IMA record.
+ TEMPLATE_NAME := Template name that registered the integrity value (e.g. ima-buf).
+
+ TEMPLATE_DATA := <ALG> ":" <EVENT_DIGEST> <EVENT_NAME> <EVENT_DATA>
+ It contains data for the specific event to be measured,
+ in a given template data format.
+
+ ALG := Algorithm to compute event digest
+ EVENT_DIGEST := Digest of the event data
+ EVENT_NAME := Description of the event (e.g. 'dm_table_load').
+ EVENT_DATA := The event data to be measured.
+
+|
+
+| *NOTE #1:*
+| The DM target data measured by IMA subsystem can alternatively
+ be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with
+ DM_TABLE_STATUS_CMD.
+
+|
+
+| *NOTE #2:*
+| The Kernel configuration CONFIG_IMA_DISABLE_HTABLE allows measurement of duplicate records.
+| To support recording duplicate IMA events in the IMA log, the Kernel needs to be configured with
+ CONFIG_IMA_DISABLE_HTABLE=y.
+
+Supported Device States:
+========================
+Following device state changes will trigger IMA measurements:
+
+ 1. Table load
+ #. Device resume
+ #. Device remove
+ #. Table clear
+ #. Device rename
+
+1. Table load:
+---------------
+When a new table is loaded in a device's inactive table slot,
+the device information and target specific details from the
+targets in the table are measured.
+
+The IMA measurement log has the following format for 'dm_table_load':
+
+::
+
+ EVENT_NAME := "dm_table_load"
+ EVENT_DATA := <dm_version_str> ";" <device_metadata> ";" <table_load_data>
+
+ dm_version_str := "dm_version=" <N> "." <N> "." <N>
+ Same as Device Mapper driver version.
+ device_metadata := <device_name> "," <device_uuid> "," <device_major> "," <device_minor> ","
+ <minor_count> "," <num_device_targets> ";"
+
+ device_name := "name=" <dm-device-name>
+ device_uuid := "uuid=" <dm-device-uuid>
+ device_major := "major=" <N>
+ device_minor := "minor=" <N>
+ minor_count := "minor_count=" <N>
+ num_device_targets := "num_targets=" <N>
+ dm-device-name := Name of the device. If it contains special characters like '\', ',', ';',
+ they are prefixed with '\'.
+ dm-device-uuid := UUID of the device. If it contains special characters like '\', ',', ';',
+ they are prefixed with '\'.
+
+ table_load_data := <target_data>
+ Represents the data (as name=value pairs) from various targets in the table,
+ which is being loaded into the DM device's inactive table slot.
+ target_data := <target_data_row> | <target_data><target_data_row>
+
+ target_data_row := <target_index> "," <target_begin> "," <target_len> "," <target_name> ","
+ <target_version> "," <target_attributes> ";"
+ target_index := "target_index=" <N>
+ Represents nth target in the table (from 0 to N-1 targets specified in <num_device_targets>)
+ If all the data for N targets doesn't fit in the given buffer - then the data that fits
+ in the buffer (say from target 0 to x) is measured in a given IMA event.
+ The remaining data from targets x+1 to N-1 is measured in the subsequent IMA events,
+ with the same format as that of 'dm_table_load'
+ i.e. <dm_version_str> ";" <device_metadata> ";" <table_load_data>.
+
+ target_begin := "target_begin=" <N>
+ target_len := "target_len=" <N>
+ target_name := Name of the target. 'linear', 'crypt', 'integrity' etc.
+ The targets that are supported for IMA measurements are documented below in the
+ 'Supported targets' section.
+ target_version := "target_version=" <N> "." <N> "." <N>
+ target_attributes := Data containing comma separated list of name=value pairs of target specific attributes.
+
+ For instance, if a linear device is created with the following table entries,
+ # dmsetup create linear1
+ 0 2 linear /dev/loop0 512
+ 2 2 linear /dev/loop0 512
+ 4 2 linear /dev/loop0 512
+ 6 2 linear /dev/loop0 512
+
+ Then IMA ASCII measurement log will have the following entry:
+ (converted from ASCII to text for readability)
+
+ 10 a8c5ff755561c7a28146389d1514c318592af49a ima-buf sha256:4d73481ecce5eadba8ab084640d85bb9ca899af4d0a122989252a76efadc5b72
+ dm_table_load
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=0,minor_count=1,num_targets=4;
+ target_index=0,target_begin=0,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=1,target_begin=2,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=2,target_begin=4,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+ target_index=3,target_begin=6,target_len=2,target_name=linear,target_version=1.4.0,device_name=7:0,start=512;
+
+2. Device resume:
+------------------
+When a suspended device is resumed, the device information and the hash of the
+data from previous load of an active table are measured.
+
+The IMA measurement log has the following format for 'dm_device_resume':
+
+::
+
+ EVENT_NAME := "dm_device_resume"
+ EVENT_DATA := <dm_version_str> ";" <device_metadata> ";" <active_table_hash> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_metadata := As described in the 'Table load' section above.
+ active_table_hash := "active_table_hash=" <table_hash_alg> ":" <table_hash>
+ Rerpresents the hash of the IMA data being measured for the
+ active table for the device.
+ table_hash_alg := Algorithm used to compute the hash.
+ table_hash := Hash of the (<dm_version_str> ";" <device_metadata> ";" <table_load_data> ";")
+ as described in the 'dm_table_load' above.
+ Note: If the table_load data spans across multiple IMA 'dm_table_load'
+ events for a given device, the hash is computed combining all the event data
+ i.e. (<dm_version_str> ";" <device_metadata> ";" <table_load_data> ";")
+ across all those events.
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device is resumed with the following command,
+ #dmsetup resume linear1
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 56c00cc062ffc24ccd9ac2d67d194af3282b934e ima-buf sha256:e7d12c03b958b4e0e53e7363a06376be88d98a1ac191fdbd3baf5e4b77f329b6
+ dm_device_resume
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=0,minor_count=1,num_targets=4;
+ active_table_hash=sha256:4d73481ecce5eadba8ab084640d85bb9ca899af4d0a122989252a76efadc5b72;current_device_capacity=8;
+
+3. Device remove:
+------------------
+When a device is removed, the device information and a sha256 hash of the
+data from an active and inactive table are measured.
+
+The IMA measurement log has the following format for 'dm_device_remove':
+
+::
+
+ EVENT_NAME := "dm_device_remove"
+ EVENT_DATA := <dm_version_str> ";" <device_active_metadata> ";" <device_inactive_metadata> ";"
+ <active_table_hash> "," <inactive_table_hash> "," <remove_all> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_active_metadata := Device metadata that reflects the currently loaded active table.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ device_inactive_metadata := Device metadata that reflects the inactive table.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ active_table_hash := Hash of the currently loaded active table.
+ The format is same as 'active_table_hash' described in the 'Device resume' section above.
+ inactive_table_hash := Hash of the inactive table.
+ The format is same as 'active_table_hash' described in the 'Device resume' section above.
+ remove_all := "remove_all=" <yes_no>
+ yes_no := "y" | "n"
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device is removed with the following command,
+ #dmsetup remove l1
+
+ then IMA ASCII measurement log will have the following entry:
+ (converted from ASCII to text for readability)
+
+ 10 790e830a3a7a31590824ac0642b3b31c2d0e8b38 ima-buf sha256:ab9f3c959367a8f5d4403d6ce9c3627dadfa8f9f0e7ec7899299782388de3840
+ dm_device_remove
+ dm_version=4.45.0;
+ device_active_metadata=name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=2;
+ device_inactive_metadata=name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ active_table_hash=sha256:4a7e62efaebfc86af755831998b7db6f59b60d23c9534fb16a4455907957953a,
+ inactive_table_hash=sha256:9d79c175bc2302d55a183e8f50ad4bafd60f7692fd6249e5fd213e2464384b86,remove_all=n;
+ current_device_capacity=2048;
+
+4. Table clear:
+----------------
+When an inactive table is cleared from the device, the device information and a sha256 hash of the
+data from an inactive table are measured.
+
+The IMA measurement log has the following format for 'dm_table_clear':
+
+::
+
+ EVENT_NAME := "dm_table_clear"
+ EVENT_DATA := <dm_version_str> ";" <device_inactive_metadata> ";" <inactive_table_hash> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_inactive_metadata := Device metadata that was captured during the load time inactive table being cleared.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ inactive_table_hash := Hash of the inactive table being cleared from the device.
+ The format is same as 'active_table_hash' described in the 'Device resume' section above.
+ current_device_capacity := "current_device_capacity=" <N>
+
+ For instance, if a linear device's inactive table is cleared,
+ #dmsetup clear l1
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 77d347408f557f68f0041acb0072946bb2367fe5 ima-buf sha256:42f9ca22163fdfa548e6229dece2959bc5ce295c681644240035827ada0e1db5
+ dm_table_clear
+ dm_version=4.45.0;
+ name=l1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ inactive_table_hash=sha256:75c0dc347063bf474d28a9907037eba060bfe39d8847fc0646d75e149045d545;current_device_capacity=1024;
+
+5. Device rename:
+------------------
+When an device's NAME or UUID is changed, the device information and the new NAME and UUID
+are measured.
+
+The IMA measurement log has the following format for 'dm_device_rename':
+
+::
+
+ EVENT_NAME := "dm_device_rename"
+ EVENT_DATA := <dm_version_str> ";" <device_active_metadata> ";" <new_device_name> "," <new_device_uuid> ";" <current_device_capacity> ";"
+
+ dm_version_str := As described in the 'Table load' section above.
+ device_active_metadata := Device metadata that reflects the currently loaded active table.
+ The format is same as 'device_metadata' described in the 'Table load' section above.
+ new_device_name := "new_name=" <dm-device-name>
+ dm-device-name := Same as <dm-device-name> described in 'Table load' section above
+ new_device_uuid := "new_uuid=" <dm-device-uuid>
+ dm-device-uuid := Same as <dm-device-uuid> described in 'Table load' section above
+ current_device_capacity := "current_device_capacity=" <N>
+
+ E.g 1: if a linear device's name is changed with the following command,
+ #dmsetup rename linear1 --setuuid 1234-5678
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 8b0423209b4c66ac1523f4c9848c9b51ee332f48 ima-buf sha256:6847b7258134189531db593e9230b257c84f04038b5a18fd2e1473860e0569ac
+ dm_device_rename
+ dm_version=4.45.0;
+ name=linear1,uuid=,major=253,minor=2,minor_count=1,num_targets=1;new_name=linear1,new_uuid=1234-5678;
+ current_device_capacity=1024;
+
+ E.g 2: if a linear device's name is changed with the following command,
+ # dmsetup rename linear1 linear=2
+
+ then IMA ASCII measurement log will have an entry with:
+ (converted from ASCII to text for readability)
+
+ 10 bef70476b99c2bdf7136fae033aa8627da1bf76f ima-buf sha256:8c6f9f53b9ef9dc8f92a2f2cca8910e622543d0f0d37d484870cb16b95111402
+ dm_device_rename
+ dm_version=4.45.0;
+ name=linear1,uuid=1234-5678,major=253,minor=2,minor_count=1,num_targets=1;
+ new_name=linear\=2,new_uuid=1234-5678;
+ current_device_capacity=1024;
+
+Supported targets:
+==================
+
+Following targets are supported to measure their data using IMA:
+
+ 1. cache
+ #. crypt
+ #. integrity
+ #. linear
+ #. mirror
+ #. multipath
+ #. raid
+ #. snapshot
+ #. striped
+ #. verity
+
+1. cache
+---------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'cache' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <metadata_mode> "," <cache_metadata_device> ","
+ <cache_device> "," <cache_origin_device> "," <writethrough> "," <writeback> ","
+ <passthrough> "," <no_discard_passdown> ";"
+
+ target_name := "target_name=cache"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ metadata_mode := "metadata_mode=" <cache_metadata_mode>
+ cache_metadata_mode := "fail" | "ro" | "rw"
+ cache_device := "cache_device=" <cache_device_name_string>
+ cache_origin_device := "cache_origin_device=" <cache_origin_device_string>
+ writethrough := "writethrough=" <yes_no>
+ writeback := "writeback=" <yes_no>
+ passthrough := "passthrough=" <yes_no>
+ no_discard_passdown := "no_discard_passdown=" <yes_no>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'cache' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'cache' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;name=cache1,uuid=cache_uuid,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=28672,target_name=cache,target_version=2.2.0,metadata_mode=rw,
+ cache_metadata_device=253:4,cache_device=253:3,cache_origin_device=253:5,writethrough=y,writeback=n,
+ passthrough=n,metadata2=y,no_discard_passdown=n;
+
+
+2. crypt
+---------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'crypt' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <allow_discards> "," <same_cpu_crypt> ","
+ <submit_from_crypt_cpus> "," <no_read_workqueue> "," <no_write_workqueue> ","
+ <iv_large_sectors> "," <iv_large_sectors> "," [<integrity_tag_size> ","] [<cipher_auth> ","]
+ [<sector_size> ","] [<cipher_string> ","] <key_size> "," <key_parts> ","
+ <key_extra_size> "," <key_mac_size> ";"
+
+ target_name := "target_name=crypt"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ allow_discards := "allow_discards=" <yes_no>
+ same_cpu_crypt := "same_cpu_crypt=" <yes_no>
+ submit_from_crypt_cpus := "submit_from_crypt_cpus=" <yes_no>
+ no_read_workqueue := "no_read_workqueue=" <yes_no>
+ no_write_workqueue := "no_write_workqueue=" <yes_no>
+ iv_large_sectors := "iv_large_sectors=" <yes_no>
+ integrity_tag_size := "integrity_tag_size=" <N>
+ cipher_auth := "cipher_auth=" <string>
+ sector_size := "sector_size=" <N>
+ cipher_string := "cipher_string="
+ key_size := "key_size=" <N>
+ key_parts := "key_parts=" <N>
+ key_extra_size := "key_extra_size=" <N>
+ key_mac_size := "key_mac_size=" <N>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'crypt' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'crypt' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=crypt1,uuid=crypt_uuid1,major=253,minor=0,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=1953125,target_name=crypt,target_version=1.23.0,
+ allow_discards=y,same_cpu=n,submit_from_crypt_cpus=n,no_read_workqueue=n,no_write_workqueue=n,
+ iv_large_sectors=n,cipher_string=aes-xts-plain64,key_size=32,key_parts=1,key_extra_size=0,key_mac_size=0;
+
+3. integrity
+-------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'integrity' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <dev_name> "," <start>
+ <tag_size> "," <mode> "," [<meta_device> ","] [<block_size> ","] <recalculate> ","
+ <allow_discards> "," <fix_padding> "," <fix_hmac> "," <legacy_recalculate> ","
+ <journal_sectors> "," <interleave_sectors> "," <buffer_sectors> ";"
+
+ target_name := "target_name=integrity"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ dev_name := "dev_name=" <device_name_str>
+ start := "start=" <N>
+ tag_size := "tag_size=" <N>
+ mode := "mode=" <integrity_mode_str>
+ integrity_mode_str := "J" | "B" | "D" | "R"
+ meta_device := "meta_device=" <meta_device_str>
+ block_size := "block_size=" <N>
+ recalculate := "recalculate=" <yes_no>
+ allow_discards := "allow_discards=" <yes_no>
+ fix_padding := "fix_padding=" <yes_no>
+ fix_hmac := "fix_hmac=" <yes_no>
+ legacy_recalculate := "legacy_recalculate=" <yes_no>
+ journal_sectors := "journal_sectors=" <N>
+ interleave_sectors := "interleave_sectors=" <N>
+ buffer_sectors := "buffer_sectors=" <N>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'integrity' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'integrity' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=integrity1,uuid=,major=253,minor=1,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=7856,target_name=integrity,target_version=1.10.0,
+ dev_name=253:0,start=0,tag_size=32,mode=J,recalculate=n,allow_discards=n,fix_padding=n,
+ fix_hmac=n,legacy_recalculate=n,journal_sectors=88,interleave_sectors=32768,buffer_sectors=128;
+
+
+4. linear
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'linear' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <device_name> <,> <start> ";"
+
+ target_name := "target_name=linear"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ device_name := "device_name=" <linear_device_name_str>
+ start := "start=" <N>
+
+ E.g.
+ When a 'linear' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'linear' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=linear1,uuid=linear_uuid1,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=28672,target_name=linear,target_version=1.4.0,
+ device_name=253:1,start=2048;
+
+5. mirror
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'mirror' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <nr_mirrors> ","
+ <mirror_device_data> "," <handle_errors> "," <keep_log> "," <log_type_status> ";"
+
+ target_name := "target_name=mirror"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ nr_mirrors := "nr_mirrors=" <NR>
+ mirror_device_data := <mirror_device_row> | <mirror_device_data><mirror_device_row>
+ mirror_device_row is repeated <NR> times - for <NR> described in <nr_mirrors>.
+ mirror_device_row := <mirror_device_name> "," <mirror_device_status>
+ mirror_device_name := "mirror_device_" <X> "=" <mirror_device_name_str>
+ where <X> ranges from 0 to (<NR> -1) - for <NR> described in <nr_mirrors>.
+ mirror_device_status := "mirror_device_" <X> "_status=" <mirror_device_status_char>
+ where <X> ranges from 0 to (<NR> -1) - for <NR> described in <nr_mirrors>.
+ mirror_device_status_char := "A" | "F" | "D" | "S" | "R" | "U"
+ handle_errors := "handle_errors=" <yes_no>
+ keep_log := "keep_log=" <yes_no>
+ log_type_status := "log_type_status=" <log_type_status_str>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'mirror' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'mirror' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=mirror1,uuid=mirror_uuid1,major=253,minor=6,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2048,target_name=mirror,target_version=1.14.0,nr_mirrors=2,
+ mirror_device_0=253:4,mirror_device_0_status=A,
+ mirror_device_1=253:5,mirror_device_1_status=A,
+ handle_errors=y,keep_log=n,log_type_status=;
+
+6. multipath
+-------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'multipath' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <nr_priority_groups>
+ ["," <pg_state> "," <priority_groups> "," <priority_group_paths>] ";"
+
+ target_name := "target_name=multipath"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ nr_priority_groups := "nr_priority_groups=" <NPG>
+ priority_groups := <priority_groups_row>|<priority_groups_row><priority_groups>
+ priority_groups_row := "pg_state_" <X> "=" <pg_state_str> "," "nr_pgpaths_" <X> "=" <NPGP> ","
+ "path_selector_name_" <X> "=" <string> "," <priority_group_paths>
+ where <X> ranges from 0 to (<NPG> -1) - for <NPG> described in <nr_priority_groups>.
+ pg_state_str := "E" | "A" | "D"
+ <priority_group_paths> := <priority_group_paths_row> | <priority_group_paths_row><priority_group_paths>
+ priority_group_paths_row := "path_name_" <X> "_" <Y> "=" <string> "," "is_active_" <X> "_" <Y> "=" <is_active_str>
+ "fail_count_" <X> "_" <Y> "=" <N> "," "path_selector_status_" <X> "_" <Y> "=" <path_selector_status_str>
+ where <X> ranges from 0 to (<NPG> -1) - for <NPG> described in <nr_priority_groups>,
+ and <Y> ranges from 0 to (<NPGP> -1) - for <NPGP> described in <priority_groups_row>.
+ is_active_str := "A" | "F"
+
+ E.g.
+ When a 'multipath' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'multipath' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=mp,uuid=,major=253,minor=0,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2097152,target_name=multipath,target_version=1.14.0,nr_priority_groups=2,
+ pg_state_0=E,nr_pgpaths_0=2,path_selector_name_0=queue-length,
+ path_name_0_0=8:16,is_active_0_0=A,fail_count_0_0=0,path_selector_status_0_0=,
+ path_name_0_1=8:32,is_active_0_1=A,fail_count_0_1=0,path_selector_status_0_1=,
+ pg_state_1=E,nr_pgpaths_1=2,path_selector_name_1=queue-length,
+ path_name_1_0=8:48,is_active_1_0=A,fail_count_1_0=0,path_selector_status_1_0=,
+ path_name_1_1=8:64,is_active_1_1=A,fail_count_1_1=0,path_selector_status_1_1=;
+
+7. raid
+--------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'raid' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <raid_type> "," <raid_disks> "," <raid_state>
+ <raid_device_status> ["," journal_dev_mode] ";"
+
+ target_name := "target_name=raid"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ raid_type := "raid_type=" <raid_type_str>
+ raid_disks := "raid_disks=" <NRD>
+ raid_state := "raid_state=" <raid_state_str>
+ raid_state_str := "frozen" | "reshape" |"resync" | "check" | "repair" | "recover" | "idle" |"undef"
+ raid_device_status := <raid_device_status_row> | <raid_device_status_row><raid_device_status>
+ <raid_device_status_row> is repeated <NRD> times - for <NRD> described in <raid_disks>.
+ raid_device_status_row := "raid_device_" <X> "_status=" <raid_device_status_str>
+ where <X> ranges from 0 to (<NRD> -1) - for <NRD> described in <raid_disks>.
+ raid_device_status_str := "A" | "D" | "a" | "-"
+ journal_dev_mode := "journal_dev_mode=" <journal_dev_mode_str>
+ journal_dev_mode_str := "writethrough" | "writeback" | "invalid"
+
+ E.g.
+ When a 'raid' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'raid' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=raid_LV1,uuid=uuid_raid_LV1,major=253,minor=12,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=2048,target_name=raid,target_version=1.15.1,
+ raid_type=raid10,raid_disks=4,raid_state=idle,
+ raid_device_0_status=A,
+ raid_device_1_status=A,
+ raid_device_2_status=A,
+ raid_device_3_status=A;
+
+
+8. snapshot
+------------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'snapshot' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <snap_origin_name> ","
+ <snap_cow_name> "," <snap_valid> "," <snap_merge_failed> "," <snapshot_overflowed> ";"
+
+ target_name := "target_name=snapshot"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ snap_origin_name := "snap_origin_name=" <string>
+ snap_cow_name := "snap_cow_name=" <string>
+ snap_valid := "snap_valid=" <yes_no>
+ snap_merge_failed := "snap_merge_failed=" <yes_no>
+ snapshot_overflowed := "snapshot_overflowed=" <yes_no>
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'snapshot' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'snapshot' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=snap1,uuid=snap_uuid1,major=253,minor=13,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=4096,target_name=snapshot,target_version=1.16.0,
+ snap_origin_name=253:11,snap_cow_name=253:12,snap_valid=y,snap_merge_failed=n,snapshot_overflowed=n;
+
+9. striped
+-----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'striped' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <stripes> "," <chunk_size> ","
+ <stripe_data> ";"
+
+ target_name := "target_name=striped"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ stripes := "stripes=" <NS>
+ chunk_size := "chunk_size=" <N>
+ stripe_data := <stripe_data_row>|<stripe_data><stripe_data_row>
+ stripe_data_row := <stripe_device_name> "," <stripe_physical_start> "," <stripe_status>
+ stripe_device_name := "stripe_" <X> "_device_name=" <stripe_device_name_str>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_physical_start := "stripe_" <X> "_physical_start=" <N>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_status := "stripe_" <X> "_status=" <stripe_status_str>
+ where <X> ranges from 0 to (<NS> -1) - for <NS> described in <stripes>.
+ stripe_status_str := "D" | "A"
+
+ E.g.
+ When a 'striped' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'striped' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=striped1,uuid=striped_uuid1,major=253,minor=5,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=640,target_name=striped,target_version=1.6.0,stripes=2,chunk_size=64,
+ stripe_0_device_name=253:0,stripe_0_physical_start=2048,stripe_0_status=A,
+ stripe_1_device_name=253:3,stripe_1_physical_start=2048,stripe_1_status=A;
+
+10. verity
+----------
+The 'target_attributes' (described as part of EVENT_DATA in 'Table load'
+section above) has the following data format for 'verity' target.
+
+::
+
+ target_attributes := <target_name> "," <target_version> "," <hash_failed> "," <verity_version> ","
+ <data_device_name> "," <hash_device_name> "," <verity_algorithm> "," <root_digest> ","
+ <salt> "," <ignore_zero_blocks> "," <check_at_most_once> ["," <root_hash_sig_key_desc>]
+ ["," <verity_mode>] ";"
+
+ target_name := "target_name=verity"
+ target_version := "target_version=" <N> "." <N> "." <N>
+ hash_failed := "hash_failed=" <hash_failed_str>
+ hash_failed_str := "C" | "V"
+ verity_version := "verity_version=" <verity_version_str>
+ data_device_name := "data_device_name=" <data_device_name_str>
+ hash_device_name := "hash_device_name=" <hash_device_name_str>
+ verity_algorithm := "verity_algorithm=" <verity_algorithm_str>
+ root_digest := "root_digest=" <root_digest_str>
+ salt := "salt=" <salt_str>
+ salt_str := "-" <verity_salt_str>
+ ignore_zero_blocks := "ignore_zero_blocks=" <yes_no>
+ check_at_most_once := "check_at_most_once=" <yes_no>
+ root_hash_sig_key_desc := "root_hash_sig_key_desc="
+ verity_mode := "verity_mode=" <verity_mode_str>
+ verity_mode_str := "ignore_corruption" | "restart_on_corruption" | "panic_on_corruption" | "invalid"
+ yes_no := "y" | "n"
+
+ E.g.
+ When a 'verity' target is loaded, then IMA ASCII measurement log will have an entry
+ similar to the following, depicting what 'verity' attributes are measured in EVENT_DATA
+ for 'dm_table_load' event.
+ (converted from ASCII to text for readability)
+
+ dm_version=4.45.0;
+ name=test-verity,uuid=,major=253,minor=2,minor_count=1,num_targets=1;
+ target_index=0,target_begin=0,target_len=1953120,target_name=verity,target_version=1.8.0,hash_failed=V,
+ verity_version=1,data_device_name=253:1,hash_device_name=253:0,verity_algorithm=sha256,
+ root_digest=29cb87e60ce7b12b443ba6008266f3e41e93e403d7f298f8e3f316b29ff89c5e,
+ salt=e48da609055204e89ae53b655ca2216dd983cf3cb829f34f63a297d106d53e2d,
+ ignore_zero_blocks=n,check_at_most_once=n;
diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst
index 6cf8adc86fa8..cde52cc09645 100644
--- a/Documentation/admin-guide/device-mapper/index.rst
+++ b/Documentation/admin-guide/device-mapper/index.rst
@@ -13,6 +13,7 @@ Device Mapper
dm-dust
dm-ebs
dm-flakey
+ dm-ima
dm-init
dm-integrity
dm-io
diff --git a/Documentation/admin-guide/device-mapper/writecache.rst b/Documentation/admin-guide/device-mapper/writecache.rst
index 65427d8dfca6..10429779a91a 100644
--- a/Documentation/admin-guide/device-mapper/writecache.rst
+++ b/Documentation/admin-guide/device-mapper/writecache.rst
@@ -78,13 +78,23 @@ Status:
2. the number of blocks
3. the number of free blocks
4. the number of blocks under writeback
+5. the number of read requests
+6. the number of read requests that hit the cache
+7. the number of write requests
+8. the number of write requests that hit uncommitted block
+9. the number of write requests that hit committed block
+10. the number of write requests that bypass the cache
+11. the number of write requests that are allocated in the cache
+12. the number of write requests that are blocked on the freelist
+13. the number of flush requests
+14. the number of discard requests
Messages:
flush
- flush the cache device. The message returns successfully
+ Flush the cache device. The message returns successfully
if the cache device was flushed without an error
flush_on_suspend
- flush the cache device on next suspend. Use this message
+ Flush the cache device on next suspend. Use this message
when you are going to remove the cache device. The proper
sequence for removing the cache device is:
@@ -98,3 +108,5 @@ Messages:
6. the cache device is now inactive and it can be deleted
cleaner
See above "cleaner" constructor documentation.
+ clear_stats
+ Clear the statistics that are reported on the status line
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index f12cda55538b..8cbc711cda93 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -16,3 +16,4 @@ are configurable at compile, boot or run time.
multihit.rst
special-register-buffer-data-sampling.rst
core-scheduling.rst
+ l1d_flush.rst
diff --git a/Documentation/admin-guide/hw-vuln/l1d_flush.rst b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
new file mode 100644
index 000000000000..210020bc3f56
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/l1d_flush.rst
@@ -0,0 +1,69 @@
+L1D Flushing
+============
+
+With an increasing number of vulnerabilities being reported around data
+leaks from the Level 1 Data cache (L1D) the kernel provides an opt-in
+mechanism to flush the L1D cache on context switch.
+
+This mechanism can be used to address e.g. CVE-2020-0550. For applications
+the mechanism keeps them safe from vulnerabilities, related to leaks
+(snooping of) from the L1D cache.
+
+
+Related CVEs
+------------
+The following CVEs can be addressed by this
+mechanism
+
+ ============= ======================== ==================
+ CVE-2020-0550 Improper Data Forwarding OS related aspects
+ ============= ======================== ==================
+
+Usage Guidelines
+----------------
+
+Please see document: :ref:`Documentation/userspace-api/spec_ctrl.rst
+<set_spec_ctrl>` for details.
+
+**NOTE**: The feature is disabled by default, applications need to
+specifically opt into the feature to enable it.
+
+Mitigation
+----------
+
+When PR_SET_L1D_FLUSH is enabled for a task a flush of the L1D cache is
+performed when the task is scheduled out and the incoming task belongs to a
+different process and therefore to a different address space.
+
+If the underlying CPU supports L1D flushing in hardware, the hardware
+mechanism is used, software fallback for the mitigation, is not supported.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows to control the L1D flush mitigations at boot
+time with the option "l1d_flush=". The valid arguments for this option are:
+
+ ============ =============================================================
+ on Enables the prctl interface, applications trying to use
+ the prctl() will fail with an error if l1d_flush is not
+ enabled
+ ============ =============================================================
+
+By default the mechanism is disabled.
+
+Limitations
+-----------
+
+The mechanism does not mitigate L1D data leaks between tasks belonging to
+different processes which are concurrently executing on sibling threads of
+a physical CPU core when SMT is enabled on the system.
+
+This can be addressed by controlled placement of processes on physical CPU
+cores or by disabling SMT. See the relevant chapter in the L1TF mitigation
+document: :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
+
+**NOTE** : The opt-in of a task for L1D flushing works only when the task's
+affinity is limited to cores running in non-SMT mode. If a task which
+requested L1D flushing is scheduled on a SMT-enabled core the kernel sends
+a SIGBUS to the task.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdb22006f713..2102467faad6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2421,6 +2421,23 @@
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
+ l1d_flush= [X86,INTEL]
+ Control mitigation for L1D based snooping vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack, to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the mitigation. The
+ options are:
+
+ on - enable the interface for the mitigation
+
l1tf= [X86] Control mitigation of the L1TF vulnerability on
affected CPUs
@@ -4777,7 +4794,7 @@
reboot= [KNL]
Format (x86 or x86_64):
- [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
+ [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] | d[efault] \
[[,]s[mp]#### \
[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
[[,]f[orce]
@@ -4945,8 +4962,6 @@
sa1100ir [NET]
See drivers/net/irda/sa1100_ir.c.
- sbni= [NET] Granch SBNI12 leased line adapter
-
sched_verbose [KNL] Enables verbose scheduler debug messages.
schedstats= [KNL,X86] Enable or disable scheduled statistics.
diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt
index 0f1fdedf36bb..0f1ffa03db09 100644
--- a/Documentation/atomic_t.txt
+++ b/Documentation/atomic_t.txt
@@ -271,3 +271,97 @@ WRITE_ONCE. Thus:
SC *y, t;
is allowed.
+
+
+CMPXCHG vs TRY_CMPXCHG
+----------------------
+
+ int atomic_cmpxchg(atomic_t *ptr, int old, int new);
+ bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new);
+
+Both provide the same functionality, but try_cmpxchg() can lead to more
+compact code. The functions relate like:
+
+ bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new)
+ {
+ int ret, old = *oldp;
+ ret = atomic_cmpxchg(ptr, old, new);
+ if (ret != old)
+ *oldp = ret;
+ return ret == old;
+ }
+
+and:
+
+ int atomic_cmpxchg(atomic_t *ptr, int old, int new)
+ {
+ (void)atomic_try_cmpxchg(ptr, &old, new);
+ return old;
+ }
+
+Usage:
+
+ old = atomic_read(&v); old = atomic_read(&v);
+ for (;;) { do {
+ new = func(old); new = func(old);
+ tmp = atomic_cmpxchg(&v, old, new); } while (!atomic_try_cmpxchg(&v, &old, new));
+ if (tmp == old)
+ break;
+ old = tmp;
+ }
+
+NB. try_cmpxchg() also generates better code on some platforms (notably x86)
+where the function more closely matches the hardware instruction.
+
+
+FORWARD PROGRESS
+----------------
+
+In general strong forward progress is expected of all unconditional atomic
+operations -- those in the Arithmetic and Bitwise classes and xchg(). However
+a fair amount of code also requires forward progress from the conditional
+atomic operations.
+
+Specifically 'simple' cmpxchg() loops are expected to not starve one another
+indefinitely. However, this is not evident on LL/SC architectures, because
+while an LL/SC architecure 'can/should/must' provide forward progress
+guarantees between competing LL/SC sections, such a guarantee does not
+transfer to cmpxchg() implemented using LL/SC. Consider:
+
+ old = atomic_read(&v);
+ do {
+ new = func(old);
+ } while (!atomic_try_cmpxchg(&v, &old, new));
+
+which on LL/SC becomes something like:
+
+ old = atomic_read(&v);
+ do {
+ new = func(old);
+ } while (!({
+ volatile asm ("1: LL %[oldval], %[v]\n"
+ " CMP %[oldval], %[old]\n"
+ " BNE 2f\n"
+ " SC %[new], %[v]\n"
+ " BNE 1b\n"
+ "2:\n"
+ : [oldval] "=&r" (oldval), [v] "m" (v)
+ : [old] "r" (old), [new] "r" (new)
+ : "memory");
+ success = (oldval == old);
+ if (!success)
+ old = oldval;
+ success; }));
+
+However, even the forward branch from the failed compare can cause the LL/SC
+to fail on some architectures, let alone whatever the compiler makes of the C
+loop body. As a result there is no guarantee what so ever the cacheline
+containing @v will stay on the local CPU and progress is made.
+
+Even native CAS architectures can fail to provide forward progress for their
+primitive (See Sparc64 for an example).
+
+Such implementations are strongly encouraged to add exponential backoff loops
+to a failed CAS in order to ensure some progress. Affected architectures are
+also strongly encouraged to inspect/audit the atomic fallbacks, refcount_t and
+their locking primitives.
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index baea6c2abba5..1ceb5d704a97 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -15,15 +15,7 @@ that goes into great technical depth about the BPF Architecture.
libbpf
======
-Libbpf is a userspace library for loading and interacting with bpf programs.
-
-.. toctree::
- :maxdepth: 1
-
- libbpf/libbpf
- libbpf/libbpf_api
- libbpf/libbpf_build
- libbpf/libbpf_naming_convention
+Documentation/bpf/libbpf/libbpf.rst is a userspace library for loading and interacting with bpf programs.
BPF Type Format (BTF)
=====================
diff --git a/Documentation/bpf/libbpf/libbpf.rst b/Documentation/bpf/libbpf/index.rst
index 1b1e61d5ead1..4f8adfc3ab83 100644
--- a/Documentation/bpf/libbpf/libbpf.rst
+++ b/Documentation/bpf/libbpf/index.rst
@@ -3,6 +3,14 @@
libbpf
======
+For API documentation see the `versioned API documentation site <https://libbpf.readthedocs.io/en/latest/api.html>`_.
+
+.. toctree::
+ :maxdepth: 1
+
+ libbpf_naming_convention
+ libbpf_build
+
This is documentation for libbpf, a userspace library for loading and
interacting with bpf programs.
diff --git a/Documentation/bpf/libbpf/libbpf_api.rst b/Documentation/bpf/libbpf/libbpf_api.rst
deleted file mode 100644
index f07eecd054da..000000000000
--- a/Documentation/bpf/libbpf/libbpf_api.rst
+++ /dev/null
@@ -1,27 +0,0 @@
-.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
-
-API
-===
-
-This documentation is autogenerated from header files in libbpf, tools/lib/bpf
-
-.. kernel-doc:: tools/lib/bpf/libbpf.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/bpf.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/btf.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/xsk.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/bpf_tracing.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/bpf_core_read.h
- :internal:
-
-.. kernel-doc:: tools/lib/bpf/bpf_endian.h
- :internal: \ No newline at end of file
diff --git a/Documentation/bpf/libbpf/libbpf_naming_convention.rst b/Documentation/bpf/libbpf/libbpf_naming_convention.rst
index 3de1d51e41da..9c68d5014ff1 100644
--- a/Documentation/bpf/libbpf/libbpf_naming_convention.rst
+++ b/Documentation/bpf/libbpf/libbpf_naming_convention.rst
@@ -69,7 +69,7 @@ functions. These can be mixed and matched. Note that these functions
are not reentrant for performance reasons.
ABI
-==========
+---
libbpf can be both linked statically or used as DSO. To avoid possible
conflicts with other libraries an application is linked with, all
@@ -108,7 +108,7 @@ This bump in ABI version is at most once per kernel development cycle.
For example, if current state of ``libbpf.map`` is:
-.. code-block:: c
+.. code-block:: none
LIBBPF_0.0.1 {
global:
@@ -121,7 +121,7 @@ For example, if current state of ``libbpf.map`` is:
, and a new symbol ``bpf_func_c`` is being introduced, then
``libbpf.map`` should be changed like this:
-.. code-block:: c
+.. code-block:: none
LIBBPF_0.0.1 {
global:
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index a2c96bec5ee8..1122cd3044c0 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -220,7 +220,7 @@ goes online (offline) and during initial setup (shutdown) of the driver. However
each registration and removal function is also available with a ``_nocalls``
suffix which does not invoke the provided callbacks if the invocation of the
callbacks is not desired. During the manual setup (or teardown) the functions
-``get_online_cpus()`` and ``put_online_cpus()`` should be used to inhibit CPU
+``cpus_read_lock()`` and ``cpus_read_unlock()`` should be used to inhibit CPU
hotplug operations.
diff --git a/Documentation/core-api/irq/irq-domain.rst b/Documentation/core-api/irq/irq-domain.rst
index 53283b3729a1..6979b4af2c1f 100644
--- a/Documentation/core-api/irq/irq-domain.rst
+++ b/Documentation/core-api/irq/irq-domain.rst
@@ -55,8 +55,24 @@ exist then it will allocate a new Linux irq_desc, associate it with
the hwirq, and call the .map() callback so the driver can perform any
required hardware setup.
-When an interrupt is received, irq_find_mapping() function should
-be used to find the Linux IRQ number from the hwirq number.
+Once a mapping has been established, it can be retrieved or used via a
+variety of methods:
+
+- irq_resolve_mapping() returns a pointer to the irq_desc structure
+ for a given domain and hwirq number, and NULL if there was no
+ mapping.
+- irq_find_mapping() returns a Linux IRQ number for a given domain and
+ hwirq number, and 0 if there was no mapping
+- irq_linear_revmap() is now identical to irq_find_mapping(), and is
+ deprecated
+- generic_handle_domain_irq() handles an interrupt described by a
+ domain and a hwirq number
+- handle_domain_irq() does the same thing for root interrupt
+ controllers and deals with the set_irq_reg()/irq_enter() sequences
+ that most architecture requires
+
+Note that irq domain lookups must happen in contexts that are
+compatible with a RCU read-side critical section.
The irq_create_mapping() function must be called *atleast once*
before any call to irq_find_mapping(), lest the descriptor will not
@@ -137,7 +153,9 @@ required. Calling irq_create_direct_mapping() will allocate a Linux
IRQ number and call the .map() callback so that driver can program the
Linux IRQ number into the hardware.
-Most drivers cannot use this mapping.
+Most drivers cannot use this mapping, and it is now gated on the
+CONFIG_IRQ_DOMAIN_NOMAP option. Please refrain from introducing new
+users of this API.
Legacy
------
@@ -157,6 +175,10 @@ for IRQ numbers that are passed to struct device registrations. In that
case the Linux IRQ numbers cannot be dynamically assigned and the legacy
mapping should be used.
+As the name implies, the *_legacy() functions are deprecated and only
+exist to ease the support of ancient platforms. No new users should be
+added.
+
The legacy map assumes a contiguous range of IRQ numbers has already
been allocated for the controller and that the IRQ number can be
calculated by adding a fixed offset to the hwirq number, and
diff --git a/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.txt b/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.txt
deleted file mode 100644
index 18c3aea90df2..000000000000
--- a/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.txt
+++ /dev/null
@@ -1,44 +0,0 @@
------------------------------------------------------------------
-Device Tree Bindings for the Xilinx Zynq MPSoC Firmware Interface
------------------------------------------------------------------
-
-The zynqmp-firmware node describes the interface to platform firmware.
-ZynqMP has an interface to communicate with secure firmware. Firmware
-driver provides an interface to firmware APIs. Interface APIs can be
-used by any driver to communicate to PMUFW(Platform Management Unit).
-These requests include clock management, pin control, device control,
-power management service, FPGA service and other platform management
-services.
-
-Required properties:
- - compatible: Must contain any of below:
- "xlnx,zynqmp-firmware" for Zynq Ultrascale+ MPSoC
- "xlnx,versal-firmware" for Versal
- - method: The method of calling the PM-API firmware layer.
- Permitted values are:
- - "smc" : SMC #0, following the SMCCC
- - "hvc" : HVC #0, following the SMCCC
-
--------
-Example
--------
-
-Zynq Ultrascale+ MPSoC
-----------------------
-firmware {
- zynqmp_firmware: zynqmp-firmware {
- compatible = "xlnx,zynqmp-firmware";
- method = "smc";
- ...
- };
-};
-
-Versal
-------
-firmware {
- versal_firmware: versal-firmware {
- compatible = "xlnx,versal-firmware";
- method = "smc";
- ...
- };
-};
diff --git a/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.yaml b/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.yaml
new file mode 100644
index 000000000000..f14f7b454f07
--- /dev/null
+++ b/Documentation/devicetree/bindings/firmware/xilinx/xlnx,zynqmp-firmware.yaml
@@ -0,0 +1,89 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/firmware/xilinx/xlnx,zynqmp-firmware.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx firmware driver
+
+maintainers:
+ - Nava kishore Manne <nava.manne@xilinx.com>
+
+description: The zynqmp-firmware node describes the interface to platform
+ firmware. ZynqMP has an interface to communicate with secure firmware.
+ Firmware driver provides an interface to firmware APIs. Interface APIs
+ can be used by any driver to communicate to PMUFW(Platform Management Unit).
+ These requests include clock management, pin control, device control,
+ power management service, FPGA service and other platform management
+ services.
+
+properties:
+ compatible:
+ oneOf:
+ - description: For implementations complying for Zynq Ultrascale+ MPSoC.
+ const: xlnx,zynqmp-firmware
+
+ - description: For implementations complying for Versal.
+ const: xlnx,versal-firmware
+
+ method:
+ description: |
+ The method of calling the PM-API firmware layer.
+ Permitted values are.
+ - "smc" : SMC #0, following the SMCCC
+ - "hvc" : HVC #0, following the SMCCC
+
+ $ref: /schemas/types.yaml#/definitions/string-array
+ enum:
+ - smc
+ - hvc
+
+ versal_fpga:
+ $ref: /schemas/fpga/xlnx,versal-fpga.yaml#
+ description: Compatible of the FPGA device.
+ type: object
+
+ zynqmp-aes:
+ $ref: /schemas/crypto/xlnx,zynqmp-aes.yaml#
+ description: The ZynqMP AES-GCM hardened cryptographic accelerator is
+ used to encrypt or decrypt the data with provided key and initialization
+ vector.
+ type: object
+
+ clock-controller:
+ $ref: /schemas/clock/xlnx,versal-clk.yaml#
+ description: The clock controller is a hardware block of Xilinx versal
+ clock tree. It reads required input clock frequencies from the devicetree
+ and acts as clock provider for all clock consumers of PS clocks.list of
+ clock specifiers which are external input clocks to the given clock
+ controller.
+ type: object
+
+required:
+ - compatible
+
+additionalProperties: false
+
+examples:
+ - |
+ versal-firmware {
+ compatible = "xlnx,versal-firmware";
+ method = "smc";
+
+ versal_fpga: versal_fpga {
+ compatible = "xlnx,versal-fpga";
+ };
+
+ xlnx_aes: zynqmp-aes {
+ compatible = "xlnx,zynqmp-aes";
+ };
+
+ versal_clk: clock-controller {
+ #clock-cells = <1>;
+ compatible = "xlnx,versal-clk";
+ clocks = <&ref>, <&alt_ref>, <&pl_alt_ref>;
+ clock-names = "ref", "alt_ref", "pl_alt_ref";
+ };
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/fpga/xlnx,versal-fpga.yaml b/Documentation/devicetree/bindings/fpga/xlnx,versal-fpga.yaml
new file mode 100644
index 000000000000..ac6a207278d5
--- /dev/null
+++ b/Documentation/devicetree/bindings/fpga/xlnx,versal-fpga.yaml
@@ -0,0 +1,33 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/fpga/xlnx,versal-fpga.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx Versal FPGA driver.
+
+maintainers:
+ - Nava kishore Manne <nava.manne@xilinx.com>
+
+description: |
+ Device Tree Versal FPGA bindings for the Versal SoC, controlled
+ using firmware interface.
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - xlnx,versal-fpga
+
+required:
+ - compatible
+
+additionalProperties: false
+
+examples:
+ - |
+ versal_fpga: versal_fpga {
+ compatible = "xlnx,versal-fpga";
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/fsi/ibm,fsi2spi.yaml b/Documentation/devicetree/bindings/fsi/ibm,fsi2spi.yaml
index e425278653f5..e2ca0b000471 100644
--- a/Documentation/devicetree/bindings/fsi/ibm,fsi2spi.yaml
+++ b/Documentation/devicetree/bindings/fsi/ibm,fsi2spi.yaml
@@ -19,7 +19,6 @@ properties:
compatible:
enum:
- ibm,fsi2spi
- - ibm,fsi2spi-restricted
reg:
items:
diff --git a/Documentation/devicetree/bindings/gpio/rockchip,gpio-bank.yaml b/Documentation/devicetree/bindings/gpio/rockchip,gpio-bank.yaml
index d993e002cebe..0d62c28fb58d 100644
--- a/Documentation/devicetree/bindings/gpio/rockchip,gpio-bank.yaml
+++ b/Documentation/devicetree/bindings/gpio/rockchip,gpio-bank.yaml
@@ -22,7 +22,10 @@ properties:
maxItems: 1
clocks:
- maxItems: 1
+ minItems: 1
+ items:
+ - description: APB interface clock source
+ - description: GPIO debounce reference clock source
gpio-controller: true
diff --git a/Documentation/devicetree/bindings/hwmon/amd,sbrmi.yaml b/Documentation/devicetree/bindings/hwmon/amd,sbrmi.yaml
new file mode 100644
index 000000000000..7598b083979c
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/amd,sbrmi.yaml
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/hwmon/amd,sbrmi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: >
+ Sideband Remote Management Interface (SB-RMI) compliant
+ AMD SoC power device.
+
+maintainers:
+ - Akshay Gupta <Akshay.Gupta@amd.com>
+
+description: |
+ SB Remote Management Interface (SB-RMI) is an SMBus compatible
+ interface that reports AMD SoC's Power (normalized Power) using,
+ Mailbox Service Request and resembles a typical 8-pin remote power
+ sensor's I2C interface to BMC. The power attributes in hwmon
+ reports power in microwatts.
+
+properties:
+ compatible:
+ enum:
+ - amd,sbrmi
+
+ reg:
+ maxItems: 1
+ description: |
+ I2C bus address of the device as specified in Section SBI SMBus Address
+ of the SoC register reference. The SB-RMI address is normally 78h for
+ socket 0 and 70h for socket 1, but it could vary based on hardware
+ address select pins.
+ \[open source SoC register reference\]
+ https://www.amd.com/en/support/tech-docs?keyword=55898
+
+required:
+ - compatible
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c0 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ sbrmi@3c {
+ compatible = "amd,sbrmi";
+ reg = <0x3c>;
+ };
+ };
+...
diff --git a/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml b/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml
new file mode 100644
index 000000000000..31ce77a4b087
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/winbond,w83781d.yaml
@@ -0,0 +1,41 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+
+$id: http://devicetree.org/schemas/hwmon/winbond,w83781d.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Winbond W83781 and compatible hardware monitor IC
+
+maintainers:
+ - Linus Walleij <linus.walleij@linaro.org>
+
+properties:
+ compatible:
+ enum:
+ - winbond,w83781d
+ - winbond,w83781g
+ - winbond,w83782d
+ - winbond,w83783s
+ - asus,as99127f
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ temperature-sensor@28 {
+ compatible = "winbond,w83781d";
+ reg = <0x28>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml
index b2a1e42c56fa..71de5631ebae 100644
--- a/Documentation/devicetree/bindings/iio/st,st-sensors.yaml
+++ b/Documentation/devicetree/bindings/iio/st,st-sensors.yaml
@@ -152,47 +152,6 @@ allOf:
maxItems: 1
st,drdy-int-pin: false
- - if:
- properties:
- compatible:
- enum:
- # Two intertial interrupts i.e. accelerometer/gyro interrupts
- - st,h3lis331dl-accel
- - st,l3g4200d-gyro
- - st,l3g4is-gyro
- - st,l3gd20-gyro
- - st,l3gd20h-gyro
- - st,lis2de12
- - st,lis2dw12
- - st,lis2hh12
- - st,lis2dh12-accel
- - st,lis331dl-accel
- - st,lis331dlh-accel
- - st,lis3de
- - st,lis3dh-accel
- - st,lis3dhh
- - st,lis3mdl-magn
- - st,lng2dm-accel
- - st,lps331ap-press
- - st,lsm303agr-accel
- - st,lsm303dlh-accel
- - st,lsm303dlhc-accel
- - st,lsm303dlm-accel
- - st,lsm330-accel
- - st,lsm330-gyro
- - st,lsm330d-accel
- - st,lsm330d-gyro
- - st,lsm330dl-accel
- - st,lsm330dl-gyro
- - st,lsm330dlc-accel
- - st,lsm330dlc-gyro
- - st,lsm9ds0-gyro
- - st,lsm9ds1-magn
- then:
- properties:
- interrupts:
- maxItems: 2
-
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/interconnect/qcom,osm-l3.yaml b/Documentation/devicetree/bindings/interconnect/qcom,osm-l3.yaml
index d6a95c3cb26f..e701524ee811 100644
--- a/Documentation/devicetree/bindings/interconnect/qcom,osm-l3.yaml
+++ b/Documentation/devicetree/bindings/interconnect/qcom,osm-l3.yaml
@@ -18,6 +18,7 @@ properties:
compatible:
enum:
- qcom,sc7180-osm-l3
+ - qcom,sc8180x-osm-l3
- qcom,sdm845-osm-l3
- qcom,sm8150-osm-l3
- qcom,sm8250-epss-l3
diff --git a/Documentation/devicetree/bindings/interconnect/qcom,rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,rpmh.yaml
index 5accc0d113be..3fd1a134162d 100644
--- a/Documentation/devicetree/bindings/interconnect/qcom,rpmh.yaml
+++ b/Documentation/devicetree/bindings/interconnect/qcom,rpmh.yaml
@@ -49,6 +49,17 @@ properties:
- qcom,sc7280-mmss-noc
- qcom,sc7280-nsp-noc
- qcom,sc7280-system-noc
+ - qcom,sc8180x-aggre1-noc
+ - qcom,sc8180x-aggre2-noc
+ - qcom,sc8180x-camnoc-virt
+ - qcom,sc8180x-compute-noc
+ - qcom,sc8180x-config-noc
+ - qcom,sc8180x-dc-noc
+ - qcom,sc8180x-gem-noc
+ - qcom,sc8180x-ipa-virt
+ - qcom,sc8180x-mc-virt
+ - qcom,sc8180x-mmss-noc
+ - qcom,sc8180x-system-noc
- qcom,sdm845-aggre1-noc
- qcom,sdm845-aggre2-noc
- qcom,sdm845-config-noc
diff --git a/Documentation/devicetree/bindings/leds/common.yaml b/Documentation/devicetree/bindings/leds/common.yaml
index b1f363747a62..697102707703 100644
--- a/Documentation/devicetree/bindings/leds/common.yaml
+++ b/Documentation/devicetree/bindings/leds/common.yaml
@@ -128,6 +128,12 @@ properties:
as a panic indicator.
type: boolean
+ retain-state-shutdown:
+ description:
+ This property specifies that the LED should not be turned off or changed
+ when the system shuts down.
+ type: boolean
+
trigger-sources:
description: |
List of devices which should be used as a source triggering this LED
diff --git a/Documentation/devicetree/bindings/misc/ge-achc.txt b/Documentation/devicetree/bindings/misc/ge-achc.txt
deleted file mode 100644
index 77df94d7a32f..000000000000
--- a/Documentation/devicetree/bindings/misc/ge-achc.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-* GE Healthcare USB Management Controller
-
-A device which handles data aquisition from compatible USB based peripherals.
-SPI is used for device management.
-
-Note: This device does not expose the peripherals as USB devices.
-
-Required properties:
-
-- compatible : Should be "ge,achc"
-
-Required SPI properties:
-
-- reg : Should be address of the device chip select within
- the controller.
-
-- spi-max-frequency : Maximum SPI clocking speed of device in Hz, should be
- 1MHz for the GE ACHC.
-
-Example:
-
-spidev0: spi@0 {
- compatible = "ge,achc";
- reg = <0>;
- spi-max-frequency = <1000000>;
-};
diff --git a/Documentation/devicetree/bindings/misc/ge-achc.yaml b/Documentation/devicetree/bindings/misc/ge-achc.yaml
new file mode 100644
index 000000000000..ff07aa62ed57
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/ge-achc.yaml
@@ -0,0 +1,65 @@
+# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
+# Copyright (C) 2021 GE Inc.
+# Copyright (C) 2021 Collabora Ltd.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/misc/ge-achc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: GE Healthcare USB Management Controller
+
+description: |
+ A device which handles data acquisition from compatible USB based peripherals.
+ SPI is used for device management.
+
+ Note: This device does not expose the peripherals as USB devices.
+
+maintainers:
+ - Sebastian Reichel <sre@kernel.org>
+
+properties:
+ compatible:
+ items:
+ - const: ge,achc
+ - const: nxp,kinetis-k20
+
+ clocks:
+ maxItems: 1
+
+ vdd-supply:
+ description: Digital power supply regulator on VDD pin
+
+ vdda-supply:
+ description: Analog power supply regulator on VDDA pin
+
+ reg:
+ items:
+ - description: Control interface
+ - description: Firmware programming interface
+
+ reset-gpios:
+ description: GPIO used for hardware reset.
+ maxItems: 1
+
+required:
+ - compatible
+ - clocks
+ - reg
+ - reset-gpios
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ spi {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ spi@1 {
+ compatible = "ge,achc", "nxp,kinetis-k20";
+ reg = <1>, <0>;
+ clocks = <&achc_24M>;
+ reset-gpios = <&gpio3 6 GPIO_ACTIVE_LOW>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml
index b5baf439fbac..a3412f221104 100644
--- a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml
+++ b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml
@@ -29,6 +29,7 @@ properties:
- fsl,imx53-esdhc
- fsl,imx6q-usdhc
- fsl,imx6sl-usdhc
+ - fsl,imx6sll-usdhc
- fsl,imx6sx-usdhc
- fsl,imx6ull-usdhc
- fsl,imx7d-usdhc
@@ -115,12 +116,17 @@ properties:
- const: per
pinctrl-names:
- minItems: 1
- items:
- - const: default
- - const: state_100mhz
- - const: state_200mhz
- - const: sleep
+ oneOf:
+ - minItems: 3
+ items:
+ - const: default
+ - const: state_100mhz
+ - const: state_200mhz
+ - const: sleep
+ - minItems: 1
+ items:
+ - const: default
+ - const: sleep
required:
- compatible
diff --git a/Documentation/devicetree/bindings/mmc/mmc-pwrseq-sd8787.yaml b/Documentation/devicetree/bindings/mmc/mmc-pwrseq-sd8787.yaml
index e0169a285aa2..9e2396751030 100644
--- a/Documentation/devicetree/bindings/mmc/mmc-pwrseq-sd8787.yaml
+++ b/Documentation/devicetree/bindings/mmc/mmc-pwrseq-sd8787.yaml
@@ -11,7 +11,9 @@ maintainers:
properties:
compatible:
- const: mmc-pwrseq-sd8787
+ enum:
+ - mmc-pwrseq-sd8787
+ - mmc-pwrseq-wilc1000
powerdown-gpios:
minItems: 1
diff --git a/Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml b/Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml
index 677989bc5924..9f1e7092cf44 100644
--- a/Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml
+++ b/Documentation/devicetree/bindings/mmc/renesas,sdhi.yaml
@@ -9,9 +9,6 @@ title: Renesas SDHI SD/MMC controller
maintainers:
- Wolfram Sang <wsa+renesas@sang-engineering.com>
-allOf:
- - $ref: "mmc-controller.yaml"
-
properties:
compatible:
oneOf:
@@ -47,19 +44,20 @@ properties:
- const: renesas,sdhi-mmc-r8a77470 # RZ/G1C (SDHI/MMC IP)
- items:
- enum:
- - renesas,sdhi-r8a774a1 # RZ/G2M
- - renesas,sdhi-r8a774b1 # RZ/G2N
- - renesas,sdhi-r8a774c0 # RZ/G2E
- - renesas,sdhi-r8a774e1 # RZ/G2H
- - renesas,sdhi-r8a7795 # R-Car H3
- - renesas,sdhi-r8a7796 # R-Car M3-W
- - renesas,sdhi-r8a77961 # R-Car M3-W+
- - renesas,sdhi-r8a77965 # R-Car M3-N
- - renesas,sdhi-r8a77970 # R-Car V3M
- - renesas,sdhi-r8a77980 # R-Car V3H
- - renesas,sdhi-r8a77990 # R-Car E3
- - renesas,sdhi-r8a77995 # R-Car D3
- - renesas,sdhi-r8a779a0 # R-Car V3U
+ - renesas,sdhi-r8a774a1 # RZ/G2M
+ - renesas,sdhi-r8a774b1 # RZ/G2N
+ - renesas,sdhi-r8a774c0 # RZ/G2E
+ - renesas,sdhi-r8a774e1 # RZ/G2H
+ - renesas,sdhi-r8a7795 # R-Car H3
+ - renesas,sdhi-r8a7796 # R-Car M3-W
+ - renesas,sdhi-r8a77961 # R-Car M3-W+
+ - renesas,sdhi-r8a77965 # R-Car M3-N
+ - renesas,sdhi-r8a77970 # R-Car V3M
+ - renesas,sdhi-r8a77980 # R-Car V3H
+ - renesas,sdhi-r8a77990 # R-Car E3
+ - renesas,sdhi-r8a77995 # R-Car D3
+ - renesas,sdhi-r8a779a0 # R-Car V3U
+ - renesas,sdhi-r9a07g044 # RZ/G2{L,LC}
- const: renesas,rcar-gen3-sdhi # R-Car Gen3 or RZ/G2
reg:
@@ -69,15 +67,9 @@ properties:
minItems: 1
maxItems: 3
- clocks:
- minItems: 1
- maxItems: 2
+ clocks: true
- clock-names:
- minItems: 1
- items:
- - const: core
- - const: cd
+ clock-names: true
dmas:
minItems: 4
@@ -104,14 +96,82 @@ properties:
pinctrl-1:
maxItems: 1
- pinctrl-names:
- minItems: 1
- items:
- - const: default
- - const: state_uhs
+ pinctrl-names: true
max-frequency: true
+allOf:
+ - $ref: "mmc-controller.yaml"
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: renesas,sdhi-r9a07g044
+ then:
+ properties:
+ clocks:
+ items:
+ - description: IMCLK, SDHI channel main clock1.
+ - description: IMCLK2, SDHI channel main clock2. When this clock is
+ turned off, external SD card detection cannot be
+ detected.
+ - description: CLK_HS, SDHI channel High speed clock which operates
+ 4 times that of SDHI channel main clock1.
+ - description: ACLK, SDHI channel bus clock.
+ clock-names:
+ items:
+ - const: imclk
+ - const: imclk2
+ - const: clk_hs
+ - const: aclk
+ required:
+ - clock-names
+ - resets
+ else:
+ properties:
+ clocks:
+ minItems: 1
+ maxItems: 2
+ clock-names:
+ minItems: 1
+ items:
+ - const: core
+ - const: cd
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: renesas,sdhi-mmc-r8a77470
+ then:
+ properties:
+ pinctrl-names:
+ items:
+ - const: state_uhs
+ else:
+ properties:
+ pinctrl-names:
+ minItems: 1
+ items:
+ - const: default
+ - const: state_uhs
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - renesas,sdhi-r7s72100
+ - renesas,sdhi-r7s9210
+ then:
+ required:
+ - clock-names
+ description:
+ The internal card detection logic that exists in these controllers is
+ sectioned off to be run by a separate second clock source to allow
+ the main core clock to be turned off to save power.
+
required:
- compatible
- reg
@@ -119,21 +179,6 @@ required:
- clocks
- power-domains
-if:
- properties:
- compatible:
- contains:
- enum:
- - renesas,sdhi-r7s72100
- - renesas,sdhi-r7s9210
-then:
- required:
- - clock-names
- description:
- The internal card detection logic that exists in these controllers is
- sectioned off to be run by a separate second clock source to allow
- the main core clock to be turned off to save power.
-
unevaluatedProperties: false
examples:
diff --git a/Documentation/devicetree/bindings/mmc/sdhci-msm.txt b/Documentation/devicetree/bindings/mmc/sdhci-msm.txt
index 4c7fa6a4ed15..365c3fc122ea 100644
--- a/Documentation/devicetree/bindings/mmc/sdhci-msm.txt
+++ b/Documentation/devicetree/bindings/mmc/sdhci-msm.txt
@@ -19,6 +19,7 @@ Required properties:
"qcom,msm8996-sdhci", "qcom,sdhci-msm-v4"
"qcom,qcs404-sdhci", "qcom,sdhci-msm-v5"
"qcom,sc7180-sdhci", "qcom,sdhci-msm-v5";
+ "qcom,sc7280-sdhci", "qcom,sdhci-msm-v5";
"qcom,sdm845-sdhci", "qcom,sdhci-msm-v5"
"qcom,sdx55-sdhci", "qcom,sdhci-msm-v5";
"qcom,sm8250-sdhci", "qcom,sdhci-msm-v5"
diff --git a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.txt b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.txt
deleted file mode 100644
index e15589f47787..000000000000
--- a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.txt
+++ /dev/null
@@ -1,43 +0,0 @@
-* Broadcom UniMAC MDIO bus controller
-
-Required properties:
-- compatible: should one from "brcm,genet-mdio-v1", "brcm,genet-mdio-v2",
- "brcm,genet-mdio-v3", "brcm,genet-mdio-v4", "brcm,genet-mdio-v5" or
- "brcm,unimac-mdio"
-- reg: address and length of the register set for the device, first one is the
- base register, and the second one is optional and for indirect accesses to
- larger than 16-bits MDIO transactions
-- reg-names: name(s) of the register must be "mdio" and optional "mdio_indir_rw"
-- #size-cells: must be 1
-- #address-cells: must be 0
-
-Optional properties:
-- interrupts: must be one if the interrupt is shared with the Ethernet MAC or
- Ethernet switch this MDIO block is integrated from, or must be two, if there
- are two separate interrupts, first one must be "mdio done" and second must be
- for "mdio error"
-- interrupt-names: must be "mdio_done_error" when there is a share interrupt fed
- to this hardware block, or must be "mdio_done" for the first interrupt and
- "mdio_error" for the second when there are separate interrupts
-- clocks: A reference to the clock supplying the MDIO bus controller
-- clock-frequency: the MDIO bus clock that must be output by the MDIO bus
- hardware, if absent, the default hardware values are used
-
-Child nodes of this MDIO bus controller node are standard Ethernet PHY device
-nodes as described in Documentation/devicetree/bindings/net/phy.txt
-
-Example:
-
-mdio@403c0 {
- compatible = "brcm,unimac-mdio";
- reg = <0x403c0 0x8 0x40300 0x18>;
- reg-names = "mdio", "mdio_indir_rw";
- #size-cells = <1>;
- #address-cells = <0>;
-
- ...
- phy@0 {
- compatible = "ethernet-phy-ieee802.3-c22";
- reg = <0>;
- };
-};
diff --git a/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
new file mode 100644
index 000000000000..f4f4c37f1d4e
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/brcm,unimac-mdio.yaml
@@ -0,0 +1,84 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/brcm,unimac-mdio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Broadcom UniMAC MDIO bus controller
+
+maintainers:
+ - Rafał Miłecki <rafal@milecki.pl>
+
+allOf:
+ - $ref: mdio.yaml#
+
+properties:
+ compatible:
+ enum:
+ - brcm,genet-mdio-v1
+ - brcm,genet-mdio-v2
+ - brcm,genet-mdio-v3
+ - brcm,genet-mdio-v4
+ - brcm,genet-mdio-v5
+ - brcm,unimac-mdio
+
+ reg:
+ minItems: 1
+ items:
+ - description: base register
+ - description: indirect accesses to larger than 16-bits MDIO transactions
+
+ reg-names:
+ minItems: 1
+ items:
+ - const: mdio
+ - const: mdio_indir_rw
+
+ interrupts:
+ oneOf:
+ - description: >
+ Interrupt shared with the Ethernet MAC or Ethernet switch this MDIO
+ block is integrated from
+ - items:
+ - description: |
+ "mdio done" interrupt
+ - description: |
+ "mdio error" interrupt
+
+ interrupt-names:
+ oneOf:
+ - const: mdio_done_error
+ - items:
+ - const: mdio_done
+ - const: mdio_error
+
+ clocks:
+ description: A reference to the clock supplying the MDIO bus controller
+
+ clock-frequency:
+ description: >
+ The MDIO bus clock that must be output by the MDIO bus hardware, if
+ absent, the default hardware values are used
+
+unevaluatedProperties: false
+
+required:
+ - reg
+ - reg-names
+ - '#address-cells'
+ - '#size-cells'
+
+examples:
+ - |
+ mdio@403c0 {
+ compatible = "brcm,unimac-mdio";
+ reg = <0x403c0 0x8>, <0x40300 0x18>;
+ reg-names = "mdio", "mdio_indir_rw";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethernet-phy@0 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <0>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/can/bosch,c_can.yaml b/Documentation/devicetree/bindings/net/can/bosch,c_can.yaml
new file mode 100644
index 000000000000..2cd145a642f1
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/bosch,c_can.yaml
@@ -0,0 +1,119 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/can/bosch,c_can.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Bosch C_CAN/D_CAN controller Device Tree Bindings
+
+description: Bosch C_CAN/D_CAN controller for CAN bus
+
+maintainers:
+ - Dario Binacchi <dariobin@libero.it>
+
+allOf:
+ - $ref: can-controller.yaml#
+
+properties:
+ compatible:
+ oneOf:
+ - enum:
+ - bosch,c_can
+ - bosch,d_can
+ - ti,dra7-d_can
+ - ti,am3352-d_can
+ - items:
+ - enum:
+ - ti,am4372-d_can
+ - const: ti,am3352-d_can
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ minItems: 1
+ maxItems: 4
+
+ power-domains:
+ description: |
+ Should contain a phandle to a PM domain provider node and an args
+ specifier containing the DCAN device id value. It's mandatory for
+ Keystone 2 66AK2G SoCs only.
+ maxItems: 1
+
+ clocks:
+ description: |
+ CAN functional clock phandle.
+ maxItems: 1
+
+ clock-names:
+ maxItems: 1
+
+ syscon-raminit:
+ description: |
+ Handle to system control region that contains the RAMINIT register,
+ register offset to the RAMINIT register and the CAN instance number (0
+ offset).
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ items:
+ items:
+ - description: The phandle to the system control region.
+ - description: The register offset.
+ - description: The CAN instance number.
+
+ resets:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+
+if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - bosch,d_can
+
+then:
+ properties:
+ interrupts:
+ minItems: 4
+ maxItems: 4
+ items:
+ - description: Error and status IRQ
+ - description: Message object IRQ
+ - description: RAM ECC correctable error IRQ
+ - description: RAM ECC non-correctable error IRQ
+
+else:
+ properties:
+ interrupts:
+ maxItems: 1
+ items:
+ - description: Error and status IRQ
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/reset/altr,rst-mgr.h>
+
+ can@ffc00000 {
+ compatible = "bosch,d_can";
+ reg = <0xffc00000 0x1000>;
+ interrupts = <0 131 4>, <0 132 4>, <0 133 4>, <0 134 4>;
+ clocks = <&can0_clk>;
+ resets = <&rst CAN0_RESET>;
+ };
+ - |
+ can@0 {
+ compatible = "ti,am3352-d_can";
+ reg = <0x0 0x2000>;
+ clocks = <&dcan1_fck>;
+ clock-names = "fck";
+ syscon-raminit = <&scm_conf 0x644 1>;
+ interrupts = <55>;
+ };
diff --git a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
index f84e31348d80..fb547e26c676 100644
--- a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
+++ b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
@@ -104,9 +104,18 @@ properties:
maximum: 32
maxItems: 1
+ power-domains:
+ description:
+ Power domain provider node and an args specifier containing
+ the can device id value.
+ maxItems: 1
+
can-transceiver:
$ref: can-transceiver.yaml#
+ phys:
+ maxItems: 1
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/net/can/c_can.txt b/Documentation/devicetree/bindings/net/can/c_can.txt
deleted file mode 100644
index 366479806acb..000000000000
--- a/Documentation/devicetree/bindings/net/can/c_can.txt
+++ /dev/null
@@ -1,65 +0,0 @@
-Bosch C_CAN/D_CAN controller Device Tree Bindings
--------------------------------------------------
-
-Required properties:
-- compatible : Should be "bosch,c_can" for C_CAN controllers and
- "bosch,d_can" for D_CAN controllers.
- Can be "ti,dra7-d_can", "ti,am3352-d_can" or
- "ti,am4372-d_can".
-- reg : physical base address and size of the C_CAN/D_CAN
- registers map
-- interrupts : property with a value describing the interrupt
- number
-
-The following are mandatory properties for DRA7x, AM33xx and AM43xx SoCs only:
-- ti,hwmods : Must be "d_can<n>" or "c_can<n>", n being the
- instance number
-
-The following are mandatory properties for Keystone 2 66AK2G SoCs only:
-- power-domains : Should contain a phandle to a PM domain provider node
- and an args specifier containing the DCAN device id
- value. This property is as per the binding,
- Documentation/devicetree/bindings/soc/ti/sci-pm-domain.yaml
-- clocks : CAN functional clock phandle. This property is as per the
- binding,
- Documentation/devicetree/bindings/clock/ti,sci-clk.yaml
-
-Optional properties:
-- syscon-raminit : Handle to system control region that contains the
- RAMINIT register, register offset to the RAMINIT
- register and the CAN instance number (0 offset).
-
-Note: "ti,hwmods" field is used to fetch the base address and irq
-resources from TI, omap hwmod data base during device registration.
-Future plan is to migrate hwmod data base contents into device tree
-blob so that, all the required data will be used from device tree dts
-file.
-
-Example:
-
-Step1: SoC common .dtsi file
-
- dcan1: d_can@481d0000 {
- compatible = "bosch,d_can";
- reg = <0x481d0000 0x2000>;
- interrupts = <55>;
- interrupt-parent = <&intc>;
- status = "disabled";
- };
-
-(or)
-
- dcan1: d_can@481d0000 {
- compatible = "bosch,d_can";
- ti,hwmods = "d_can1";
- reg = <0x481d0000 0x2000>;
- interrupts = <55>;
- interrupt-parent = <&intc>;
- status = "disabled";
- };
-
-Step 2: board specific .dts file
-
- &dcan1 {
- status = "okay";
- };
diff --git a/Documentation/devicetree/bindings/net/can/can-controller.yaml b/Documentation/devicetree/bindings/net/can/can-controller.yaml
index 9cf2ae097156..1f0e98051074 100644
--- a/Documentation/devicetree/bindings/net/can/can-controller.yaml
+++ b/Documentation/devicetree/bindings/net/can/can-controller.yaml
@@ -13,6 +13,15 @@ properties:
$nodename:
pattern: "^can(@.*)?$"
+ termination-gpios:
+ description: GPIO pin to enable CAN bus termination.
+ maxItems: 1
+
+ termination-ohms:
+ description: The resistance value of the CAN bus termination resistor.
+ minimum: 1
+ maximum: 65535
+
additionalProperties: true
...
diff --git a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
index 55bff1586b6f..3f0ee17c1461 100644
--- a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
+++ b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
@@ -119,6 +119,9 @@ properties:
minimum: 0
maximum: 2
+ termination-gpios: true
+ termination-ohms: true
+
required:
- compatible
- reg
@@ -148,3 +151,17 @@ examples:
fsl,stop-mode = <&gpr 0x34 28>;
fsl,scu-index = /bits/ 8 <1>;
};
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/gpio/gpio.h>
+
+ can@2090000 {
+ compatible = "fsl,imx6q-flexcan";
+ reg = <0x02090000 0x4000>;
+ interrupts = <0 110 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&clks 1>, <&clks 2>;
+ clock-names = "ipg", "per";
+ fsl,stop-mode = <&gpr 0x34 28>;
+ termination-gpios = <&gpio1 0 GPIO_ACTIVE_LOW>;
+ termination-ohms = <120>;
+ };
diff --git a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
index 0b33ba9ccb47..546c6e6d2fb0 100644
--- a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
+++ b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
@@ -30,13 +30,15 @@ properties:
- renesas,r8a77995-canfd # R-Car D3
- const: renesas,rcar-gen3-canfd # R-Car Gen3 and RZ/G2
+ - items:
+ - enum:
+ - renesas,r9a07g044-canfd # RZ/G2{L,LC}
+ - const: renesas,rzg2l-canfd # RZ/G2L family
+
reg:
maxItems: 1
- interrupts:
- items:
- - description: Channel interrupt
- - description: Global interrupt
+ interrupts: true
clocks:
maxItems: 3
@@ -50,8 +52,7 @@ properties:
power-domains:
maxItems: 1
- resets:
- maxItems: 1
+ resets: true
renesas,no-can-fd:
$ref: /schemas/types.yaml#/definitions/flag
@@ -91,6 +92,62 @@ required:
- channel0
- channel1
+if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - renesas,rzg2l-canfd
+then:
+ properties:
+ interrupts:
+ items:
+ - description: CAN global error interrupt
+ - description: CAN receive FIFO interrupt
+ - description: CAN0 error interrupt
+ - description: CAN0 transmit interrupt
+ - description: CAN0 transmit/receive FIFO receive completion interrupt
+ - description: CAN1 error interrupt
+ - description: CAN1 transmit interrupt
+ - description: CAN1 transmit/receive FIFO receive completion interrupt
+
+ interrupt-names:
+ items:
+ - const: g_err
+ - const: g_recc
+ - const: ch0_err
+ - const: ch0_rec
+ - const: ch0_trx
+ - const: ch1_err
+ - const: ch1_rec
+ - const: ch1_trx
+
+ resets:
+ maxItems: 2
+
+ reset-names:
+ items:
+ - const: rstp_n
+ - const: rstc_n
+
+ required:
+ - interrupt-names
+ - reset-names
+else:
+ properties:
+ interrupts:
+ items:
+ - description: Channel interrupt
+ - description: Global interrupt
+
+ interrupt-names:
+ items:
+ - const: ch_int
+ - const: g_int
+
+ resets:
+ maxItems: 1
+
unevaluatedProperties: false
examples:
diff --git a/Documentation/devicetree/bindings/net/fsl,fec.yaml b/Documentation/devicetree/bindings/net/fsl,fec.yaml
new file mode 100644
index 000000000000..eca41443fcce
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/fsl,fec.yaml
@@ -0,0 +1,244 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/fsl,fec.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale Fast Ethernet Controller (FEC)
+
+maintainers:
+ - Joakim Zhang <qiangqing.zhang@nxp.com>
+
+allOf:
+ - $ref: ethernet-controller.yaml#
+
+properties:
+ compatible:
+ oneOf:
+ - enum:
+ - fsl,imx25-fec
+ - fsl,imx27-fec
+ - fsl,imx28-fec
+ - fsl,imx6q-fec
+ - fsl,mvf600-fec
+ - items:
+ - enum:
+ - fsl,imx53-fec
+ - fsl,imx6sl-fec
+ - const: fsl,imx25-fec
+ - items:
+ - enum:
+ - fsl,imx35-fec
+ - fsl,imx51-fec
+ - const: fsl,imx27-fec
+ - items:
+ - enum:
+ - fsl,imx6ul-fec
+ - fsl,imx6sx-fec
+ - const: fsl,imx6q-fec
+ - items:
+ - enum:
+ - fsl,imx7d-fec
+ - const: fsl,imx6sx-fec
+ - items:
+ - const: fsl,imx8mq-fec
+ - const: fsl,imx6sx-fec
+ - items:
+ - enum:
+ - fsl,imx8mm-fec
+ - fsl,imx8mn-fec
+ - fsl,imx8mp-fec
+ - const: fsl,imx8mq-fec
+ - const: fsl,imx6sx-fec
+ - items:
+ - const: fsl,imx8qm-fec
+ - const: fsl,imx6sx-fec
+ - items:
+ - enum:
+ - fsl,imx8qxp-fec
+ - const: fsl,imx8qm-fec
+ - const: fsl,imx6sx-fec
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ minItems: 1
+ maxItems: 4
+
+ interrupt-names:
+ oneOf:
+ - items:
+ - const: int0
+ - items:
+ - const: int0
+ - const: pps
+ - items:
+ - const: int0
+ - const: int1
+ - const: int2
+ - items:
+ - const: int0
+ - const: int1
+ - const: int2
+ - const: pps
+
+ clocks:
+ minItems: 2
+ maxItems: 5
+ description:
+ The "ipg", for MAC ipg_clk_s, ipg_clk_mac_s that are for register accessing.
+ The "ahb", for MAC ipg_clk, ipg_clk_mac that are bus clock.
+ The "ptp"(option), for IEEE1588 timer clock that requires the clock.
+ The "enet_clk_ref"(option), for MAC transmit/receiver reference clock like
+ RGMII TXC clock or RMII reference clock. It depends on board design,
+ the clock is required if RGMII TXC and RMII reference clock source from
+ SOC internal PLL.
+ The "enet_out"(option), output clock for external device, like supply clock
+ for PHY. The clock is required if PHY clock source from SOC.
+ The "enet_2x_txclk"(option), for RGMII sampling clock which fixed at 250Mhz.
+ The clock is required if SoC RGMII enable clock delay.
+
+ clock-names:
+ minItems: 2
+ maxItems: 5
+ items:
+ enum:
+ - ipg
+ - ahb
+ - ptp
+ - enet_clk_ref
+ - enet_out
+ - enet_2x_txclk
+
+ phy-mode: true
+
+ phy-handle: true
+
+ fixed-link: true
+
+ local-mac-address: true
+
+ mac-address: true
+
+ tx-internal-delay-ps:
+ enum: [0, 2000]
+
+ rx-internal-delay-ps:
+ enum: [0, 2000]
+
+ phy-supply:
+ description:
+ Regulator that powers the Ethernet PHY.
+
+ fsl,num-tx-queues:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The property is valid for enet-avb IP, which supports hw multi queues.
+ Should specify the tx queue number, otherwise set tx queue number to 1.
+ enum: [1, 2, 3]
+
+ fsl,num-rx-queues:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The property is valid for enet-avb IP, which supports hw multi queues.
+ Should specify the rx queue number, otherwise set rx queue number to 1.
+ enum: [1, 2, 3]
+
+ fsl,magic-packet:
+ $ref: /schemas/types.yaml#/definitions/flag
+ description:
+ If present, indicates that the hardware supports waking up via magic packet.
+
+ fsl,err006687-workaround-present:
+ $ref: /schemas/types.yaml#/definitions/flag
+ description:
+ If present indicates that the system has the hardware workaround for
+ ERR006687 applied and does not need a software workaround.
+
+ fsl,stop-mode:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ description:
+ Register bits of stop mode control, the format is <&gpr req_gpr req_bit>.
+ gpr is the phandle to general purpose register node.
+ req_gpr is the gpr register offset for ENET stop request.
+ req_bit is the gpr bit offset for ENET stop request.
+
+ mdio:
+ type: object
+ description:
+ Specifies the mdio bus in the FEC, used as a container for phy nodes.
+
+ # Deprecated optional properties:
+ # To avoid these, create a phy node according to ethernet-phy.yaml in the same
+ # directory, and point the FEC's "phy-handle" property to it. Then use
+ # the phy's reset binding, again described by ethernet-phy.yaml.
+
+ phy-reset-gpios:
+ deprecated: true
+ description:
+ Should specify the gpio for phy reset.
+
+ phy-reset-duration:
+ deprecated: true
+ description:
+ Reset duration in milliseconds. Should present only if property
+ "phy-reset-gpios" is available. Missing the property will have the
+ duration be 1 millisecond. Numbers greater than 1000 are invalid
+ and 1 millisecond will be used instead.
+
+ phy-reset-active-high:
+ deprecated: true
+ description:
+ If present then the reset sequence using the GPIO specified in the
+ "phy-reset-gpios" property is reversed (H=reset state, L=operation state).
+
+ phy-reset-post-delay:
+ deprecated: true
+ description:
+ Post reset delay in milliseconds. If present then a delay of phy-reset-post-delay
+ milliseconds will be observed after the phy-reset-gpios has been toggled.
+ Can be omitted thus no delay is observed. Delay is in range of 1ms to 1000ms.
+ Other delays are invalid.
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+# FIXME: We had better set additionalProperties to false to avoid invalid or at
+# least undocumented properties. However, PHY may have a deprecated option to
+# place PHY OF properties in the MAC node, such as Micrel PHY, and we can find
+# these boards which is based on i.MX6QDL.
+additionalProperties: false
+
+examples:
+ - |
+ ethernet@83fec000 {
+ compatible = "fsl,imx51-fec", "fsl,imx27-fec";
+ reg = <0x83fec000 0x4000>;
+ interrupts = <87>;
+ phy-mode = "mii";
+ phy-reset-gpios = <&gpio2 14 0>;
+ phy-supply = <&reg_fec_supply>;
+ };
+
+ ethernet@83fed000 {
+ compatible = "fsl,imx51-fec", "fsl,imx27-fec";
+ reg = <0x83fed000 0x4000>;
+ interrupts = <87>;
+ phy-mode = "mii";
+ phy-reset-gpios = <&gpio2 14 0>;
+ phy-supply = <&reg_fec_supply>;
+ phy-handle = <&ethphy0>;
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethphy0: ethernet-phy@0 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <0>;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt b/Documentation/devicetree/bindings/net/fsl-fec.txt
deleted file mode 100644
index 9b543789cd52..000000000000
--- a/Documentation/devicetree/bindings/net/fsl-fec.txt
+++ /dev/null
@@ -1,95 +0,0 @@
-* Freescale Fast Ethernet Controller (FEC)
-
-Required properties:
-- compatible : Should be "fsl,<soc>-fec"
-- reg : Address and length of the register set for the device
-- interrupts : Should contain fec interrupt
-- phy-mode : See ethernet.txt file in the same directory
-
-Optional properties:
-- phy-supply : regulator that powers the Ethernet PHY.
-- phy-handle : phandle to the PHY device connected to this device.
-- fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
- Use instead of phy-handle.
-- fsl,num-tx-queues : The property is valid for enet-avb IP, which supports
- hw multi queues. Should specify the tx queue number, otherwise set tx queue
- number to 1.
-- fsl,num-rx-queues : The property is valid for enet-avb IP, which supports
- hw multi queues. Should specify the rx queue number, otherwise set rx queue
- number to 1.
-- fsl,magic-packet : If present, indicates that the hardware supports waking
- up via magic packet.
-- fsl,err006687-workaround-present: If present indicates that the system has
- the hardware workaround for ERR006687 applied and does not need a software
- workaround.
-- fsl,stop-mode: register bits of stop mode control, the format is
- <&gpr req_gpr req_bit>.
- gpr is the phandle to general purpose register node.
- req_gpr is the gpr register offset for ENET stop request.
- req_bit is the gpr bit offset for ENET stop request.
- -interrupt-names: names of the interrupts listed in interrupts property in
- the same order. The defaults if not specified are
- __Number of interrupts__ __Default__
- 1 "int0"
- 2 "int0", "pps"
- 3 "int0", "int1", "int2"
- 4 "int0", "int1", "int2", "pps"
- The order may be changed as long as they correspond to the interrupts
- property. Currently, only i.mx7 uses "int1" and "int2". They correspond to
- tx/rx queues 1 and 2. "int0" will be used for queue 0 and ENET_MII interrupts.
- For imx6sx, "int0" handles all 3 queues and ENET_MII. "pps" is for the pulse
- per second interrupt associated with 1588 precision time protocol(PTP).
-
-Optional subnodes:
-- mdio : specifies the mdio bus in the FEC, used as a container for phy nodes
- according to phy.txt in the same directory
-
-Deprecated optional properties:
- To avoid these, create a phy node according to phy.txt in the same
- directory, and point the fec's "phy-handle" property to it. Then use
- the phy's reset binding, again described by phy.txt.
-- phy-reset-gpios : Should specify the gpio for phy reset
-- phy-reset-duration : Reset duration in milliseconds. Should present
- only if property "phy-reset-gpios" is available. Missing the property
- will have the duration be 1 millisecond. Numbers greater than 1000 are
- invalid and 1 millisecond will be used instead.
-- phy-reset-active-high : If present then the reset sequence using the GPIO
- specified in the "phy-reset-gpios" property is reversed (H=reset state,
- L=operation state).
-- phy-reset-post-delay : Post reset delay in milliseconds. If present then
- a delay of phy-reset-post-delay milliseconds will be observed after the
- phy-reset-gpios has been toggled. Can be omitted thus no delay is
- observed. Delay is in range of 1ms to 1000ms. Other delays are invalid.
-
-Example:
-
-ethernet@83fec000 {
- compatible = "fsl,imx51-fec", "fsl,imx27-fec";
- reg = <0x83fec000 0x4000>;
- interrupts = <87>;
- phy-mode = "mii";
- phy-reset-gpios = <&gpio2 14 GPIO_ACTIVE_LOW>; /* GPIO2_14 */
- local-mac-address = [00 04 9F 01 1B B9];
- phy-supply = <&reg_fec_supply>;
-};
-
-Example with phy specified:
-
-ethernet@83fec000 {
- compatible = "fsl,imx51-fec", "fsl,imx27-fec";
- reg = <0x83fec000 0x4000>;
- interrupts = <87>;
- phy-mode = "mii";
- phy-reset-gpios = <&gpio2 14 GPIO_ACTIVE_LOW>; /* GPIO2_14 */
- local-mac-address = [00 04 9F 01 1B B9];
- phy-supply = <&reg_fec_supply>;
- phy-handle = <&ethphy>;
- mdio {
- clock-frequency = <5000000>;
- ethphy: ethernet-phy@6 {
- compatible = "ethernet-phy-ieee802.3-c22";
- reg = <6>;
- max-speed = <100>;
- };
- };
-};
diff --git a/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml
new file mode 100644
index 000000000000..8b9b3f915d92
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml
@@ -0,0 +1,54 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright 2018 Linaro Ltd.
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/net/intel,ixp46x-ptp-timer.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Intel IXP46x PTP Timer (TSYNC)
+
+maintainers:
+ - Linus Walleij <linus.walleij@linaro.org>
+
+description: |
+ The Intel IXP46x PTP timer is known in the manual as IEEE1588 Hardware
+ Assist and Time Synchronization Hardware Assist TSYNC provides a PTP
+ timer. It exists in the Intel IXP45x and IXP46x XScale SoCs.
+
+properties:
+ compatible:
+ const: intel,ixp46x-ptp-timer
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ items:
+ - description: Interrupt to trigger master mode snapshot from the
+ PRP timer, usually a GPIO interrupt.
+ - description: Interrupt to trigger slave mode snapshot from the
+ PRP timer, usually a GPIO interrupt.
+
+ interrupt-names:
+ items:
+ - const: master
+ - const: slave
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - interrupt-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ ptp-timer@c8010000 {
+ compatible = "intel,ixp46x-ptp-timer";
+ reg = <0xc8010000 0x1000>;
+ interrupt-parent = <&gpio0>;
+ interrupts = <8 IRQ_TYPE_EDGE_FALLING>, <7 IRQ_TYPE_EDGE_FALLING>;
+ interrupt-names = "master", "slave";
+ };
diff --git a/Documentation/devicetree/bindings/net/litex,liteeth.yaml b/Documentation/devicetree/bindings/net/litex,liteeth.yaml
new file mode 100644
index 000000000000..76c164a8199a
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/litex,liteeth.yaml
@@ -0,0 +1,98 @@
+# SPDX-License-Identifier: GPL-2.0-or-later OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/litex,liteeth.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: LiteX LiteETH ethernet device
+
+maintainers:
+ - Joel Stanley <joel@jms.id.au>
+
+description: |
+ LiteETH is a small footprint and configurable Ethernet core for FPGA based
+ system on chips.
+
+ The hardware source is Open Source and can be found on at
+ https://github.com/enjoy-digital/liteeth/.
+
+allOf:
+ - $ref: ethernet-controller.yaml#
+
+properties:
+ compatible:
+ const: litex,liteeth
+
+ reg:
+ items:
+ - description: MAC registers
+ - description: MDIO registers
+ - description: Packet buffer
+
+ reg-names:
+ items:
+ - const: mac
+ - const: mdio
+ - const: buffer
+
+ interrupts:
+ maxItems: 1
+
+ litex,rx-slots:
+ description: Number of slots in the receive buffer
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ default: 2
+
+ litex,tx-slots:
+ description: Number of slots in the transmit buffer
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ default: 2
+
+ litex,slot-size:
+ description: Size in bytes of a slot in the tx/rx buffer
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 0x800
+ default: 0x800
+
+ mac-address: true
+ local-mac-address: true
+ phy-handle: true
+
+ mdio:
+ $ref: mdio.yaml#
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ - |
+ mac: ethernet@8020000 {
+ compatible = "litex,liteeth";
+ reg = <0x8021000 0x100>,
+ <0x8020800 0x100>,
+ <0x8030000 0x2000>;
+ reg-names = "mac", "mdio", "buffer";
+ litex,rx-slots = <2>;
+ litex,tx-slots = <2>;
+ litex,slot-size = <0x800>;
+ interrupts = <0x11 0x1>;
+ phy-handle = <&eth_phy>;
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ eth_phy: ethernet-phy@0 {
+ reg = <0>;
+ };
+ };
+ };
+...
+
+# vim: set ts=2 sw=2 sts=2 tw=80 et cc=80 ft=yaml :
diff --git a/Documentation/devicetree/bindings/net/macb.txt b/Documentation/devicetree/bindings/net/macb.txt
index a4d547efc32a..af9df2f01a1c 100644
--- a/Documentation/devicetree/bindings/net/macb.txt
+++ b/Documentation/devicetree/bindings/net/macb.txt
@@ -8,6 +8,7 @@ Required properties:
Use "cdns,np4-macb" for NP4 SoC devices.
Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: "cdns,macb".
Use "atmel,sama5d2-gem" for the GEM IP (10/100) available on Atmel sama5d2 SoCs.
+ Use "atmel,sama5d29-gem" for GEM XL IP (10/100) available on Atmel sama5d29 SoCs.
Use "atmel,sama5d3-macb" for the 10/100Mbit IP available on Atmel sama5d3 SoCs.
Use "atmel,sama5d3-gem" for the Gigabit IP available on Atmel sama5d3 SoCs.
Use "atmel,sama5d4-gem" for the GEM IP (10/100) available on Atmel sama5d4 SoCs.
diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
index ed88ba4b94df..b8a0b392b24e 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@ -87,16 +87,24 @@ properties:
- const: ipa-setup-ready
interconnects:
- items:
- - description: Interconnect path between IPA and main memory
- - description: Interconnect path between IPA and internal memory
- - description: Interconnect path between IPA and the AP subsystem
+ oneOf:
+ - items:
+ - description: Path leading to system memory
+ - description: Path between the AP and IPA config space
+ - items:
+ - description: Path leading to system memory
+ - description: Path leading to internal memory
+ - description: Path between the AP and IPA config space
interconnect-names:
- items:
- - const: memory
- - const: imem
- - const: config
+ oneOf:
+ - items:
+ - const: memory
+ - const: config
+ - items:
+ - const: memory
+ - const: imem
+ - const: config
qcom,smem-states:
$ref: /schemas/types.yaml#/definitions/phandle-array
diff --git a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
index 0c973310ada0..2af304341772 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml
@@ -14,7 +14,9 @@ allOf:
properties:
compatible:
- const: qcom,ipq4019-mdio
+ enum:
+ - qcom,ipq4019-mdio
+ - qcom,ipq5018-mdio
"#address-cells":
const: 1
@@ -23,7 +25,18 @@ properties:
const: 0
reg:
+ minItems: 1
+ maxItems: 2
+ description:
+ the first Address and length of the register set for the MDIO controller.
+ the second Address and length of the register for ethernet LDO, this second
+ address range is only required by the platform IPQ50xx.
+
+ clocks:
maxItems: 1
+ description: |
+ MDIO clock source frequency fixed to 100MHZ, this clock should be specified
+ by the platform IPQ807x, IPQ60xx and IPQ50xx.
required:
- compatible
diff --git a/Documentation/devicetree/bindings/nvmem/nintendo-otp.yaml b/Documentation/devicetree/bindings/nvmem/nintendo-otp.yaml
new file mode 100644
index 000000000000..dbe4ffdd644c
--- /dev/null
+++ b/Documentation/devicetree/bindings/nvmem/nintendo-otp.yaml
@@ -0,0 +1,44 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/nvmem/nintendo-otp.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Nintendo Wii and Wii U OTP Device Tree Bindings
+
+description: |
+ This binding represents the OTP memory as found on a Nintendo Wii or Wii U,
+ which contains common and per-console keys, signatures and related data
+ required to access peripherals.
+
+ See https://wiiubrew.org/wiki/Hardware/OTP
+
+maintainers:
+ - Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
+
+allOf:
+ - $ref: "nvmem.yaml#"
+
+properties:
+ compatible:
+ enum:
+ - nintendo,hollywood-otp
+ - nintendo,latte-otp
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ otp@d8001ec {
+ compatible = "nintendo,latte-otp";
+ reg = <0x0d8001ec 0x8>;
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
index 861b205016b1..dede8892ee01 100644
--- a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
+++ b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
@@ -51,6 +51,9 @@ properties:
vcc-supply:
description: Our power supply.
+ power-domains:
+ maxItems: 1
+
# Needed if any child nodes are present.
"#address-cells":
const: 1
diff --git a/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.txt b/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.txt
deleted file mode 100644
index 7c70f2ad9942..000000000000
--- a/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.txt
+++ /dev/null
@@ -1,20 +0,0 @@
-* Freescale i.MX8MQ USB3 PHY binding
-
-Required properties:
-- compatible: Should be "fsl,imx8mq-usb-phy" or "fsl,imx8mp-usb-phy"
-- #phys-cells: must be 0 (see phy-bindings.txt in this directory)
-- reg: The base address and length of the registers
-- clocks: phandles to the clocks for each clock listed in clock-names
-- clock-names: must contain "phy"
-
-Optional properties:
-- vbus-supply: A phandle to the regulator for USB VBUS.
-
-Example:
- usb3_phy0: phy@381f0040 {
- compatible = "fsl,imx8mq-usb-phy";
- reg = <0x381f0040 0x40>;
- clocks = <&clk IMX8MQ_CLK_USB1_PHY_ROOT>;
- clock-names = "phy";
- #phy-cells = <0>;
- };
diff --git a/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml b/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml
new file mode 100644
index 000000000000..2936f3510a6a
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml
@@ -0,0 +1,53 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/fsl,imx8mq-usb-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX8MQ USB3 PHY binding
+
+maintainers:
+ - Li Jun <jun.li@nxp.com>
+
+properties:
+ compatible:
+ enum:
+ - fsl,imx8mq-usb-phy
+ - fsl,imx8mp-usb-phy
+
+ reg:
+ maxItems: 1
+
+ "#phy-cells":
+ const: 0
+
+ clocks:
+ maxItems: 1
+
+ clock-names:
+ items:
+ - const: phy
+
+ vbus-supply:
+ description:
+ A phandle to the regulator for USB VBUS.
+
+required:
+ - compatible
+ - reg
+ - "#phy-cells"
+ - clocks
+ - clock-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/imx8mq-clock.h>
+ usb3_phy0: phy@381f0040 {
+ compatible = "fsl,imx8mq-usb-phy";
+ reg = <0x381f0040 0x40>;
+ clocks = <&clk IMX8MQ_CLK_USB1_PHY_ROOT>;
+ clock-names = "phy";
+ #phy-cells = <0>;
+ };
diff --git a/Documentation/devicetree/bindings/phy/intel,phy-keembay-usb.yaml b/Documentation/devicetree/bindings/phy/intel,keembay-phy-usb.yaml
index a217bb8ac5bc..52815b6c2b88 100644
--- a/Documentation/devicetree/bindings/phy/intel,phy-keembay-usb.yaml
+++ b/Documentation/devicetree/bindings/phy/intel,keembay-phy-usb.yaml
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
-$id: http://devicetree.org/schemas/phy/intel,phy-keembay-usb.yaml#
+$id: http://devicetree.org/schemas/phy/intel,keembay-phy-usb.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Intel Keem Bay USB PHY bindings
diff --git a/Documentation/devicetree/bindings/phy/mediatek,tphy.yaml b/Documentation/devicetree/bindings/phy/mediatek,tphy.yaml
index ef9d9d4e6875..9e6c0f43f1c6 100644
--- a/Documentation/devicetree/bindings/phy/mediatek,tphy.yaml
+++ b/Documentation/devicetree/bindings/phy/mediatek,tphy.yaml
@@ -15,7 +15,7 @@ description: |
controllers on MediaTek SoCs, includes USB2.0, USB3.0, PCIe and SATA.
Layout differences of banks between T-PHY V1 (mt8173/mt2701) and
- T-PHY V2 (mt2712) when works on USB mode:
+ T-PHY V2 (mt2712) / V3 (mt8195) when works on USB mode:
-----------------------------------
Version 1:
port offset bank
@@ -34,7 +34,7 @@ description: |
u2 port2 0x1800 U2PHY_COM
...
- Version 2:
+ Version 2/3:
port offset bank
u2 port0 0x0000 MISC
0x0100 FMREG
@@ -59,7 +59,8 @@ description: |
SPLLC shared by u3 ports and FMREG shared by u2 ports on V1 are put back
into each port; a new bank MISC for u2 ports and CHIP for u3 ports are
- added on V2.
+ added on V2; the FMREG bank for slew rate calibration is not used anymore
+ and reserved on V3;
properties:
$nodename:
@@ -79,8 +80,11 @@ properties:
- mediatek,mt2712-tphy
- mediatek,mt7629-tphy
- mediatek,mt8183-tphy
- - mediatek,mt8195-tphy
- const: mediatek,generic-tphy-v2
+ - items:
+ - enum:
+ - mediatek,mt8195-tphy
+ - const: mediatek,generic-tphy-v3
- const: mediatek,mt2701-u3phy
deprecated: true
- const: mediatek,mt2712-u3phy
@@ -91,7 +95,7 @@ properties:
description:
Register shared by multiple ports, exclude port's private register.
It is needed for T-PHY V1, such as mt2701 and mt8173, but not for
- T-PHY V2, such as mt2712.
+ T-PHY V2/V3, such as mt2712.
maxItems: 1
"#address-cells":
@@ -197,6 +201,22 @@ patternProperties:
Specify the flag to enable BC1.2 if support it
type: boolean
+ mediatek,syscon-type:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ maxItems: 1
+ description:
+ A phandle to syscon used to access the register of type switch,
+ the field should always be 3 cells long.
+ items:
+ items:
+ - description:
+ The first cell represents a phandle to syscon
+ - description:
+ The second cell represents the register offset
+ - description:
+ The third cell represents the index of config segment
+ enum: [0, 1, 2, 3]
+
required:
- reg
- "#phy-cells"
diff --git a/Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml
index f0497b8623ad..75be5650a198 100644
--- a/Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml
@@ -18,6 +18,7 @@ properties:
compatible:
enum:
- qcom,ipq6018-qmp-pcie-phy
+ - qcom,ipq6018-qmp-usb3-phy
- qcom,ipq8074-qmp-pcie-phy
- qcom,ipq8074-qmp-usb3-phy
- qcom,msm8996-qmp-pcie-phy
@@ -27,6 +28,7 @@ properties:
- qcom,msm8998-qmp-ufs-phy
- qcom,msm8998-qmp-usb3-phy
- qcom,sc7180-qmp-usb3-phy
+ - qcom,sc8180x-qmp-pcie-phy
- qcom,sc8180x-qmp-ufs-phy
- qcom,sc8180x-qmp-usb3-phy
- qcom,sdm845-qhp-pcie-phy
@@ -34,6 +36,7 @@ properties:
- qcom,sdm845-qmp-ufs-phy
- qcom,sdm845-qmp-usb3-phy
- qcom,sdm845-qmp-usb3-uni-phy
+ - qcom,sm6115-qmp-ufs-phy
- qcom,sm8150-qmp-ufs-phy
- qcom,sm8150-qmp-usb3-phy
- qcom,sm8150-qmp-usb3-uni-phy
@@ -326,6 +329,7 @@ allOf:
compatible:
contains:
enum:
+ - qcom,sc8180x-qmp-pcie-phy
- qcom,sdm845-qhp-pcie-phy
- qcom,sdm845-qmp-pcie-phy
- qcom,sdx55-qmp-pcie-phy
diff --git a/Documentation/devicetree/bindings/phy/qcom,qmp-usb3-dp-phy.yaml b/Documentation/devicetree/bindings/phy/qcom,qmp-usb3-dp-phy.yaml
index 217aa6c91893..1d49cc3d4eae 100644
--- a/Documentation/devicetree/bindings/phy/qcom,qmp-usb3-dp-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/qcom,qmp-usb3-dp-phy.yaml
@@ -14,6 +14,7 @@ properties:
compatible:
enum:
- qcom,sc7180-qmp-usb3-dp-phy
+ - qcom,sc8180x-qmp-usb3-dp-phy
- qcom,sdm845-qmp-usb3-dp-phy
- qcom,sm8250-qmp-usb3-dp-phy
reg:
diff --git a/Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml b/Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml
index d5dc5a3cdceb..3a6e1165419c 100644
--- a/Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml
@@ -30,6 +30,11 @@ properties:
- renesas,usb2-phy-r8a77995 # R-Car D3
- const: renesas,rcar-gen3-usb2-phy
+ - items:
+ - enum:
+ - renesas,usb2-phy-r9a07g044 # RZ/G2{L,LC}
+ - const: renesas,rzg2l-usb2-phy # RZ/G2L family
+
reg:
maxItems: 1
@@ -91,6 +96,16 @@ required:
- clocks
- '#phy-cells'
+allOf:
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: renesas,rzg2l-usb2-phy
+ then:
+ required:
+ - resets
+
additionalProperties: false
examples:
diff --git a/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml b/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml
index 636cc501b54f..f6ed1a005e7a 100644
--- a/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/samsung,ufs-phy.yaml
@@ -16,6 +16,7 @@ properties:
compatible:
enum:
- samsung,exynos7-ufs-phy
+ - samsung,exynosautov9-ufs-phy
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt b/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt
deleted file mode 100644
index 64b286d2d398..000000000000
--- a/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.txt
+++ /dev/null
@@ -1,82 +0,0 @@
-TI AM654 SERDES
-
-Required properties:
- - compatible: Should be "ti,phy-am654-serdes"
- - reg : Address and length of the register set for the device.
- - #phy-cells: determine the number of cells that should be given in the
- phandle while referencing this phy. Should be "2". The 1st cell
- corresponds to the phy type (should be one of the types specified in
- include/dt-bindings/phy/phy.h) and the 2nd cell should be the serdes
- lane function.
- If SERDES0 is referenced 2nd cell should be:
- 0 - USB3
- 1 - PCIe0 Lane0
- 2 - ICSS2 SGMII Lane0
- If SERDES1 is referenced 2nd cell should be:
- 0 - PCIe1 Lane0
- 1 - PCIe0 Lane1
- 2 - ICSS2 SGMII Lane1
- - power-domains: As documented by the generic PM domain bindings in
- Documentation/devicetree/bindings/power/power_domain.txt.
- - clocks: List of clock-specifiers representing the input to the SERDES.
- Should have 3 items representing the left input clock, external
- reference clock and right input clock in that order.
- - clock-output-names: List of clock names for each of the clock outputs of
- SERDES. Should have 3 items for CMU reference clock,
- left output clock and right output clock in that order.
- - assigned-clocks: As defined in
- Documentation/devicetree/bindings/clock/clock-bindings.txt
- - assigned-clock-parents: As defined in
- Documentation/devicetree/bindings/clock/clock-bindings.txt
- - #clock-cells: Should be <1> to choose between the 3 output clocks.
- Defined in Documentation/devicetree/bindings/clock/clock-bindings.txt
-
- The following macros are defined in dt-bindings/phy/phy-am654-serdes.h
- for selecting the correct reference clock. This can be used while
- specifying the clocks created by SERDES.
- => AM654_SERDES_CMU_REFCLK
- => AM654_SERDES_LO_REFCLK
- => AM654_SERDES_RO_REFCLK
-
- - mux-controls: Phandle to the multiplexer that is used to select the lane
- function. See #phy-cells above to see the multiplex values.
-
-Example:
-
-Example for SERDES0 is given below. It has 3 clock inputs;
-left input reference clock as indicated by <&k3_clks 153 4>, external
-reference clock as indicated by <&k3_clks 153 1> and right input
-reference clock as indicated by <&serdes1 AM654_SERDES_LO_REFCLK>. (The
-right input of SERDES0 is connected to the left output of SERDES1).
-
-SERDES0 registers 3 clock outputs as indicated in clock-output-names. The
-first refers to the CMU reference clock, second refers to the left output
-reference clock and the third refers to the right output reference clock.
-
-The assigned-clocks and assigned-clock-parents is used here to set the
-parent of left input reference clock to MAINHSDIV_CLKOUT4 and parent of
-CMU reference clock to left input reference clock.
-
-serdes0: serdes@900000 {
- compatible = "ti,phy-am654-serdes";
- reg = <0x0 0x900000 0x0 0x2000>;
- reg-names = "serdes";
- #phy-cells = <2>;
- power-domains = <&k3_pds 153>;
- clocks = <&k3_clks 153 4>, <&k3_clks 153 1>,
- <&serdes1 AM654_SERDES_LO_REFCLK>;
- clock-output-names = "serdes0_cmu_refclk", "serdes0_lo_refclk",
- "serdes0_ro_refclk";
- assigned-clocks = <&k3_clks 153 4>, <&serdes0 AM654_SERDES_CMU_REFCLK>;
- assigned-clock-parents = <&k3_clks 153 8>, <&k3_clks 153 4>;
- ti,serdes-clk = <&serdes0_clk>;
- mux-controls = <&serdes_mux 0>;
- #clock-cells = <1>;
-};
-
-Example for PCIe consumer node using the SERDES PHY specifier is given below.
-&pcie0_rc {
- num-lanes = <2>;
- phys = <&serdes0 PHY_TYPE_PCIE 1>, <&serdes1 PHY_TYPE_PCIE 1>;
- phy-names = "pcie-phy0", "pcie-phy1";
-};
diff --git a/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.yaml b/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.yaml
new file mode 100644
index 000000000000..62dcb84c08aa
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/ti,phy-am654-serdes.yaml
@@ -0,0 +1,103 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/ti,phy-am654-serdes.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: TI AM654 SERDES binding
+
+description:
+ This binding describes the TI AM654 SERDES. AM654 SERDES can be configured
+ to be used with either PCIe or USB or SGMII.
+
+maintainers:
+ - Kishon Vijay Abraham I <kishon@ti.com>
+
+properties:
+ compatible:
+ enum:
+ - ti,phy-am654-serdes
+
+ reg:
+ maxItems: 1
+
+ reg-names:
+ items:
+ - const: serdes
+
+ power-domains:
+ maxItems: 1
+
+ clocks:
+ maxItems: 3
+ description:
+ Three input clocks referring to left input reference clock, refclk and right input reference
+ clock.
+
+ assigned-clocks:
+ $ref: "/schemas/types.yaml#/definitions/phandle-array"
+ assigned-clock-parents:
+ $ref: "/schemas/types.yaml#/definitions/phandle-array"
+
+ '#phy-cells':
+ const: 2
+ description:
+ The 1st cell corresponds to the phy type (should be one of the types specified in
+ include/dt-bindings/phy/phy.h) and the 2nd cell should be the serdes lane function.
+
+ ti,serdes-clk:
+ description: Phandle to the SYSCON entry required for configuring SERDES clock selection.
+ $ref: /schemas/types.yaml#/definitions/phandle
+
+ '#clock-cells':
+ const: 1
+
+ mux-controls:
+ maxItems: 1
+ description: Phandle to the SYSCON entry required for configuring SERDES lane function.
+
+ clock-output-names:
+ oneOf:
+ - description: Clock output names for SERDES 0
+ items:
+ - const: serdes0_cmu_refclk
+ - const: serdes0_lo_refclk
+ - const: serdes0_ro_refclk
+ - description: Clock output names for SERDES 1
+ items:
+ - const: serdes1_cmu_refclk
+ - const: serdes1_lo_refclk
+ - const: serdes1_ro_refclk
+
+required:
+ - compatible
+ - reg
+ - power-domains
+ - clocks
+ - assigned-clocks
+ - assigned-clock-parents
+ - ti,serdes-clk
+ - mux-controls
+ - clock-output-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/phy/phy-am654-serdes.h>
+
+ serdes0: serdes@900000 {
+ compatible = "ti,phy-am654-serdes";
+ reg = <0x900000 0x2000>;
+ reg-names = "serdes";
+ #phy-cells = <2>;
+ power-domains = <&k3_pds 153>;
+ clocks = <&k3_clks 153 4>, <&k3_clks 153 1>,
+ <&serdes1 AM654_SERDES_LO_REFCLK>;
+ clock-output-names = "serdes0_cmu_refclk", "serdes0_lo_refclk", "serdes0_ro_refclk";
+ assigned-clocks = <&k3_clks 153 4>, <&serdes0 AM654_SERDES_CMU_REFCLK>;
+ assigned-clock-parents = <&k3_clks 153 8>, <&k3_clks 153 4>;
+ ti,serdes-clk = <&serdes0_clk>;
+ mux-controls = <&serdes_mux 0>;
+ #clock-cells = <1>;
+ };
diff --git a/Documentation/devicetree/bindings/power/supply/battery.yaml b/Documentation/devicetree/bindings/power/supply/battery.yaml
index c3b4b7543591..d56ac484fec5 100644
--- a/Documentation/devicetree/bindings/power/supply/battery.yaml
+++ b/Documentation/devicetree/bindings/power/supply/battery.yaml
@@ -31,6 +31,20 @@ properties:
compatible:
const: simple-battery
+ device-chemistry:
+ description: This describes the chemical technology of the battery.
+ oneOf:
+ - const: nickel-cadmium
+ - const: nickel-metal-hydride
+ - const: lithium-ion
+ description: This is a blanket type for all lithium-ion batteries,
+ including those below. If possible, a precise compatible string
+ from below should be used, but sometimes it is unknown which specific
+ lithium ion battery is employed and this wide compatible can be used.
+ - const: lithium-ion-polymer
+ - const: lithium-ion-iron-phosphate
+ - const: lithium-ion-manganese-oxide
+
over-voltage-threshold-microvolt:
description: battery over-voltage limit
diff --git a/Documentation/devicetree/bindings/power/supply/maxim,max17042.yaml b/Documentation/devicetree/bindings/power/supply/maxim,max17042.yaml
index c70f05ea6d27..971b53c58cc6 100644
--- a/Documentation/devicetree/bindings/power/supply/maxim,max17042.yaml
+++ b/Documentation/devicetree/bindings/power/supply/maxim,max17042.yaml
@@ -19,12 +19,15 @@ properties:
- maxim,max17047
- maxim,max17050
- maxim,max17055
+ - maxim,max77849-battery
reg:
maxItems: 1
interrupts:
maxItems: 1
+ description: |
+ The ALRT pin, an open-drain interrupt.
maxim,rsns-microohm:
$ref: /schemas/types.yaml#/definitions/uint32
diff --git a/Documentation/devicetree/bindings/power/supply/mt6360_charger.yaml b/Documentation/devicetree/bindings/power/supply/mt6360_charger.yaml
new file mode 100644
index 000000000000..b89b15a5bfa4
--- /dev/null
+++ b/Documentation/devicetree/bindings/power/supply/mt6360_charger.yaml
@@ -0,0 +1,48 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/power/supply/mt6360_charger.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Battery charger driver for MT6360 PMIC from MediaTek Integrated.
+
+maintainers:
+ - Gene Chen <gene_chen@richtek.com>
+
+description: |
+ This module is part of the MT6360 MFD device.
+ Provides Battery Charger, Boost for OTG devices and BC1.2 detection.
+
+properties:
+ compatible:
+ const: mediatek,mt6360-chg
+
+ richtek,vinovp-microvolt:
+ description: Maximum CHGIN regulation voltage in uV.
+ enum: [ 5500000, 6500000, 11000000, 14500000 ]
+
+
+ usb-otg-vbus-regulator:
+ type: object
+ description: OTG boost regulator.
+ $ref: /schemas/regulator/regulator.yaml#
+
+required:
+ - compatible
+
+additionalProperties: false
+
+examples:
+ - |
+ mt6360_charger: charger {
+ compatible = "mediatek,mt6360-chg";
+ richtek,vinovp-microvolt = <14500000>;
+
+ otg_vbus_regulator: usb-otg-vbus-regulator {
+ regulator-compatible = "usb-otg-vbus";
+ regulator-name = "usb-otg-vbus";
+ regulator-min-microvolt = <4425000>;
+ regulator-max-microvolt = <5825000>;
+ };
+ };
+...
diff --git a/Documentation/devicetree/bindings/power/supply/summit,smb347-charger.yaml b/Documentation/devicetree/bindings/power/supply/summit,smb347-charger.yaml
index 983fc215c1e5..20862cdfc116 100644
--- a/Documentation/devicetree/bindings/power/supply/summit,smb347-charger.yaml
+++ b/Documentation/devicetree/bindings/power/supply/summit,smb347-charger.yaml
@@ -73,6 +73,26 @@ properties:
- 1 # SMB3XX_SOFT_TEMP_COMPENSATE_CURRENT Current compensation
- 2 # SMB3XX_SOFT_TEMP_COMPENSATE_VOLTAGE Voltage compensation
+ summit,inok-polarity:
+ description: |
+ Polarity of INOK signal indicating presence of external power supply.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum:
+ - 0 # SMB3XX_SYSOK_INOK_ACTIVE_LOW
+ - 1 # SMB3XX_SYSOK_INOK_ACTIVE_HIGH
+
+ usb-vbus:
+ $ref: "../../regulator/regulator.yaml#"
+ type: object
+
+ properties:
+ summit,needs-inok-toggle:
+ type: boolean
+ description: INOK signal is fixed and polarity needs to be toggled
+ in order to enable/disable output mode.
+
+ unevaluatedProperties: false
+
allOf:
- if:
properties:
@@ -134,6 +154,7 @@ examples:
reg = <0x7f>;
summit,enable-charge-control = <SMB3XX_CHG_ENABLE_PIN_ACTIVE_HIGH>;
+ summit,inok-polarity = <SMB3XX_SYSOK_INOK_ACTIVE_LOW>;
summit,chip-temperature-threshold-celsius = <110>;
summit,mains-current-limit-microamp = <2000000>;
summit,usb-current-limit-microamp = <500000>;
@@ -141,6 +162,15 @@ examples:
summit,enable-mains-charging;
monitored-battery = <&battery>;
+
+ usb-vbus {
+ regulator-name = "usb_vbus";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <5000000>;
+ regulator-min-microamp = <750000>;
+ regulator-max-microamp = <750000>;
+ summit,needs-inok-toggle;
+ };
};
};
diff --git a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-ac-power-supply.yaml b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-ac-power-supply.yaml
index dcda6660b8ed..de6a23aee977 100644
--- a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-ac-power-supply.yaml
+++ b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-ac-power-supply.yaml
@@ -21,10 +21,13 @@ allOf:
properties:
compatible:
- enum:
- - x-powers,axp202-ac-power-supply
- - x-powers,axp221-ac-power-supply
- - x-powers,axp813-ac-power-supply
+ oneOf:
+ - const: x-powers,axp202-ac-power-supply
+ - const: x-powers,axp221-ac-power-supply
+ - items:
+ - const: x-powers,axp803-ac-power-supply
+ - const: x-powers,axp813-ac-power-supply
+ - const: x-powers,axp813-ac-power-supply
required:
- compatible
diff --git a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-battery-power-supply.yaml b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-battery-power-supply.yaml
index 86e8a713d4e2..d055428ae39f 100644
--- a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-battery-power-supply.yaml
+++ b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-battery-power-supply.yaml
@@ -19,10 +19,14 @@ allOf:
properties:
compatible:
- enum:
- - x-powers,axp209-battery-power-supply
- - x-powers,axp221-battery-power-supply
- - x-powers,axp813-battery-power-supply
+ oneOf:
+ - const: x-powers,axp202-battery-power-supply
+ - const: x-powers,axp209-battery-power-supply
+ - const: x-powers,axp221-battery-power-supply
+ - items:
+ - const: x-powers,axp803-battery-power-supply
+ - const: x-powers,axp813-battery-power-supply
+ - const: x-powers,axp813-battery-power-supply
required:
- compatible
diff --git a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-usb-power-supply.yaml b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-usb-power-supply.yaml
index 61f1b320c157..0c371b55c9e1 100644
--- a/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-usb-power-supply.yaml
+++ b/Documentation/devicetree/bindings/power/supply/x-powers,axp20x-usb-power-supply.yaml
@@ -20,11 +20,15 @@ allOf:
properties:
compatible:
- enum:
- - x-powers,axp202-usb-power-supply
- - x-powers,axp221-usb-power-supply
- - x-powers,axp223-usb-power-supply
- - x-powers,axp813-usb-power-supply
+ oneOf:
+ - enum:
+ - x-powers,axp202-usb-power-supply
+ - x-powers,axp221-usb-power-supply
+ - x-powers,axp223-usb-power-supply
+ - x-powers,axp813-usb-power-supply
+ - items:
+ - const: x-powers,axp803-usb-power-supply
+ - const: x-powers,axp813-usb-power-supply
required:
diff --git a/Documentation/devicetree/bindings/regulator/richtek,rtq2134-regulator.yaml b/Documentation/devicetree/bindings/regulator/richtek,rtq2134-regulator.yaml
new file mode 100644
index 000000000000..3f47e8e6c4fd
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/richtek,rtq2134-regulator.yaml
@@ -0,0 +1,106 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/richtek,rtq2134-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Richtek RTQ2134 SubPMIC Regulator
+
+maintainers:
+ - ChiYuan Huang <cy_huang@richtek.com>
+
+description: |
+ The RTQ2134 is a multi-phase, programmable power management IC that
+ integrates with four high efficient, synchronous step-down converter cores.
+
+ Datasheet is available at
+ https://www.richtek.com/assets/product_file/RTQ2134-QA/DSQ2134-QA-01.pdf
+
+properties:
+ compatible:
+ enum:
+ - richtek,rtq2134
+
+ reg:
+ maxItems: 1
+
+ regulators:
+ type: object
+
+ patternProperties:
+ "^buck[1-3]$":
+ type: object
+ $ref: regulator.yaml#
+ description: |
+ regulator description for buck[1-3].
+
+ properties:
+ richtek,use-vsel-dvs:
+ type: boolean
+ description: |
+ If specified, buck will listen to 'vsel' pin for dvs config.
+ Else, use dvs0 voltage by default.
+
+ richtek,uv-shutdown:
+ type: boolean
+ description: |
+ If specified, use shutdown as UV action. Else, hiccup by default.
+
+ unevaluatedProperties: false
+
+ additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - regulators
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rtq2134@18 {
+ compatible = "richtek,rtq2134";
+ reg = <0x18>;
+
+ regulators {
+ buck1 {
+ regulator-name = "rtq2134-buck1";
+ regulator-min-microvolt = <300000>;
+ regulator-max-microvolt = <1850000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <550000>;
+ regulator-suspend-max-microvolt = <550000>;
+ };
+ };
+ buck2 {
+ regulator-name = "rtq2134-buck2";
+ regulator-min-microvolt = <1120000>;
+ regulator-max-microvolt = <1120000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <1120000>;
+ regulator-suspend-max-microvolt = <1120000>;
+ };
+ };
+ buck3 {
+ regulator-name = "rtq2134-buck3";
+ regulator-min-microvolt = <600000>;
+ regulator-max-microvolt = <600000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <600000>;
+ regulator-suspend-max-microvolt = <600000>;
+ };
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/regulator/richtek,rtq6752-regulator.yaml b/Documentation/devicetree/bindings/regulator/richtek,rtq6752-regulator.yaml
new file mode 100644
index 000000000000..e6e5a9a7d940
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/richtek,rtq6752-regulator.yaml
@@ -0,0 +1,76 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/richtek,rtq6752-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Richtek RTQ6752 TFT LCD Voltage Regulator
+
+maintainers:
+ - ChiYuan Huang <cy_huang@richtek.com>
+
+description: |
+ The RTQ6752 is an I2C interface pgorammable power management IC. It includes
+ two synchronous boost converter for PAVDD, and one synchronous NAVDD
+ buck-boost. The device is suitable for automotive TFT-LCD panel.
+
+properties:
+ compatible:
+ enum:
+ - richtek,rtq6752
+
+ reg:
+ maxItems: 1
+
+ enable-gpios:
+ description: |
+ A connection of the chip 'enable' gpio line. If not provided, treat it as
+ external pull up.
+ maxItems: 1
+
+ regulators:
+ type: object
+
+ patternProperties:
+ "^(p|n)avdd$":
+ type: object
+ $ref: regulator.yaml#
+ description: |
+ regulator description for pavdd and navdd.
+
+ additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - regulators
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rtq6752@6b {
+ compatible = "richtek,rtq6752";
+ reg = <0x6b>;
+ enable-gpios = <&gpio26 2 0>;
+
+ regulators {
+ pavdd {
+ regulator-name = "rtq6752-pavdd";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <7300000>;
+ regulator-boot-on;
+ };
+ navdd {
+ regulator-name = "rtq6752-navdd";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <7300000>;
+ regulator-boot-on;
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/regulator/socionext,uniphier-regulator.yaml b/Documentation/devicetree/bindings/regulator/socionext,uniphier-regulator.yaml
new file mode 100644
index 000000000000..861d5f3c79e8
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/socionext,uniphier-regulator.yaml
@@ -0,0 +1,85 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/socionext,uniphier-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Socionext UniPhier regulator controller
+
+description: |
+ This regulator controls VBUS and belongs to USB3 glue layer. Before using
+ the regulator, it is necessary to control the clocks and resets to enable
+ this layer. These clocks and resets should be described in each property.
+
+maintainers:
+ - Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+
+allOf:
+ - $ref: "regulator.yaml#"
+
+# USB3 Controller
+
+properties:
+ compatible:
+ enum:
+ - socionext,uniphier-pro4-usb3-regulator
+ - socionext,uniphier-pro5-usb3-regulator
+ - socionext,uniphier-pxs2-usb3-regulator
+ - socionext,uniphier-ld20-usb3-regulator
+ - socionext,uniphier-pxs3-usb3-regulator
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ minItems: 1
+ maxItems: 2
+
+ clock-names:
+ oneOf:
+ - items: # for Pro4, Pro5
+ - const: gio
+ - const: link
+ - items: # for others
+ - const: link
+
+ resets:
+ minItems: 1
+ maxItems: 2
+
+ reset-names:
+ oneOf:
+ - items: # for Pro4, Pro5
+ - const: gio
+ - const: link
+ - items:
+ - const: link
+
+additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - clocks
+ - clock-names
+ - resets
+ - reset-names
+
+examples:
+ - |
+ usb-glue@65b00000 {
+ compatible = "simple-mfd";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0x65b00000 0x400>;
+
+ usb_vbus0: regulators@100 {
+ compatible = "socionext,uniphier-ld20-usb3-regulator";
+ reg = <0x100 0x10>;
+ clock-names = "link";
+ clocks = <&sys_clk 14>;
+ reset-names = "link";
+ resets = <&sys_rst 14>;
+ };
+ };
+
diff --git a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt b/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
deleted file mode 100644
index 94fd38b0d163..000000000000
--- a/Documentation/devicetree/bindings/regulator/uniphier-regulator.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-Socionext UniPhier Regulator Controller
-
-This describes the devicetree bindings for regulator controller implemented
-on Socionext UniPhier SoCs.
-
-USB3 Controller
----------------
-
-This regulator controls VBUS and belongs to USB3 glue layer. Before using
-the regulator, it is necessary to control the clocks and resets to enable
-this layer. These clocks and resets should be described in each property.
-
-Required properties:
-- compatible: Should be
- "socionext,uniphier-pro4-usb3-regulator" - for Pro4 SoC
- "socionext,uniphier-pro5-usb3-regulator" - for Pro5 SoC
- "socionext,uniphier-pxs2-usb3-regulator" - for PXs2 SoC
- "socionext,uniphier-ld20-usb3-regulator" - for LD20 SoC
- "socionext,uniphier-pxs3-usb3-regulator" - for PXs3 SoC
-- reg: Specifies offset and length of the register set for the device.
-- clocks: A list of phandles to the clock gate for USB3 glue layer.
- According to the clock-names, appropriate clocks are required.
-- clock-names: Should contain
- "gio", "link" - for Pro4 and Pro5 SoCs
- "link" - for others
-- resets: A list of phandles to the reset control for USB3 glue layer.
- According to the reset-names, appropriate resets are required.
-- reset-names: Should contain
- "gio", "link" - for Pro4 and Pro5 SoCs
- "link" - for others
-
-See Documentation/devicetree/bindings/regulator/regulator.txt
-for more details about the regulator properties.
-
-Example:
-
- usb-glue@65b00000 {
- compatible = "socionext,uniphier-ld20-dwc3-glue",
- "simple-mfd";
- #address-cells = <1>;
- #size-cells = <1>;
- ranges = <0 0x65b00000 0x400>;
-
- usb_vbus0: regulators@100 {
- compatible = "socionext,uniphier-ld20-usb3-regulator";
- reg = <0x100 0x10>;
- clock-names = "link";
- clocks = <&sys_clk 14>;
- reset-names = "link";
- resets = <&sys_rst 14>;
- };
-
- phy {
- ...
- phy-supply = <&usb_vbus0>;
- };
- ...
- };
diff --git a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
index 1d38ff76d18f..2b1f91603897 100644
--- a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
+++ b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
@@ -24,10 +24,10 @@ allOf:
select:
properties:
compatible:
- items:
- - enum:
- - sifive,fu540-c000-ccache
- - sifive,fu740-c000-ccache
+ contains:
+ enum:
+ - sifive,fu540-c000-ccache
+ - sifive,fu740-c000-ccache
required:
- compatible
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.txt b/Documentation/devicetree/bindings/spi/omap-spi.txt
deleted file mode 100644
index 487208c256c0..000000000000
--- a/Documentation/devicetree/bindings/spi/omap-spi.txt
+++ /dev/null
@@ -1,48 +0,0 @@
-OMAP2+ McSPI device
-
-Required properties:
-- compatible :
- - "ti,am654-mcspi" for AM654.
- - "ti,omap2-mcspi" for OMAP2 & OMAP3.
- - "ti,omap4-mcspi" for OMAP4+.
-- ti,spi-num-cs : Number of chipselect supported by the instance.
-- ti,hwmods: Name of the hwmod associated to the McSPI
-- ti,pindir-d0-out-d1-in: Select the D0 pin as output and D1 as
- input. The default is D0 as input and
- D1 as output.
-
-Optional properties:
-- dmas: List of DMA specifiers with the controller specific format
- as described in the generic DMA client binding. A tx and rx
- specifier is required for each chip select.
-- dma-names: List of DMA request names. These strings correspond
- 1:1 with the DMA specifiers listed in dmas. The string naming
- is to be "rxN" and "txN" for RX and TX requests,
- respectively, where N equals the chip select number.
-
-Examples:
-
-[hwmod populated DMA resources]
-
-mcspi1: mcspi@1 {
- #address-cells = <1>;
- #size-cells = <0>;
- compatible = "ti,omap4-mcspi";
- ti,hwmods = "mcspi1";
- ti,spi-num-cs = <4>;
-};
-
-[generic DMA request binding]
-
-mcspi1: mcspi@1 {
- #address-cells = <1>;
- #size-cells = <0>;
- compatible = "ti,omap4-mcspi";
- ti,hwmods = "mcspi1";
- ti,spi-num-cs = <2>;
- dmas = <&edma 42
- &edma 43
- &edma 44
- &edma 45>;
- dma-names = "tx0", "rx0", "tx1", "rx1";
-};
diff --git a/Documentation/devicetree/bindings/spi/omap-spi.yaml b/Documentation/devicetree/bindings/spi/omap-spi.yaml
new file mode 100644
index 000000000000..e55538186cf6
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/omap-spi.yaml
@@ -0,0 +1,117 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/omap-spi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SPI controller bindings for OMAP and K3 SoCs
+
+maintainers:
+ - Aswath Govindraju <a-govindraju@ti.com>
+
+allOf:
+ - $ref: spi-controller.yaml#
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - ti,am654-mcspi
+ - ti,am4372-mcspi
+ - const: ti,omap4-mcspi
+ - items:
+ - enum:
+ - ti,omap2-mcspi
+ - ti,omap4-mcspi
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+ power-domains:
+ maxItems: 1
+
+ ti,spi-num-cs:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: Number of chipselect supported by the instance.
+ minimum: 1
+ maximum: 4
+
+ ti,hwmods:
+ $ref: /schemas/types.yaml#/definitions/string
+ description:
+ Must be "mcspi<n>", n being the instance number (1-based).
+ This property is applicable only on legacy platforms mainly omap2/3
+ and ti81xx and should not be used on other platforms.
+ deprecated: true
+
+ ti,pindir-d0-out-d1-in:
+ description:
+ Select the D0 pin as output and D1 as input. The default is D0
+ as input and D1 as output.
+ type: boolean
+
+ dmas:
+ description:
+ List of DMA specifiers with the controller specific format as
+ described in the generic DMA client binding. A tx and rx
+ specifier is required for each chip select.
+ minItems: 1
+ maxItems: 8
+
+ dma-names:
+ description:
+ List of DMA request names. These strings correspond 1:1 with
+ the DMA sepecifiers listed in dmas. The string names is to be
+ "rxN" and "txN" for RX and TX requests, respectively. Where N
+ is the chip select number.
+ minItems: 1
+ maxItems: 8
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+unevaluatedProperties: false
+
+if:
+ properties:
+ compatible:
+ oneOf:
+ - const: ti,omap2-mcspi
+ - const: ti,omap4-mcspi
+
+then:
+ properties:
+ ti,hwmods:
+ items:
+ - pattern: "^mcspi([1-9])$"
+
+else:
+ properties:
+ ti,hwmods: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/soc/ti,sci_pm_domain.h>
+
+ spi@2100000 {
+ compatible = "ti,am654-mcspi","ti,omap4-mcspi";
+ reg = <0x2100000 0x400>;
+ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&k3_clks 137 1>;
+ power-domains = <&k3_pds 137 TI_SCI_PD_EXCLUSIVE>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ dmas = <&main_udmap 0xc500>, <&main_udmap 0x4500>;
+ dma-names = "tx0", "rx0";
+ };
diff --git a/Documentation/devicetree/bindings/spi/rockchip-sfc.yaml b/Documentation/devicetree/bindings/spi/rockchip-sfc.yaml
new file mode 100644
index 000000000000..339fb39529f3
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/rockchip-sfc.yaml
@@ -0,0 +1,91 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/rockchip-sfc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Rockchip Serial Flash Controller (SFC)
+
+maintainers:
+ - Heiko Stuebner <heiko@sntech.de>
+ - Chris Morgan <macromorgan@hotmail.com>
+
+allOf:
+ - $ref: spi-controller.yaml#
+
+properties:
+ compatible:
+ const: rockchip,sfc
+ description:
+ The rockchip sfc controller is a standalone IP with version register,
+ and the driver can handle all the feature difference inside the IP
+ depending on the version register.
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: Bus Clock
+ - description: Module Clock
+
+ clock-names:
+ items:
+ - const: clk_sfc
+ - const: hclk_sfc
+
+ power-domains:
+ maxItems: 1
+
+ rockchip,sfc-no-dma:
+ description: Disable DMA and utilize FIFO mode only
+ type: boolean
+
+patternProperties:
+ "^flash@[0-3]$":
+ type: object
+ properties:
+ reg:
+ minimum: 0
+ maximum: 3
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/px30-cru.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/power/px30-power.h>
+
+ sfc: spi@ff3a0000 {
+ compatible = "rockchip,sfc";
+ reg = <0xff3a0000 0x4000>;
+ interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&cru SCLK_SFC>, <&cru HCLK_SFC>;
+ clock-names = "clk_sfc", "hclk_sfc";
+ pinctrl-0 = <&sfc_clk &sfc_cs &sfc_bus2>;
+ pinctrl-names = "default";
+ power-domains = <&power PX30_PD_MMC_NAND>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ flash@0 {
+ compatible = "jedec,spi-nor";
+ reg = <0>;
+ spi-max-frequency = <108000000>;
+ spi-rx-bus-width = <2>;
+ spi-tx-bus-width = <2>;
+ };
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/spi/spi-mt65xx.txt b/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
index 4d0e4c15c4ea..2a24969159cc 100644
--- a/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
+++ b/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
@@ -11,6 +11,7 @@ Required properties:
- mediatek,mt8135-spi: for mt8135 platforms
- mediatek,mt8173-spi: for mt8173 platforms
- mediatek,mt8183-spi: for mt8183 platforms
+ - mediatek,mt6893-spi: for mt6893 platforms
- "mediatek,mt8192-spi", "mediatek,mt6765-spi": for mt8192 platforms
- "mediatek,mt8195-spi", "mediatek,mt6765-spi": for mt8195 platforms
- "mediatek,mt8516-spi", "mediatek,mt2712-spi": for mt8516 platforms
diff --git a/Documentation/devicetree/bindings/spi/spi-sprd-adi.txt b/Documentation/devicetree/bindings/spi/spi-sprd-adi.txt
deleted file mode 100644
index 2567c829e2dc..000000000000
--- a/Documentation/devicetree/bindings/spi/spi-sprd-adi.txt
+++ /dev/null
@@ -1,63 +0,0 @@
-Spreadtrum ADI controller
-
-ADI is the abbreviation of Anolog-Digital interface, which is used to access
-analog chip (such as PMIC) from digital chip. ADI controller follows the SPI
-framework for its hardware implementation is alike to SPI bus and its timing
-is compatile to SPI timing.
-
-ADI controller has 50 channels including 2 software read/write channels and
-48 hardware channels to access analog chip. For 2 software read/write channels,
-users should set ADI registers to access analog chip. For hardware channels,
-we can configure them to allow other hardware components to use it independently,
-which means we can just link one analog chip address to one hardware channel,
-then users can access the mapped analog chip address by this hardware channel
-triggered by hardware components instead of ADI software channels.
-
-Thus we introduce one property named "sprd,hw-channels" to configure hardware
-channels, the first value specifies the hardware channel id which is used to
-transfer data triggered by hardware automatically, and the second value specifies
-the analog chip address where user want to access by hardware components.
-
-Since we have multi-subsystems will use unique ADI to access analog chip, when
-one system is reading/writing data by ADI software channels, that should be under
-one hardware spinlock protection to prevent other systems from reading/writing
-data by ADI software channels at the same time, or two parallel routine of setting
-ADI registers will make ADI controller registers chaos to lead incorrect results.
-Then we need one hardware spinlock to synchronize between the multiple subsystems.
-
-The new version ADI controller supplies multiple master channels for different
-subsystem accessing, that means no need to add hardware spinlock to synchronize,
-thus change the hardware spinlock support to be optional to keep backward
-compatibility.
-
-Required properties:
-- compatible: Should be "sprd,sc9860-adi".
-- reg: Offset and length of ADI-SPI controller register space.
-- #address-cells: Number of cells required to define a chip select address
- on the ADI-SPI bus. Should be set to 1.
-- #size-cells: Size of cells required to define a chip select address size
- on the ADI-SPI bus. Should be set to 0.
-
-Optional properties:
-- hwlocks: Reference to a phandle of a hwlock provider node.
-- hwlock-names: Reference to hwlock name strings defined in the same order
- as the hwlocks, should be "adi".
-- sprd,hw-channels: This is an array of channel values up to 49 channels.
- The first value specifies the hardware channel id which is used to
- transfer data triggered by hardware automatically, and the second
- value specifies the analog chip address where user want to access
- by hardware components.
-
-SPI slave nodes must be children of the SPI controller node and can contain
-properties described in Documentation/devicetree/bindings/spi/spi-bus.txt.
-
-Example:
- adi_bus: spi@40030000 {
- compatible = "sprd,sc9860-adi";
- reg = <0 0x40030000 0 0x10000>;
- hwlocks = <&hwlock1 0>;
- hwlock-names = "adi";
- #address-cells = <1>;
- #size-cells = <0>;
- sprd,hw-channels = <30 0x8c20>;
- };
diff --git a/Documentation/devicetree/bindings/spi/sprd,spi-adi.yaml b/Documentation/devicetree/bindings/spi/sprd,spi-adi.yaml
new file mode 100644
index 000000000000..fe014020da69
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/sprd,spi-adi.yaml
@@ -0,0 +1,104 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/spi/sprd,spi-adi.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Spreadtrum ADI controller
+
+maintainers:
+ - Orson Zhai <orsonzhai@gmail.com>
+ - Baolin Wang <baolin.wang7@gmail.com>
+ - Chunyan Zhang <zhang.lyra@gmail.com>
+
+description: |
+ ADI is the abbreviation of Anolog-Digital interface, which is used to access
+ analog chip (such as PMIC) from digital chip. ADI controller follows the SPI
+ framework for its hardware implementation is alike to SPI bus and its timing
+ is compatile to SPI timing.
+
+ ADI controller has 50 channels including 2 software read/write channels and
+ 48 hardware channels to access analog chip. For 2 software read/write channels,
+ users should set ADI registers to access analog chip. For hardware channels,
+ we can configure them to allow other hardware components to use it independently,
+ which means we can just link one analog chip address to one hardware channel,
+ then users can access the mapped analog chip address by this hardware channel
+ triggered by hardware components instead of ADI software channels.
+
+ Thus we introduce one property named "sprd,hw-channels" to configure hardware
+ channels, the first value specifies the hardware channel id which is used to
+ transfer data triggered by hardware automatically, and the second value specifies
+ the analog chip address where user want to access by hardware components.
+
+ Since we have multi-subsystems will use unique ADI to access analog chip, when
+ one system is reading/writing data by ADI software channels, that should be under
+ one hardware spinlock protection to prevent other systems from reading/writing
+ data by ADI software channels at the same time, or two parallel routine of setting
+ ADI registers will make ADI controller registers chaos to lead incorrect results.
+ Then we need one hardware spinlock to synchronize between the multiple subsystems.
+
+ The new version ADI controller supplies multiple master channels for different
+ subsystem accessing, that means no need to add hardware spinlock to synchronize,
+ thus change the hardware spinlock support to be optional to keep backward
+ compatibility.
+
+allOf:
+ - $ref: /spi/spi-controller.yaml#
+
+properties:
+ compatible:
+ enum:
+ - sprd,sc9860-adi
+ - sprd,sc9863-adi
+ - sprd,ums512-adi
+
+ reg:
+ maxItems: 1
+
+ hwlocks:
+ maxItems: 1
+
+ hwlock-names:
+ const: adi
+
+ sprd,hw-channels:
+ $ref: /schemas/types.yaml#/definitions/uint32-matrix
+ description: A list of hardware channels
+ minItems: 1
+ maxItems: 48
+ items:
+ items:
+ - description: The hardware channel id which is used to transfer data
+ triggered by hardware automatically, channel id 0-1 are for software
+ use, 2-49 are hardware channels.
+ minimum: 2
+ maximum: 49
+ - description: The analog chip address where user want to access by
+ hardware components.
+
+required:
+ - compatible
+ - reg
+ - '#address-cells'
+ - '#size-cells'
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ aon {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ adi_bus: spi@40030000 {
+ compatible = "sprd,sc9860-adi";
+ reg = <0 0x40030000 0 0x10000>;
+ hwlocks = <&hwlock1 0>;
+ hwlock-names = "adi";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ sprd,hw-channels = <30 0x8c20>;
+ };
+ };
+...
diff --git a/Documentation/devicetree/bindings/timer/rockchip,rk-timer.txt b/Documentation/devicetree/bindings/timer/rockchip,rk-timer.txt
deleted file mode 100644
index d65fdce7c7f0..000000000000
--- a/Documentation/devicetree/bindings/timer/rockchip,rk-timer.txt
+++ /dev/null
@@ -1,27 +0,0 @@
-Rockchip rk timer
-
-Required properties:
-- compatible: should be:
- "rockchip,rv1108-timer", "rockchip,rk3288-timer": for Rockchip RV1108
- "rockchip,rk3036-timer", "rockchip,rk3288-timer": for Rockchip RK3036
- "rockchip,rk3066-timer", "rockchip,rk3288-timer": for Rockchip RK3066
- "rockchip,rk3188-timer", "rockchip,rk3288-timer": for Rockchip RK3188
- "rockchip,rk3228-timer", "rockchip,rk3288-timer": for Rockchip RK3228
- "rockchip,rk3229-timer", "rockchip,rk3288-timer": for Rockchip RK3229
- "rockchip,rk3288-timer": for Rockchip RK3288
- "rockchip,rk3368-timer", "rockchip,rk3288-timer": for Rockchip RK3368
- "rockchip,rk3399-timer": for Rockchip RK3399
-- reg: base address of the timer register starting with TIMERS CONTROL register
-- interrupts: should contain the interrupts for Timer0
-- clocks : must contain an entry for each entry in clock-names
-- clock-names : must include the following entries:
- "timer", "pclk"
-
-Example:
- timer: timer@ff810000 {
- compatible = "rockchip,rk3288-timer";
- reg = <0xff810000 0x20>;
- interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&xin24m>, <&cru PCLK_TIMER>;
- clock-names = "timer", "pclk";
- };
diff --git a/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml b/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml
new file mode 100644
index 000000000000..e26ecb5893ae
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/rockchip,rk-timer.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Rockchip Timer Device Tree Bindings
+
+maintainers:
+ - Daniel Lezcano <daniel.lezcano@linaro.org>
+
+properties:
+ compatible:
+ oneOf:
+ - const: rockchip,rk3288-timer
+ - const: rockchip,rk3399-timer
+ - items:
+ - enum:
+ - rockchip,rv1108-timer
+ - rockchip,rk3036-timer
+ - rockchip,rk3066-timer
+ - rockchip,rk3188-timer
+ - rockchip,rk3228-timer
+ - rockchip,rk3229-timer
+ - rockchip,rk3288-timer
+ - rockchip,rk3368-timer
+ - rockchip,px30-timer
+ - const: rockchip,rk3288-timer
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ minItems: 2
+ maxItems: 2
+
+ clock-names:
+ items:
+ - const: pclk
+ - const: timer
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/clock/rk3288-cru.h>
+
+ timer: timer@ff810000 {
+ compatible = "rockchip,rk3288-timer";
+ reg = <0xff810000 0x20>;
+ interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&cru PCLK_TIMER>, <&xin24m>;
+ clock-names = "pclk", "timer";
+ };
diff --git a/Documentation/driver-api/fpga/fpga-bridge.rst b/Documentation/driver-api/fpga/fpga-bridge.rst
index 198aadafd3e7..8d650b4e2ce6 100644
--- a/Documentation/driver-api/fpga/fpga-bridge.rst
+++ b/Documentation/driver-api/fpga/fpga-bridge.rst
@@ -4,11 +4,11 @@ FPGA Bridge
API to implement a new FPGA bridge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* struct fpga_bridge — The FPGA Bridge structure
-* struct fpga_bridge_ops — Low level Bridge driver ops
-* devm_fpga_bridge_create() — Allocate and init a bridge struct
-* fpga_bridge_register() — Register a bridge
-* fpga_bridge_unregister() — Unregister a bridge
+* struct fpga_bridge - The FPGA Bridge structure
+* struct fpga_bridge_ops - Low level Bridge driver ops
+* devm_fpga_bridge_create() - Allocate and init a bridge struct
+* fpga_bridge_register() - Register a bridge
+* fpga_bridge_unregister() - Unregister a bridge
.. kernel-doc:: include/linux/fpga/fpga-bridge.h
:functions: fpga_bridge
diff --git a/Documentation/driver-api/fpga/fpga-mgr.rst b/Documentation/driver-api/fpga/fpga-mgr.rst
index 917ee22db429..4d926b452cb3 100644
--- a/Documentation/driver-api/fpga/fpga-mgr.rst
+++ b/Documentation/driver-api/fpga/fpga-mgr.rst
@@ -101,12 +101,12 @@ in state.
API for implementing a new FPGA Manager driver
----------------------------------------------
-* ``fpga_mgr_states`` — Values for :c:expr:`fpga_manager->state`.
-* struct fpga_manager — the FPGA manager struct
-* struct fpga_manager_ops — Low level FPGA manager driver ops
-* devm_fpga_mgr_create() — Allocate and init a manager struct
-* fpga_mgr_register() — Register an FPGA manager
-* fpga_mgr_unregister() — Unregister an FPGA manager
+* ``fpga_mgr_states`` - Values for :c:expr:`fpga_manager->state`.
+* struct fpga_manager - the FPGA manager struct
+* struct fpga_manager_ops - Low level FPGA manager driver ops
+* devm_fpga_mgr_create() - Allocate and init a manager struct
+* fpga_mgr_register() - Register an FPGA manager
+* fpga_mgr_unregister() - Unregister an FPGA manager
.. kernel-doc:: include/linux/fpga/fpga-mgr.h
:functions: fpga_mgr_states
diff --git a/Documentation/driver-api/fpga/fpga-programming.rst b/Documentation/driver-api/fpga/fpga-programming.rst
index 002392dab04f..fb4da4240e96 100644
--- a/Documentation/driver-api/fpga/fpga-programming.rst
+++ b/Documentation/driver-api/fpga/fpga-programming.rst
@@ -84,10 +84,10 @@ will generate that list. Here's some sample code of what to do next::
API for programming an FPGA
---------------------------
-* fpga_region_program_fpga() — Program an FPGA
-* fpga_image_info() — Specifies what FPGA image to program
-* fpga_image_info_alloc() — Allocate an FPGA image info struct
-* fpga_image_info_free() — Free an FPGA image info struct
+* fpga_region_program_fpga() - Program an FPGA
+* fpga_image_info() - Specifies what FPGA image to program
+* fpga_image_info_alloc() - Allocate an FPGA image info struct
+* fpga_image_info_free() - Free an FPGA image info struct
.. kernel-doc:: drivers/fpga/fpga-region.c
:functions: fpga_region_program_fpga
diff --git a/Documentation/driver-api/fpga/fpga-region.rst b/Documentation/driver-api/fpga/fpga-region.rst
index 363a8171ab0a..2636a27c11b2 100644
--- a/Documentation/driver-api/fpga/fpga-region.rst
+++ b/Documentation/driver-api/fpga/fpga-region.rst
@@ -45,19 +45,19 @@ An example of usage can be seen in the probe function of [#f2]_.
API to add a new FPGA region
----------------------------
-* struct fpga_region — The FPGA region struct
-* devm_fpga_region_create() — Allocate and init a region struct
-* fpga_region_register() — Register an FPGA region
-* fpga_region_unregister() — Unregister an FPGA region
+* struct fpga_region - The FPGA region struct
+* devm_fpga_region_create() - Allocate and init a region struct
+* fpga_region_register() - Register an FPGA region
+* fpga_region_unregister() - Unregister an FPGA region
The FPGA region's probe function will need to get a reference to the FPGA
Manager it will be using to do the programming. This usually would happen
during the region's probe function.
-* fpga_mgr_get() — Get a reference to an FPGA manager, raise ref count
-* of_fpga_mgr_get() — Get a reference to an FPGA manager, raise ref count,
+* fpga_mgr_get() - Get a reference to an FPGA manager, raise ref count
+* of_fpga_mgr_get() - Get a reference to an FPGA manager, raise ref count,
given a device node.
-* fpga_mgr_put() — Put an FPGA manager
+* fpga_mgr_put() - Put an FPGA manager
The FPGA region will need to specify which bridges to control while programming
the FPGA. The region driver can build a list of bridges during probe time
@@ -66,11 +66,11 @@ the list of bridges to program just before programming
(:c:expr:`fpga_region->get_bridges`). The FPGA bridge framework supplies the
following APIs to handle building or tearing down that list.
-* fpga_bridge_get_to_list() — Get a ref of an FPGA bridge, add it to a
+* fpga_bridge_get_to_list() - Get a ref of an FPGA bridge, add it to a
list
-* of_fpga_bridge_get_to_list() — Get a ref of an FPGA bridge, add it to a
+* of_fpga_bridge_get_to_list() - Get a ref of an FPGA bridge, add it to a
list, given a device node
-* fpga_bridges_put() — Given a list of bridges, put them
+* fpga_bridges_put() - Given a list of bridges, put them
.. kernel-doc:: include/linux/fpga/fpga-region.h
:functions: fpga_region
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index f5a3207aa7fa..c57c609ad2eb 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -85,7 +85,6 @@ available subsections can be seen below.
io-mapping
io_ordering
generic-counter
- lightnvm-pblk
memory-devices/index
men-chameleon-bus
ntb
diff --git a/Documentation/driver-api/lightnvm-pblk.rst b/Documentation/driver-api/lightnvm-pblk.rst
deleted file mode 100644
index 1040ed1cec81..000000000000
--- a/Documentation/driver-api/lightnvm-pblk.rst
+++ /dev/null
@@ -1,21 +0,0 @@
-pblk: Physical Block Device Target
-==================================
-
-pblk implements a fully associative, host-based FTL that exposes a traditional
-block I/O interface. Its primary responsibilities are:
-
- - Map logical addresses onto physical addresses (4KB granularity) in a
- logical-to-physical (L2P) table.
- - Maintain the integrity and consistency of the L2P table as well as its
- recovery from normal tear down and power outage.
- - Deal with controller- and media-specific constrains.
- - Handle I/O errors.
- - Implement garbage collection.
- - Maintain consistency across the I/O stack during synchronization points.
-
-For more information please refer to:
-
- http://lightnvm.io
-
-which maintains updated FAQs, manual pages, technical documentation, tools,
-contacts, etc.
diff --git a/Documentation/driver-api/nfc/nfc-hci.rst b/Documentation/driver-api/nfc/nfc-hci.rst
index eb8a1a14e919..f10fe53aa9fe 100644
--- a/Documentation/driver-api/nfc/nfc-hci.rst
+++ b/Documentation/driver-api/nfc/nfc-hci.rst
@@ -181,7 +181,7 @@ xmit_from_hci():
The llc must be registered with nfc before it can be used. Do that by
calling::
- nfc_llc_register(const char *name, struct nfc_llc_ops *ops);
+ nfc_llc_register(const char *name, const struct nfc_llc_ops *ops);
Again, note that the llc does not handle the physical link. It is thus very
easy to mix any physical link with any llc for a given chip driver.
diff --git a/Documentation/fault-injection/fault-injection.rst b/Documentation/fault-injection/fault-injection.rst
index f47d05ed0d94..4a25c5eb6f07 100644
--- a/Documentation/fault-injection/fault-injection.rst
+++ b/Documentation/fault-injection/fault-injection.rst
@@ -24,6 +24,10 @@ Available fault injection capabilities
injects futex deadlock and uaddr fault errors.
+- fail_sunrpc
+
+ injects kernel RPC client and server failures.
+
- fail_make_request
injects disk IO errors on devices permitted by setting
@@ -151,6 +155,20 @@ configuration of fault-injection capabilities.
default is 'N', setting it to 'Y' will disable failure injections
when dealing with private (address space) futexes.
+- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:
+
+ Format: { 'Y' | 'N' }
+
+ default is 'N', setting it to 'Y' will disable disconnect
+ injection on the RPC client.
+
+- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:
+
+ Format: { 'Y' | 'N' }
+
+ default is 'N', setting it to 'Y' will disable disconnect
+ injection on the RPC server.
+
- /sys/kernel/debug/fail_function/inject:
Format: { 'function-name' | '!function-name' | '' }
diff --git a/Documentation/fault-injection/provoke-crashes.rst b/Documentation/fault-injection/provoke-crashes.rst
index a20ba5d93932..3abe84225613 100644
--- a/Documentation/fault-injection/provoke-crashes.rst
+++ b/Documentation/fault-injection/provoke-crashes.rst
@@ -29,8 +29,7 @@ recur_count
cpoint_name
Where in the kernel to trigger the action. It can be
one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
- FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_DISPATCH_CMD,
- IDE_CORE_CP, or DIRECT
+ FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_QUEUE_RQ, or DIRECT.
cpoint_type
Indicates the action to be taken on hitting the crash point.
diff --git a/Documentation/filesystems/cifs/index.rst b/Documentation/filesystems/cifs/index.rst
new file mode 100644
index 000000000000..1c8597a679ab
--- /dev/null
+++ b/Documentation/filesystems/cifs/index.rst
@@ -0,0 +1,10 @@
+===============================
+CIFS
+===============================
+
+
+.. toctree::
+ :maxdepth: 1
+
+ ksmbd
+ cifsroot
diff --git a/Documentation/filesystems/cifs/ksmbd.rst b/Documentation/filesystems/cifs/ksmbd.rst
new file mode 100644
index 000000000000..a1326157d53f
--- /dev/null
+++ b/Documentation/filesystems/cifs/ksmbd.rst
@@ -0,0 +1,165 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+KSMBD - SMB3 Kernel Server
+==========================
+
+KSMBD is a linux kernel server which implements SMB3 protocol in kernel space
+for sharing files over network.
+
+KSMBD architecture
+==================
+
+The subset of performance related operations belong in kernelspace and
+the other subset which belong to operations which are not really related with
+performance in userspace. So, DCE/RPC management that has historically resulted
+into number of buffer overflow issues and dangerous security bugs and user
+account management are implemented in user space as ksmbd.mountd.
+File operations that are related with performance (open/read/write/close etc.)
+in kernel space (ksmbd). This also allows for easier integration with VFS
+interface for all file operations.
+
+ksmbd (kernel daemon)
+---------------------
+
+When the server daemon is started, It starts up a forker thread
+(ksmbd/interface name) at initialization time and open a dedicated port 445
+for listening to SMB requests. Whenever new clients make request, Forker
+thread will accept the client connection and fork a new thread for dedicated
+communication channel between the client and the server. It allows for parallel
+processing of SMB requests(commands) from clients as well as allowing for new
+clients to make new connections. Each instance is named ksmbd/1~n(port number)
+to indicate connected clients. Depending on the SMB request types, each new
+thread can decide to pass through the commands to the user space (ksmbd.mountd),
+currently DCE/RPC commands are identified to be handled through the user space.
+To further utilize the linux kernel, it has been chosen to process the commands
+as workitems and to be executed in the handlers of the ksmbd-io kworker threads.
+It allows for multiplexing of the handlers as the kernel take care of initiating
+extra worker threads if the load is increased and vice versa, if the load is
+decreased it destroys the extra worker threads. So, after connection is
+established with client. Dedicated ksmbd/1..n(port number) takes complete
+ownership of receiving/parsing of SMB commands. Each received command is worked
+in parallel i.e., There can be multiple clients commands which are worked in
+parallel. After receiving each command a separated kernel workitem is prepared
+for each command which is further queued to be handled by ksmbd-io kworkers.
+So, each SMB workitem is queued to the kworkers. This allows the benefit of load
+sharing to be managed optimally by the default kernel and optimizing client
+performance by handling client commands in parallel.
+
+ksmbd.mountd (user space daemon)
+--------------------------------
+
+ksmbd.mountd is userspace process to, transfer user account and password that
+are registered using ksmbd.adduser(part of utils for user space). Further it
+allows sharing information parameters that parsed from smb.conf to ksmbd in
+kernel. For the execution part it has a daemon which is continuously running
+and connected to the kernel interface using netlink socket, it waits for the
+requests(dcerpc and share/user info). It handles RPC calls (at a minimum few
+dozen) that are most important for file server from NetShareEnum and
+NetServerGetInfo. Complete DCE/RPC response is prepared from the user space
+and passed over to the associated kernel thread for the client.
+
+
+KSMBD Feature Status
+====================
+
+============================== =================================================
+Feature name Status
+============================== =================================================
+Dialects Supported. SMB2.1 SMB3.0, SMB3.1.1 dialects
+ (intentionally excludes security vulnerable SMB1
+ dialect).
+Auto Negotiation Supported.
+Compound Request Supported.
+Oplock Cache Mechanism Supported.
+SMB2 leases(v1 lease) Supported.
+Directory leases(v2 lease) Planned for future.
+Multi-credits Supported.
+NTLM/NTLMv2 Supported.
+HMAC-SHA256 Signing Supported.
+Secure negotiate Supported.
+Signing Update Supported.
+Pre-authentication integrity Supported.
+SMB3 encryption(CCM, GCM) Supported. (CCM and GCM128 supported, GCM256 in
+ progress)
+SMB direct(RDMA) Partially Supported. SMB3 Multi-channel is
+ required to connect to Windows client.
+SMB3 Multi-channel Partially Supported. Planned to implement
+ replay/retry mechanisms for future.
+SMB3.1.1 POSIX extension Supported.
+ACLs Partially Supported. only DACLs available, SACLs
+ (auditing) is planned for the future. For
+ ownership (SIDs) ksmbd generates random subauth
+ values(then store it to disk) and use uid/gid
+ get from inode as RID for local domain SID.
+ The current acl implementation is limited to
+ standalone server, not a domain member.
+ Integration with Samba tools is being worked on
+ to allow future support for running as a domain
+ member.
+Kerberos Supported.
+Durable handle v1,v2 Planned for future.
+Persistent handle Planned for future.
+SMB2 notify Planned for future.
+Sparse file support Supported.
+DCE/RPC support Partially Supported. a few calls(NetShareEnumAll,
+ NetServerGetInfo, SAMR, LSARPC) that are needed
+ for file server handled via netlink interface
+ from ksmbd.mountd. Additional integration with
+ Samba tools and libraries via upcall is being
+ investigated to allow support for additional
+ DCE/RPC management calls (and future support
+ for Witness protocol e.g.)
+ksmbd/nfsd interoperability Planned for future. The features that ksmbd
+ support are Leases, Notify, ACLs and Share modes.
+============================== =================================================
+
+
+How to run
+==========
+
+1. Download ksmbd-tools and compile them.
+ - https://github.com/cifsd-team/ksmbd-tools
+
+2. Create user/password for SMB share.
+
+ # mkdir /etc/ksmbd/
+ # ksmbd.adduser -a <Enter USERNAME for SMB share access>
+
+3. Create /etc/ksmbd/smb.conf file, add SMB share in smb.conf file
+ - Refer smb.conf.example and
+ https://github.com/cifsd-team/ksmbd-tools/blob/master/Documentation/configuration.txt
+
+4. Insert ksmbd.ko module
+
+ # insmod ksmbd.ko
+
+5. Start ksmbd user space daemon
+ # ksmbd.mountd
+
+6. Access share from Windows or Linux using CIFS
+
+Shutdown KSMBD
+==============
+
+1. kill user and kernel space daemon
+ # sudo ksmbd.control -s
+
+How to turn debug print on
+==========================
+
+Each layer
+/sys/class/ksmbd-control/debug
+
+1. Enable all component prints
+ # sudo ksmbd.control -d "all"
+
+2. Enable one of components(smb, auth, vfs, oplock, ipc, conn, rdma)
+ # sudo ksmbd.control -d "smb"
+
+3. Show what prints are enable.
+ # cat/sys/class/ksmbd-control/debug
+ [smb] auth vfs oplock ipc conn [rdma]
+
+4. Disable prints:
+ If you try the selected component once more, It is disabled without brackets.
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 44b67ebd6e40..0eb799d9d05a 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1063,11 +1063,6 @@ astute users may notice some differences in behavior:
- DAX (Direct Access) is not supported on encrypted files.
-- The st_size of an encrypted symlink will not necessarily give the
- length of the symlink target as required by POSIX. It will actually
- give the length of the ciphertext, which will be slightly longer
- than the plaintext due to NUL-padding and an extra 2-byte overhead.
-
- The maximum length of an encrypted symlink is 2 bytes shorter than
the maximum length of an unencrypted symlink. For example, on an
EXT4 filesystem with a 4K block size, unencrypted symlinks can be up
@@ -1235,12 +1230,12 @@ the user-supplied name to get the ciphertext.
Lookups without the key are more complicated. The raw ciphertext may
contain the ``\0`` and ``/`` characters, which are illegal in
-filenames. Therefore, readdir() must base64-encode the ciphertext for
-presentation. For most filenames, this works fine; on ->lookup(), the
-filesystem just base64-decodes the user-supplied name to get back to
-the raw ciphertext.
+filenames. Therefore, readdir() must base64url-encode the ciphertext
+for presentation. For most filenames, this works fine; on ->lookup(),
+the filesystem just base64url-decodes the user-supplied name to get
+back to the raw ciphertext.
-However, for very long filenames, base64 encoding would cause the
+However, for very long filenames, base64url encoding would cause the
filename length to exceed NAME_MAX. To prevent this, readdir()
actually presents long filenames in an abbreviated form which encodes
a strong "hash" of the ciphertext filename, along with the optional
diff --git a/Documentation/filesystems/idmappings.rst b/Documentation/filesystems/idmappings.rst
new file mode 100644
index 000000000000..1229a75ec75d
--- /dev/null
+++ b/Documentation/filesystems/idmappings.rst
@@ -0,0 +1,1026 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Idmappings
+==========
+
+Most filesystem developers will have encountered idmappings. They are used when
+reading from or writing ownership to disk, reporting ownership to userspace, or
+for permission checking. This document is aimed at filesystem developers that
+want to know how idmappings work.
+
+Formal notes
+------------
+
+An idmapping is essentially a translation of a range of ids into another or the
+same range of ids. The notational convention for idmappings that is widely used
+in userspace is::
+
+ u:k:r
+
+``u`` indicates the first element in the upper idmapset ``U`` and ``k``
+indicates the first element in the lower idmapset ``K``. The ``r`` parameter
+indicates the range of the idmapping, i.e. how many ids are mapped. From now
+on, we will always prefix ids with ``u`` or ``k`` to make it clear whether
+we're talking about an id in the upper or lower idmapset.
+
+To see what this looks like in practice, let's take the following idmapping::
+
+ u22:k10000:r3
+
+and write down the mappings it will generate::
+
+ u22 -> k10000
+ u23 -> k10001
+ u24 -> k10002
+
+From a mathematical viewpoint ``U`` and ``K`` are well-ordered sets and an
+idmapping is an order isomorphism from ``U`` into ``K``. So ``U`` and ``K`` are
+order isomorphic. In fact, ``U`` and ``K`` are always well-ordered subsets of
+the set of all possible ids useable on a given system.
+
+Looking at this mathematically briefly will help us highlight some properties
+that make it easier to understand how we can translate between idmappings. For
+example, we know that the inverse idmapping is an order isomorphism as well::
+
+ k10000 -> u22
+ k10001 -> u23
+ k10002 -> u24
+
+Given that we are dealing with order isomorphisms plus the fact that we're
+dealing with subsets we can embedd idmappings into each other, i.e. we can
+sensibly translate between different idmappings. For example, assume we've been
+given the three idmappings::
+
+ 1. u0:k10000:r10000
+ 2. u0:k20000:r10000
+ 3. u0:k30000:r10000
+
+and id ``k11000`` which has been generated by the first idmapping by mapping
+``u1000`` from the upper idmapset down to ``k11000`` in the lower idmapset.
+
+Because we're dealing with order isomorphic subsets it is meaningful to ask
+what id ``k11000`` corresponds to in the second or third idmapping. The
+straightfoward algorithm to use is to apply the inverse of the first idmapping,
+mapping ``k11000`` up to ``u1000``. Afterwards, we can map ``u1000`` down using
+either the second idmapping mapping or third idmapping mapping. The second
+idmapping would map ``u1000`` down to ``21000``. The third idmapping would map
+``u1000`` down to ``u31000``.
+
+If we were given the same task for the following three idmappings::
+
+ 1. u0:k10000:r10000
+ 2. u0:k20000:r200
+ 3. u0:k30000:r300
+
+we would fail to translate as the sets aren't order isomorphic over the full
+range of the first idmapping anymore (However they are order isomorphic over
+the full range of the second idmapping.). Neither the second or third idmapping
+contain ``u1000`` in the upper idmapset ``U``. This is equivalent to not having
+an id mapped. We can simply say that ``u1000`` is unmapped in the second and
+third idmapping. The kernel will report unmapped ids as the overflowuid
+``(uid_t)-1`` or overflowgid ``(gid_t)-1`` to userspace.
+
+The algorithm to calculate what a given id maps to is pretty simple. First, we
+need to verify that the range can contain our target id. We will skip this step
+for simplicity. After that if we want to know what ``id`` maps to we can do
+simple calculations:
+
+- If we want to map from left to right::
+
+ u:k:r
+ id - u + k = n
+
+- If we want to map from right to left::
+
+ u:k:r
+ id - k + u = n
+
+Instead of "left to right" we can also say "down" and instead of "right to
+left" we can also say "up". Obviously mapping down and up invert each other.
+
+To see whether the simple formulas above work, consider the following two
+idmappings::
+
+ 1. u0:k20000:r10000
+ 2. u500:k30000:r10000
+
+Assume we are given ``k21000`` in the lower idmapset of the first idmapping. We
+want to know what id this was mapped from in the upper idmapset of the first
+idmapping. So we're mapping up in the first idmapping::
+
+ id - k + u = n
+ k21000 - k20000 + u0 = u1000
+
+Now assume we are given the id ``u1100`` in the upper idmapset of the second
+idmapping and we want to know what this id maps down to in the lower idmapset
+of the second idmapping. This means we're mapping down in the second
+idmapping::
+
+ id - u + k = n
+ u1100 - u500 + k30000 = k30600
+
+General notes
+-------------
+
+In the context of the kernel an idmapping can be interpreted as mapping a range
+of userspace ids into a range of kernel ids::
+
+ userspace-id:kernel-id:range
+
+A userspace id is always an element in the upper idmapset of an idmapping of
+type ``uid_t`` or ``gid_t`` and a kernel id is always an element in the lower
+idmapset of an idmapping of type ``kuid_t`` or ``kgid_t``. From now on
+"userspace id" will be used to refer to the well known ``uid_t`` and ``gid_t``
+types and "kernel id" will be used to refer to ``kuid_t`` and ``kgid_t``.
+
+The kernel is mostly concerned with kernel ids. They are used when performing
+permission checks and are stored in an inode's ``i_uid`` and ``i_gid`` field.
+A userspace id on the other hand is an id that is reported to userspace by the
+kernel, or is passed by userspace to the kernel, or a raw device id that is
+written or read from disk.
+
+Note that we are only concerned with idmappings as the kernel stores them not
+how userspace would specify them.
+
+For the rest of this document we will prefix all userspace ids with ``u`` and
+all kernel ids with ``k``. Ranges of idmappings will be prefixed with ``r``. So
+an idmapping will be written as ``u0:k10000:r10000``.
+
+For example, the id ``u1000`` is an id in the upper idmapset or "userspace
+idmapset" starting with ``u1000``. And it is mapped to ``k11000`` which is a
+kernel id in the lower idmapset or "kernel idmapset" starting with ``k10000``.
+
+A kernel id is always created by an idmapping. Such idmappings are associated
+with user namespaces. Since we mainly care about how idmappings work we're not
+going to be concerned with how idmappings are created nor how they are used
+outside of the filesystem context. This is best left to an explanation of user
+namespaces.
+
+The initial user namespace is special. It always has an idmapping of the
+following form::
+
+ u0:k0:r4294967295
+
+which is an identity idmapping over the full range of ids available on this
+system.
+
+Other user namespaces usually have non-identity idmappings such as::
+
+ u0:k10000:r10000
+
+When a process creates or wants to change ownership of a file, or when the
+ownership of a file is read from disk by a filesystem, the userspace id is
+immediately translated into a kernel id according to the idmapping associated
+with the relevant user namespace.
+
+For instance, consider a file that is stored on disk by a filesystem as being
+owned by ``u1000``:
+
+- If a filesystem were to be mounted in the initial user namespaces (as most
+ filesystems are) then the initial idmapping will be used. As we saw this is
+ simply the identity idmapping. This would mean id ``u1000`` read from disk
+ would be mapped to id ``k1000``. So an inode's ``i_uid`` and ``i_gid`` field
+ would contain ``k1000``.
+
+- If a filesystem were to be mounted with an idmapping of ``u0:k10000:r10000``
+ then ``u1000`` read from disk would be mapped to ``k11000``. So an inode's
+ ``i_uid`` and ``i_gid`` would contain ``k11000``.
+
+Translation algorithms
+----------------------
+
+We've already seen briefly that it is possible to translate between different
+idmappings. We'll now take a closer look how that works.
+
+Crossmapping
+~~~~~~~~~~~~
+
+This translation algorithm is used by the kernel in quite a few places. For
+example, it is used when reporting back the ownership of a file to userspace
+via the ``stat()`` system call family.
+
+If we've been given ``k11000`` from one idmapping we can map that id up in
+another idmapping. In order for this to work both idmappings need to contain
+the same kernel id in their kernel idmapsets. For example, consider the
+following idmappings::
+
+ 1. u0:k10000:r10000
+ 2. u20000:k10000:r10000
+
+and we are mapping ``u1000`` down to ``k11000`` in the first idmapping . We can
+then translate ``k11000`` into a userspace id in the second idmapping using the
+kernel idmapset of the second idmapping::
+
+ /* Map the kernel id up into a userspace id in the second idmapping. */
+ from_kuid(u20000:k10000:r10000, k11000) = u21000
+
+Note, how we can get back to the kernel id in the first idmapping by inverting
+the algorithm::
+
+ /* Map the userspace id down into a kernel id in the second idmapping. */
+ make_kuid(u20000:k10000:r10000, u21000) = k11000
+
+ /* Map the kernel id up into a userspace id in the first idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+This algorithm allows us to answer the question what userspace id a given
+kernel id corresponds to in a given idmapping. In order to be able to answer
+this question both idmappings need to contain the same kernel id in their
+respective kernel idmapsets.
+
+For example, when the kernel reads a raw userspace id from disk it maps it down
+into a kernel id according to the idmapping associated with the filesystem.
+Let's assume the filesystem was mounted with an idmapping of
+``u0:k20000:r10000`` and it reads a file owned by ``u1000`` from disk. This
+means ``u1000`` will be mapped to ``k21000`` which is what will be stored in
+the inode's ``i_uid`` and ``i_gid`` field.
+
+When someone in userspace calls ``stat()`` or a related function to get
+ownership information about the file the kernel can't simply map the id back up
+according to the filesystem's idmapping as this would give the wrong owner if
+the caller is using an idmapping.
+
+So the kernel will map the id back up in the idmapping of the caller. Let's
+assume the caller has the slighly unconventional idmapping
+``u3000:k20000:r10000`` then ``k21000`` would map back up to ``u4000``.
+Consequently the user would see that this file is owned by ``u4000``.
+
+Remapping
+~~~~~~~~~
+
+It is possible to translate a kernel id from one idmapping to another one via
+the userspace idmapset of the two idmappings. This is equivalent to remapping
+a kernel id.
+
+Let's look at an example. We are given the following two idmappings::
+
+ 1. u0:k10000:r10000
+ 2. u0:k20000:r10000
+
+and we are given ``k11000`` in the first idmapping. In order to translate this
+kernel id in the first idmapping into a kernel id in the second idmapping we
+need to perform two steps:
+
+1. Map the kernel id up into a userspace id in the first idmapping::
+
+ /* Map the kernel id up into a userspace id in the first idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+2. Map the userspace id down into a kernel id in the second idmapping::
+
+ /* Map the userspace id down into a kernel id in the second idmapping. */
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+As you can see we used the userspace idmapset in both idmappings to translate
+the kernel id in one idmapping to a kernel id in another idmapping.
+
+This allows us to answer the question what kernel id we would need to use to
+get the same userspace id in another idmapping. In order to be able to answer
+this question both idmappings need to contain the same userspace id in their
+respective userspace idmapsets.
+
+Note, how we can easily get back to the kernel id in the first idmapping by
+inverting the algorithm:
+
+1. Map the kernel id up into a userspace id in the second idmapping::
+
+ /* Map the kernel id up into a userspace id in the second idmapping. */
+ from_kuid(u0:k20000:r10000, k21000) = u1000
+
+2. Map the userspace id down into a kernel id in the first idmapping::
+
+ /* Map the userspace id down into a kernel id in the first idmapping. */
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+Another way to look at this translation is to treat it as inverting one
+idmapping and applying another idmapping if both idmappings have the relevant
+userspace id mapped. This will come in handy when working with idmapped mounts.
+
+Invalid translations
+~~~~~~~~~~~~~~~~~~~~
+
+It is never valid to use an id in the kernel idmapset of one idmapping as the
+id in the userspace idmapset of another or the same idmapping. While the kernel
+idmapset always indicates an idmapset in the kernel id space the userspace
+idmapset indicates a userspace id. So the following translations are forbidden::
+
+ /* Map the userspace id down into a kernel id in the first idmapping. */
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+ /* INVALID: Map the kernel id down into a kernel id in the second idmapping. */
+ make_kuid(u10000:k20000:r10000, k110000) = k21000
+ ~~~~~~~
+
+and equally wrong::
+
+ /* Map the kernel id up into a userspace id in the first idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+ /* INVALID: Map the userspace id up into a userspace id in the second idmapping. */
+ from_kuid(u20000:k0:r10000, u1000) = k21000
+ ~~~~~
+
+Idmappings when creating filesystem objects
+-------------------------------------------
+
+The concepts of mapping an id down or mapping an id up are expressed in the two
+kernel functions filesystem developers are rather familiar with and which we've
+already used in this document::
+
+ /* Map the userspace id down into a kernel id. */
+ make_kuid(idmapping, uid)
+
+ /* Map the kernel id up into a userspace id. */
+ from_kuid(idmapping, kuid)
+
+We will take an abbreviated look into how idmappings figure into creating
+filesystem objects. For simplicity we will only look at what happens when the
+VFS has already completed path lookup right before it calls into the filesystem
+itself. So we're concerned with what happens when e.g. ``vfs_mkdir()`` is
+called. We will also assume that the directory we're creating filesystem
+objects in is readable and writable for everyone.
+
+When creating a filesystem object the caller will look at the caller's
+filesystem ids. These are just regular ``uid_t`` and ``gid_t`` userspace ids
+but they are exclusively used when determining file ownership which is why they
+are called "filesystem ids". They are usually identical to the uid and gid of
+the caller but can differ. We will just assume they are always identical to not
+get lost in too many details.
+
+When the caller enters the kernel two things happen:
+
+1. Map the caller's userspace ids down into kernel ids in the caller's
+ idmapping.
+ (To be precise, the kernel will simply look at the kernel ids stashed in the
+ credentials of the current task but for our education we'll pretend this
+ translation happens just in time.)
+2. Verify that the caller's kernel ids can be mapped up to userspace ids in the
+ filesystem's idmapping.
+
+The second step is important as regular filesystem will ultimately need to map
+the kernel id back up into a userspace id when writing to disk.
+So with the second step the kernel guarantees that a valid userspace id can be
+written to disk. If it can't the kernel will refuse the creation request to not
+even remotely risk filesystem corruption.
+
+The astute reader will have realized that this is simply a varation of the
+crossmapping algorithm we mentioned above in a previous section. First, the
+kernel maps the caller's userspace id down into a kernel id according to the
+caller's idmapping and then maps that kernel id up according to the
+filesystem's idmapping.
+
+Example 1
+~~~~~~~~~
+
+::
+
+ caller id: u1000
+ caller idmapping: u0:k0:r4294967295
+ filesystem idmapping: u0:k0:r4294967295
+
+Both the caller and the filesystem use the identity idmapping:
+
+1. Map the caller's userspace ids into kernel ids in the caller's idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Verify that the caller's kernel ids can be mapped to userspace ids in the
+ filesystem's idmapping.
+
+ For this second step the kernel will call the function
+ ``fsuidgid_has_mapping()`` which ultimately boils down to calling
+ ``from_kuid()``::
+
+ from_kuid(u0:k0:r4294967295, k1000) = u1000
+
+In this example both idmappings are the same so there's nothing exciting going
+on. Ultimately the userspace id that lands on disk will be ``u1000``.
+
+Example 2
+~~~~~~~~~
+
+::
+
+ caller id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k20000:r10000
+
+1. Map the caller's userspace ids down into kernel ids in the caller's
+ idmapping::
+
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+2. Verify that the caller's kernel ids can be mapped up to userspace ids in the
+ filesystem's idmapping::
+
+ from_kuid(u0:k20000:r10000, k11000) = u-1
+
+It's immediately clear that while the caller's userspace id could be
+successfully mapped down into kernel ids in the caller's idmapping the kernel
+ids could not be mapped up according to the filesystem's idmapping. So the
+kernel will deny this creation request.
+
+Note that while this example is less common, because most filesystem can't be
+mounted with non-initial idmappings this is a general problem as we can see in
+the next examples.
+
+Example 3
+~~~~~~~~~
+
+::
+
+ caller id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k0:r4294967295
+
+1. Map the caller's userspace ids down into kernel ids in the caller's
+ idmapping::
+
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+2. Verify that the caller's kernel ids can be mapped up to userspace ids in the
+ filesystem's idmapping::
+
+ from_kuid(u0:k0:r4294967295, k11000) = u11000
+
+We can see that the translation always succeeds. The userspace id that the
+filesystem will ultimately put to disk will always be identical to the value of
+the kernel id that was created in the caller's idmapping. This has mainly two
+consequences.
+
+First, that we can't allow a caller to ultimately write to disk with another
+userspace id. We could only do this if we were to mount the whole fileystem
+with the caller's or another idmapping. But that solution is limited to a few
+filesystems and not very flexible. But this is a use-case that is pretty
+important in containerized workloads.
+
+Second, the caller will usually not be able to create any files or access
+directories that have stricter permissions because none of the filesystem's
+kernel ids map up into valid userspace ids in the caller's idmapping
+
+1. Map raw userspace ids down to kernel ids in the filesystem's idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Map kernel ids up to userspace ids in the caller's idmapping::
+
+ from_kuid(u0:k10000:r10000, k1000) = u-1
+
+Example 4
+~~~~~~~~~
+
+::
+
+ file id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k0:r4294967295
+
+In order to report ownership to userspace the kernel uses the crossmapping
+algorithm introduced in a previous section:
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k10000:r10000, k1000) = u-1
+
+The crossmapping algorithm fails in this case because the kernel id in the
+filesystem idmapping cannot be mapped up to a userspace id in the caller's
+idmapping. Thus, the kernel will report the ownership of this file as the
+overflowid.
+
+Example 5
+~~~~~~~~~
+
+::
+
+ file id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k20000:r10000
+
+In order to report ownership to userspace the kernel uses the crossmapping
+algorithm introduced in a previous section:
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+2. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k10000:r10000, k21000) = u-1
+
+Again, the crossmapping algorithm fails in this case because the kernel id in
+the filesystem idmapping cannot be mapped to a userspace id in the caller's
+idmapping. Thus, the kernel will report the ownership of this file as the
+overflowid.
+
+Note how in the last two examples things would be simple if the caller would be
+using the initial idmapping. For a filesystem mounted with the initial
+idmapping it would be trivial. So we only consider a filesystem with an
+idmapping of ``u0:k20000:r10000``:
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+2. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k0:r4294967295, k21000) = u21000
+
+Idmappings on idmapped mounts
+-----------------------------
+
+The examples we've seen in the previous section where the caller's idmapping
+and the filesystem's idmapping are incompatible causes various issues for
+workloads. For a more complex but common example, consider two containers
+started on the host. To completely prevent the two containers from affecting
+each other, an administrator may often use different non-overlapping idmappings
+for the two containers::
+
+ container1 idmapping: u0:k10000:r10000
+ container2 idmapping: u0:k20000:r10000
+ filesystem idmapping: u0:k30000:r10000
+
+An administrator wanting to provide easy read-write access to the following set
+of files::
+
+ dir id: u0
+ dir/file1 id: u1000
+ dir/file2 id: u2000
+
+to both containers currently can't.
+
+Of course the administrator has the option to recursively change ownership via
+``chown()``. For example, they could change ownership so that ``dir`` and all
+files below it can be crossmapped from the filesystem's into the container's
+idmapping. Let's assume they change ownership so it is compatible with the
+first container's idmapping::
+
+ dir id: u10000
+ dir/file1 id: u11000
+ dir/file2 id: u12000
+
+This would still leave ``dir`` rather useless to the second container. In fact,
+``dir`` and all files below it would continue to appear owned by the overflowid
+for the second container.
+
+Or consider another increasingly popular example. Some service managers such as
+systemd implement a concept called "portable home directories". A user may want
+to use their home directories on different machines where they are assigned
+different login userspace ids. Most users will have ``u1000`` as the login id
+on their machine at home and all files in their home directory will usually be
+owned by ``u1000``. At uni or at work they may have another login id such as
+``u1125``. This makes it rather difficult to interact with their home directory
+on their work machine.
+
+In both cases changing ownership recursively has grave implications. The most
+obvious one is that ownership is changed globally and permanently. In the home
+directory case this change in ownership would even need to happen everytime the
+user switches from their home to their work machine. For really large sets of
+files this becomes increasingly costly.
+
+If the user is lucky, they are dealing with a filesystem that is mountable
+inside user namespaces. But this would also change ownership globally and the
+change in ownership is tied to the lifetime of the filesystem mount, i.e. the
+superblock. The only way to change ownership is to completely unmount the
+filesystem and mount it again in another user namespace. This is usually
+impossible because it would mean that all users currently accessing the
+filesystem can't anymore. And it means that ``dir`` still can't be shared
+between two containers with different idmappings.
+But usually the user doesn't even have this option since most filesystems
+aren't mountable inside containers. And not having them mountable might be
+desirable as it doesn't require the filesystem to deal with malicious
+filesystem images.
+
+But the usecases mentioned above and more can be handled by idmapped mounts.
+They allow to expose the same set of dentries with different ownership at
+different mounts. This is achieved by marking the mounts with a user namespace
+through the ``mount_setattr()`` system call. The idmapping associated with it
+is then used to translate from the caller's idmapping to the filesystem's
+idmapping and vica versa using the remapping algorithm we introduced above.
+
+Idmapped mounts make it possible to change ownership in a temporary and
+localized way. The ownership changes are restricted to a specific mount and the
+ownership changes are tied to the lifetime of the mount. All other users and
+locations where the filesystem is exposed are unaffected.
+
+Filesystems that support idmapped mounts don't have any real reason to support
+being mountable inside user namespaces. A filesystem could be exposed
+completely under an idmapped mount to get the same effect. This has the
+advantage that filesystems can leave the creation of the superblock to
+privileged users in the initial user namespace.
+
+However, it is perfectly possible to combine idmapped mounts with filesystems
+mountable inside user namespaces. We will touch on this further below.
+
+Remapping helpers
+~~~~~~~~~~~~~~~~~
+
+Idmapping functions were added that translate between idmappings. They make use
+of the remapping algorithm we've introduced earlier. We're going to look at
+two:
+
+- ``i_uid_into_mnt()`` and ``i_gid_into_mnt()``
+
+ The ``i_*id_into_mnt()`` functions translate filesystem's kernel ids into
+ kernel ids in the mount's idmapping::
+
+ /* Map the filesystem's kernel id up into a userspace id in the filesystem's idmapping. */
+ from_kuid(filesystem, kid) = uid
+
+ /* Map the filesystem's userspace id down ito a kernel id in the mount's idmapping. */
+ make_kuid(mount, uid) = kuid
+
+- ``mapped_fsuid()`` and ``mapped_fsgid()``
+
+ The ``mapped_fs*id()`` functions translate the caller's kernel ids into
+ kernel ids in the filesystem's idmapping. This translation is achieved by
+ remapping the caller's kernel ids using the mount's idmapping::
+
+ /* Map the caller's kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(mount, kid) = uid
+
+ /* Map the mount's userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(filesystem, uid) = kuid
+
+Note that these two functions invert each other. Consider the following
+idmappings::
+
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k20000:r10000
+ mount idmapping: u0:k10000:r10000
+
+Assume a file owned by ``u1000`` is read from disk. The filesystem maps this id
+to ``k21000`` according to it's idmapping. This is what is stored in the
+inode's ``i_uid`` and ``i_gid`` fields.
+
+When the caller queries the ownership of this file via ``stat()`` the kernel
+would usually simply use the crossmapping algorithm and map the filesystem's
+kernel id up to a userspace id in the caller's idmapping.
+
+But when the caller is accessing the file on an idmapped mount the kernel will
+first call ``i_uid_into_mnt()`` thereby translating the filesystem's kernel id
+into a kernel id in the mount's idmapping::
+
+ i_uid_into_mnt(k21000):
+ /* Map the filesystem's kernel id up into a userspace id. */
+ from_kuid(u0:k20000:r10000, k21000) = u1000
+
+ /* Map the filesystem's userspace id down ito a kernel id in the mount's idmapping. */
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+Finally, when the kernel reports the owner to the caller it will turn the
+kernel id in the mount's idmapping into a userspace id in the caller's
+idmapping::
+
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+We can test whether this algorithm really works by verifying what happens when
+we create a new file. Let's say the user is creating a file with ``u1000``.
+
+The kernel maps this to ``k11000`` in the caller's idmapping. Usually the
+kernel would now apply the crossmapping, verifying that ``k11000`` can be
+mapped to a userspace id in the filesystem's idmapping. Since ``k11000`` can't
+be mapped up in the filesystem's idmapping directly this creation request
+fails.
+
+But when the caller is accessing the file on an idmapped mount the kernel will
+first call ``mapped_fs*id()`` thereby translating the caller's kernel id into
+a kernel id according to the mount's idmapping::
+
+ mapped_fsuid(k11000):
+ /* Map the caller's kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+ /* Map the mount's userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+When finally writing to disk the kernel will then map ``k21000`` up into a
+userspace id in the filesystem's idmapping::
+
+ from_kuid(u0:k20000:r10000, k21000) = u1000
+
+As we can see, we end up with an invertible and therefore information
+preserving algorithm. A file created from ``u1000`` on an idmapped mount will
+also be reported as being owned by ``u1000`` and vica versa.
+
+Let's now briefly reconsider the failing examples from earlier in the context
+of idmapped mounts.
+
+Example 2 reconsidered
+~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ caller id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k20000:r10000
+ mount idmapping: u0:k10000:r10000
+
+When the caller is using a non-initial idmapping the common case is to attach
+the same idmapping to the mount. We now perform three steps:
+
+1. Map the caller's userspace ids into kernel ids in the caller's idmapping::
+
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+2. Translate the caller's kernel id into a kernel id in the filesystem's
+ idmapping::
+
+ mapped_fsuid(k11000):
+ /* Map the kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+ /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+2. Verify that the caller's kernel ids can be mapped to userspace ids in the
+ filesystem's idmapping::
+
+ from_kuid(u0:k20000:r10000, k21000) = u1000
+
+So the ownership that lands on disk will be ``u1000``.
+
+Example 3 reconsidered
+~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ caller id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k0:r4294967295
+ mount idmapping: u0:k10000:r10000
+
+The same translation algorithm works with the third example.
+
+1. Map the caller's userspace ids into kernel ids in the caller's idmapping::
+
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+2. Translate the caller's kernel id into a kernel id in the filesystem's
+ idmapping::
+
+ mapped_fsuid(k11000):
+ /* Map the kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+ /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Verify that the caller's kernel ids can be mapped to userspace ids in the
+ filesystem's idmapping::
+
+ from_kuid(u0:k0:r4294967295, k21000) = u1000
+
+So the ownership that lands on disk will be ``u1000``.
+
+Example 4 reconsidered
+~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ file id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k0:r4294967295
+ mount idmapping: u0:k10000:r10000
+
+In order to report ownership to userspace the kernel now does three steps using
+the translation algorithm we introduced earlier:
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Translate the kernel id into a kernel id in the mount's idmapping::
+
+ i_uid_into_mnt(k1000):
+ /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
+ from_kuid(u0:k0:r4294967295, k1000) = u1000
+
+ /* Map the userspace id down into a kernel id in the mounts's idmapping. */
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+3. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+Earlier, the caller's kernel id couldn't be crossmapped in the filesystems's
+idmapping. With the idmapped mount in place it now can be crossmapped into the
+filesystem's idmapping via the mount's idmapping. The file will now be created
+with ``u1000`` according to the mount's idmapping.
+
+Example 5 reconsidered
+~~~~~~~~~~~~~~~~~~~~~~
+
+::
+
+ file id: u1000
+ caller idmapping: u0:k10000:r10000
+ filesystem idmapping: u0:k20000:r10000
+ mount idmapping: u0:k10000:r10000
+
+Again, in order to report ownership to userspace the kernel now does three
+steps using the translation algorithm we introduced earlier:
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k20000:r10000, u1000) = k21000
+
+2. Translate the kernel id into a kernel id in the mount's idmapping::
+
+ i_uid_into_mnt(k21000):
+ /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
+ from_kuid(u0:k20000:r10000, k21000) = u1000
+
+ /* Map the userspace id down into a kernel id in the mounts's idmapping. */
+ make_kuid(u0:k10000:r10000, u1000) = k11000
+
+3. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k10000:r10000, k11000) = u1000
+
+Earlier, the file's kernel id couldn't be crossmapped in the filesystems's
+idmapping. With the idmapped mount in place it now can be crossmapped into the
+filesystem's idmapping via the mount's idmapping. The file is now owned by
+``u1000`` according to the mount's idmapping.
+
+Changing ownership on a home directory
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We've seen above how idmapped mounts can be used to translate between
+idmappings when either the caller, the filesystem or both uses a non-initial
+idmapping. A wide range of usecases exist when the caller is using
+a non-initial idmapping. This mostly happens in the context of containerized
+workloads. The consequence is as we have seen that for both, filesystem's
+mounted with the initial idmapping and filesystems mounted with non-initial
+idmappings, access to the filesystem isn't working because the kernel ids can't
+be crossmapped between the caller's and the filesystem's idmapping.
+
+As we've seen above idmapped mounts provide a solution to this by remapping the
+caller's or filesystem's idmapping according to the mount's idmapping.
+
+Aside from containerized workloads, idmapped mounts have the advantage that
+they also work when both the caller and the filesystem use the initial
+idmapping which means users on the host can change the ownership of directories
+and files on a per-mount basis.
+
+Consider our previous example where a user has their home directory on portable
+storage. At home they have id ``u1000`` and all files in their home directory
+are owned by ``u1000`` whereas at uni or work they have login id ``u1125``.
+
+Taking their home directory with them becomes problematic. They can't easily
+access their files, they might not be able to write to disk without applying
+lax permissions or ACLs and even if they can, they will end up with an annoying
+mix of files and directories owned by ``u1000`` and ``u1125``.
+
+Idmapped mounts allow to solve this problem. A user can create an idmapped
+mount for their home directory on their work computer or their computer at home
+depending on what ownership they would prefer to end up on the portable storage
+itself.
+
+Let's assume they want all files on disk to belong to ``u1000``. When the user
+plugs in their portable storage at their work station they can setup a job that
+creates an idmapped mount with the minimal idmapping ``u1000:k1125:r1``. So now
+when they create a file the kernel performs the following steps we already know
+from above:::
+
+ caller id: u1125
+ caller idmapping: u0:k0:r4294967295
+ filesystem idmapping: u0:k0:r4294967295
+ mount idmapping: u1000:k1125:r1
+
+1. Map the caller's userspace ids into kernel ids in the caller's idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1125) = k1125
+
+2. Translate the caller's kernel id into a kernel id in the filesystem's
+ idmapping::
+
+ mapped_fsuid(k1125):
+ /* Map the kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(u1000:k1125:r1, k1125) = u1000
+
+ /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Verify that the caller's kernel ids can be mapped to userspace ids in the
+ filesystem's idmapping::
+
+ from_kuid(u0:k0:r4294967295, k1000) = u1000
+
+So ultimately the file will be created with ``u1000`` on disk.
+
+Now let's briefly look at what ownership the caller with id ``u1125`` will see
+on their work computer:
+
+::
+
+ file id: u1000
+ caller idmapping: u0:k0:r4294967295
+ filesystem idmapping: u0:k0:r4294967295
+ mount idmapping: u1000:k1125:r1
+
+1. Map the userspace id on disk down into a kernel id in the filesystem's
+ idmapping::
+
+ make_kuid(u0:k0:r4294967295, u1000) = k1000
+
+2. Translate the kernel id into a kernel id in the mount's idmapping::
+
+ i_uid_into_mnt(k1000):
+ /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
+ from_kuid(u0:k0:r4294967295, k1000) = u1000
+
+ /* Map the userspace id down into a kernel id in the mounts's idmapping. */
+ make_kuid(u1000:k1125:r1, u1000) = k1125
+
+3. Map the kernel id up into a userspace id in the caller's idmapping::
+
+ from_kuid(u0:k0:r4294967295, k1125) = u1125
+
+So ultimately the caller will be reported that the file belongs to ``u1125``
+which is the caller's userspace id on their workstation in our example.
+
+The raw userspace id that is put on disk is ``u1000`` so when the user takes
+their home directory back to their home computer where they are assigned
+``u1000`` using the initial idmapping and mount the filesystem with the initial
+idmapping they will see all those files owned by ``u1000``.
+
+Shortcircuting
+--------------
+
+Currently, the implementation of idmapped mounts enforces that the filesystem
+is mounted with the initial idmapping. The reason is simply that none of the
+filesystems that we targeted were mountable with a non-initial idmapping. But
+that might change soon enough. As we've seen above, thanks to the properties of
+idmappings the translation works for both filesystems mounted with the initial
+idmapping and filesystem with non-initial idmappings.
+
+Based on this current restriction to filesystem mounted with the initial
+idmapping two noticeable shortcuts have been taken:
+
+1. We always stash a reference to the initial user namespace in ``struct
+ vfsmount``. Idmapped mounts are thus mounts that have a non-initial user
+ namespace attached to them.
+
+ In order to support idmapped mounts this needs to be changed. Instead of
+ stashing the initial user namespace the user namespace the filesystem was
+ mounted with must be stashed. An idmapped mount is then any mount that has
+ a different user namespace attached then the filesystem was mounted with.
+ This has no user-visible consequences.
+
+2. The translation algorithms in ``mapped_fs*id()`` and ``i_*id_into_mnt()``
+ are simplified.
+
+ Let's consider ``mapped_fs*id()`` first. This function translates the
+ caller's kernel id into a kernel id in the filesystem's idmapping via
+ a mount's idmapping. The full algorithm is::
+
+ mapped_fsuid(kid):
+ /* Map the kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(mount-idmapping, kid) = uid
+
+ /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
+ make_kuid(filesystem-idmapping, uid) = kuid
+
+ We know that the filesystem is always mounted with the initial idmapping as
+ we enforce this in ``mount_setattr()``. So this can be shortened to::
+
+ mapped_fsuid(kid):
+ /* Map the kernel id up into a userspace id in the mount's idmapping. */
+ from_kuid(mount-idmapping, kid) = uid
+
+ /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
+ KUIDT_INIT(uid) = kuid
+
+ Similarly, for ``i_*id_into_mnt()`` which translated the filesystem's kernel
+ id into a mount's kernel id::
+
+ i_uid_into_mnt(kid):
+ /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
+ from_kuid(filesystem-idmapping, kid) = uid
+
+ /* Map the userspace id down into a kernel id in the mounts's idmapping. */
+ make_kuid(mount-idmapping, uid) = kuid
+
+ Again, we know that the filesystem is always mounted with the initial
+ idmapping as we enforce this in ``mount_setattr()``. So this can be
+ shortened to::
+
+ i_uid_into_mnt(kid):
+ /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
+ __kuid_val(kid) = uid
+
+ /* Map the userspace id down into a kernel id in the mounts's idmapping. */
+ make_kuid(mount-idmapping, uid) = kuid
+
+Handling filesystems mounted with non-initial idmappings requires that the
+translation functions be converted to their full form. They can still be
+shortcircuited on non-idmapped mounts. This has no user-visible consequences.
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 246af51b277a..1a2dd4d35717 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -34,6 +34,7 @@ algorithms work.
quota
seq_file
sharedsubtree
+ idmappings
automount-support
@@ -72,7 +73,7 @@ Documentation for filesystem implementations.
befs
bfs
btrfs
- cifs/cifsroot
+ cifs/index
ceph
coda
configfs
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 2183fd8cc350..2a75dd5da7b5 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -271,19 +271,19 @@ prototypes::
locking rules:
All except set_page_dirty and freepage may block
-====================== ======================== =========
-ops PageLocked(page) i_rwsem
-====================== ======================== =========
+====================== ======================== ========= ===============
+ops PageLocked(page) i_rwsem invalidate_lock
+====================== ======================== ========= ===============
writepage: yes, unlocks (see below)
-readpage: yes, unlocks
+readpage: yes, unlocks shared
writepages:
set_page_dirty no
-readahead: yes, unlocks
-readpages: no
+readahead: yes, unlocks shared
+readpages: no shared
write_begin: locks the page exclusive
write_end: yes, unlocks exclusive
bmap:
-invalidatepage: yes
+invalidatepage: yes exclusive
releasepage: yes
freepage: yes
direct_IO:
@@ -295,7 +295,7 @@ is_partially_uptodate: yes
error_remove_page: yes
swap_activate: no
swap_deactivate: no
-====================== ======================== =========
+====================== ======================== ========= ===============
->write_begin(), ->write_end() and ->readpage() may be called from
the request handler (/dev/loop).
@@ -378,7 +378,10 @@ keep it that way and don't breed new callers.
->invalidatepage() is called when the filesystem must attempt to drop
some or all of the buffers from the page when it is being truncated. It
returns zero on success. If ->invalidatepage is zero, the kernel uses
-block_invalidatepage() instead.
+block_invalidatepage() instead. The filesystem must exclusively acquire
+invalidate_lock before invalidating page cache in truncate / hole punch path
+(and thus calling into ->invalidatepage) to block races between page cache
+invalidation and page cache filling functions (fault, read, ...).
->releasepage() is called when the kernel is about to try to drop the
buffers from the page in preparation for freeing it. It returns zero to
@@ -506,6 +509,7 @@ prototypes::
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+ int (*iopoll) (struct kiocb *kiocb, bool spin);
int (*iterate) (struct file *, struct dir_context *);
int (*iterate_shared) (struct file *, struct dir_context *);
__poll_t (*poll) (struct file *, struct poll_table_struct *);
@@ -518,12 +522,6 @@ prototypes::
int (*fsync) (struct file *, loff_t start, loff_t end, int datasync);
int (*fasync) (int, struct file *, int);
int (*lock) (struct file *, int, struct file_lock *);
- ssize_t (*readv) (struct file *, const struct iovec *, unsigned long,
- loff_t *);
- ssize_t (*writev) (struct file *, const struct iovec *, unsigned long,
- loff_t *);
- ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
- void __user *);
ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
loff_t *, int);
unsigned long (*get_unmapped_area)(struct file *, unsigned long,
@@ -536,6 +534,14 @@ prototypes::
size_t, unsigned int);
int (*setlease)(struct file *, long, struct file_lock **, void **);
long (*fallocate)(struct file *, int, loff_t, loff_t);
+ void (*show_fdinfo)(struct seq_file *m, struct file *f);
+ unsigned (*mmap_capabilities)(struct file *);
+ ssize_t (*copy_file_range)(struct file *, loff_t, struct file *,
+ loff_t, size_t, unsigned int);
+ loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
+ int (*fadvise)(struct file *, loff_t, loff_t, int);
locking rules:
All may block.
@@ -570,6 +576,25 @@ in sys_read() and friends.
the lease within the individual filesystem to record the result of the
operation
+->fallocate implementation must be really careful to maintain page cache
+consistency when punching holes or performing other operations that invalidate
+page cache contents. Usually the filesystem needs to call
+truncate_inode_pages_range() to invalidate relevant range of the page cache.
+However the filesystem usually also needs to update its internal (and on disk)
+view of file offset -> disk block mapping. Until this update is finished, the
+filesystem needs to block page faults and reads from reloading now-stale page
+cache contents from the disk. Since VFS acquires mapping->invalidate_lock in
+shared mode when loading pages from disk (filemap_fault(), filemap_read(),
+readahead paths), the fallocate implementation must take the invalidate_lock to
+prevent reloading.
+
+->copy_file_range and ->remap_file_range implementations need to serialize
+against modifications of file data while the operation is running. For
+blocking changes through write(2) and similar operations inode->i_rwsem can be
+used. To block changes to file contents via a memory mapping during the
+operation, the filesystem must take mapping->invalidate_lock to coordinate
+with ->page_mkwrite.
+
dquot_operations
================
@@ -627,11 +652,11 @@ pfn_mkwrite: yes
access: yes
============= ========= ===========================
-->fault() is called when a previously not present pte is about
-to be faulted in. The filesystem must find and return the page associated
-with the passed in "pgoff" in the vm_fault structure. If it is possible that
-the page may be truncated and/or invalidated, then the filesystem must lock
-the page, then ensure it is not already truncated (the page lock will block
+->fault() is called when a previously not present pte is about to be faulted
+in. The filesystem must find and return the page associated with the passed in
+"pgoff" in the vm_fault structure. If it is possible that the page may be
+truncated and/or invalidated, then the filesystem must lock invalidate_lock,
+then ensure the page is not already truncated (invalidate_lock will block
subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
locked. The VM will unlock the page.
@@ -644,12 +669,14 @@ page table entry. Pointer to entry associated with the page is passed in
"pte" field in vm_fault structure. Pointers to entries for other offsets
should be calculated relative to "pte".
-->page_mkwrite() is called when a previously read-only pte is
-about to become writeable. The filesystem again must ensure that there are
-no truncate/invalidate races, and then return with the page locked. If
-the page has been truncated, the filesystem should not look up a new page
-like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
-will cause the VM to retry the fault.
+->page_mkwrite() is called when a previously read-only pte is about to become
+writeable. The filesystem again must ensure that there are no
+truncate/invalidate races or races with operations such as ->remap_file_range
+or ->copy_file_range, and then return with the page locked. Usually
+mapping->invalidate_lock is suitable for proper serialization. If the page has
+been truncated, the filesystem should not look up a new page like the ->fault()
+handler, but simply return with VM_FAULT_NOPAGE, which will cause the VM to
+retry the fault.
->pfn_mkwrite() is the same as page_mkwrite but when the pte is
VM_PFNMAP or VM_MIXEDMAP with a page-less entry. Expected return is
diff --git a/Documentation/filesystems/mandatory-locking.rst b/Documentation/filesystems/mandatory-locking.rst
deleted file mode 100644
index 9ce73544a8f0..000000000000
--- a/Documentation/filesystems/mandatory-locking.rst
+++ /dev/null
@@ -1,188 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=====================================================
-Mandatory File Locking For The Linux Operating System
-=====================================================
-
- Andy Walker <andy@lysaker.kvaerner.no>
-
- 15 April 1996
-
- (Updated September 2007)
-
-0. Why you should avoid mandatory locking
------------------------------------------
-
-The Linux implementation is prey to a number of difficult-to-fix race
-conditions which in practice make it not dependable:
-
- - The write system call checks for a mandatory lock only once
- at its start. It is therefore possible for a lock request to
- be granted after this check but before the data is modified.
- A process may then see file data change even while a mandatory
- lock was held.
- - Similarly, an exclusive lock may be granted on a file after
- the kernel has decided to proceed with a read, but before the
- read has actually completed, and the reading process may see
- the file data in a state which should not have been visible
- to it.
- - Similar races make the claimed mutual exclusion between lock
- and mmap similarly unreliable.
-
-1. What is mandatory locking?
-------------------------------
-
-Mandatory locking is kernel enforced file locking, as opposed to the more usual
-cooperative file locking used to guarantee sequential access to files among
-processes. File locks are applied using the flock() and fcntl() system calls
-(and the lockf() library routine which is a wrapper around fcntl().) It is
-normally a process' responsibility to check for locks on a file it wishes to
-update, before applying its own lock, updating the file and unlocking it again.
-The most commonly used example of this (and in the case of sendmail, the most
-troublesome) is access to a user's mailbox. The mail user agent and the mail
-transfer agent must guard against updating the mailbox at the same time, and
-prevent reading the mailbox while it is being updated.
-
-In a perfect world all processes would use and honour a cooperative, or
-"advisory" locking scheme. However, the world isn't perfect, and there's
-a lot of poorly written code out there.
-
-In trying to address this problem, the designers of System V UNIX came up
-with a "mandatory" locking scheme, whereby the operating system kernel would
-block attempts by a process to write to a file that another process holds a
-"read" -or- "shared" lock on, and block attempts to both read and write to a
-file that a process holds a "write " -or- "exclusive" lock on.
-
-The System V mandatory locking scheme was intended to have as little impact as
-possible on existing user code. The scheme is based on marking individual files
-as candidates for mandatory locking, and using the existing fcntl()/lockf()
-interface for applying locks just as if they were normal, advisory locks.
-
-.. Note::
-
- 1. In saying "file" in the paragraphs above I am actually not telling
- the whole truth. System V locking is based on fcntl(). The granularity of
- fcntl() is such that it allows the locking of byte ranges in files, in
- addition to entire files, so the mandatory locking rules also have byte
- level granularity.
-
- 2. POSIX.1 does not specify any scheme for mandatory locking, despite
- borrowing the fcntl() locking scheme from System V. The mandatory locking
- scheme is defined by the System V Interface Definition (SVID) Version 3.
-
-2. Marking a file for mandatory locking
----------------------------------------
-
-A file is marked as a candidate for mandatory locking by setting the group-id
-bit in its file mode but removing the group-execute bit. This is an otherwise
-meaningless combination, and was chosen by the System V implementors so as not
-to break existing user programs.
-
-Note that the group-id bit is usually automatically cleared by the kernel when
-a setgid file is written to. This is a security measure. The kernel has been
-modified to recognize the special case of a mandatory lock candidate and to
-refrain from clearing this bit. Similarly the kernel has been modified not
-to run mandatory lock candidates with setgid privileges.
-
-3. Available implementations
-----------------------------
-
-I have considered the implementations of mandatory locking available with
-SunOS 4.1.x, Solaris 2.x and HP-UX 9.x.
-
-Generally I have tried to make the most sense out of the behaviour exhibited
-by these three reference systems. There are many anomalies.
-
-All the reference systems reject all calls to open() for a file on which
-another process has outstanding mandatory locks. This is in direct
-contravention of SVID 3, which states that only calls to open() with the
-O_TRUNC flag set should be rejected. The Linux implementation follows the SVID
-definition, which is the "Right Thing", since only calls with O_TRUNC can
-modify the contents of the file.
-
-HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not
-just mandatory locks. That would appear to contravene POSIX.1.
-
-mmap() is another interesting case. All the operating systems mentioned
-prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX
-also disallows advisory locks for such a file. SVID actually specifies the
-paranoid HP-UX behaviour.
-
-In my opinion only MAP_SHARED mappings should be immune from locking, and then
-only from mandatory locks - that is what is currently implemented.
-
-SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for
-mandatory locks, so reads and writes to locked files always block when they
-should return EAGAIN.
-
-I'm afraid that this is such an esoteric area that the semantics described
-below are just as valid as any others, so long as the main points seem to
-agree.
-
-4. Semantics
-------------
-
-1. Mandatory locks can only be applied via the fcntl()/lockf() locking
- interface - in other words the System V/POSIX interface. BSD style
- locks using flock() never result in a mandatory lock.
-
-2. If a process has locked a region of a file with a mandatory read lock, then
- other processes are permitted to read from that region. If any of these
- processes attempts to write to the region it will block until the lock is
- released, unless the process has opened the file with the O_NONBLOCK
- flag in which case the system call will return immediately with the error
- status EAGAIN.
-
-3. If a process has locked a region of a file with a mandatory write lock, all
- attempts to read or write to that region block until the lock is released,
- unless a process has opened the file with the O_NONBLOCK flag in which case
- the system call will return immediately with the error status EAGAIN.
-
-4. Calls to open() with O_TRUNC, or to creat(), on a existing file that has
- any mandatory locks owned by other processes will be rejected with the
- error status EAGAIN.
-
-5. Attempts to apply a mandatory lock to a file that is memory mapped and
- shared (via mmap() with MAP_SHARED) will be rejected with the error status
- EAGAIN.
-
-6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED)
- that has any mandatory locks in effect will be rejected with the error status
- EAGAIN.
-
-5. Which system calls are affected?
------------------------------------
-
-Those which modify a file's contents, not just the inode. That gives read(),
-write(), readv(), writev(), open(), creat(), mmap(), truncate() and
-ftruncate(). truncate() and ftruncate() are considered to be "write" actions
-for the purposes of mandatory locking.
-
-The affected region is usually defined as stretching from the current position
-for the total number of bytes read or written. For the truncate calls it is
-defined as the bytes of a file removed or added (we must also consider bytes
-added, as a lock can specify just "the whole file", rather than a specific
-range of bytes.)
-
-Note 3: I may have overlooked some system calls that need mandatory lock
-checking in my eagerness to get this code out the door. Please let me know, or
-better still fix the system calls yourself and submit a patch to me or Linus.
-
-6. Warning!
------------
-
-Not even root can override a mandatory lock, so runaway processes can wreak
-havoc if they lock crucial files. The way around it is to change the file
-permissions (remove the setgid bit) before trying to read or write to it.
-Of course, that might be a bit tricky if the system is hung :-(
-
-7. The "mand" mount option
---------------------------
-Mandatory locking is disabled on all filesystems by default, and must be
-administratively enabled by mounting with "-o mand". That mount option
-is only allowed if the mounting task has the CAP_SYS_ADMIN capability.
-
-Since kernel v4.5, it is possible to disable mandatory locking
-altogether by setting CONFIG_MANDATORY_FILE_LOCKING to "n". A kernel
-with this disabled will reject attempts to mount filesystems with the
-"mand" mount option with the error status EPERM.
diff --git a/Documentation/fpga/dfl.rst b/Documentation/fpga/dfl.rst
index 75df90d1e54c..ef9eec71f6f3 100644
--- a/Documentation/fpga/dfl.rst
+++ b/Documentation/fpga/dfl.rst
@@ -10,7 +10,7 @@ Authors:
- Xu Yilun <yilun.xu@intel.com>
The Device Feature List (DFL) FPGA framework (and drivers according to
-this framework) hides the very details of low layer hardwares and provides
+this framework) hides the very details of low layer hardware and provides
unified interfaces to userspace. Applications could use these interfaces to
configure, enumerate, open and access FPGA accelerators on platforms which
implement the DFL in the device memory. Besides this, the DFL framework
@@ -205,7 +205,7 @@ given Device Feature Lists and create platform devices for feature devices
also abstracts operations for the private features and exposes common ops to
feature device drivers.
-The FPGA DFL Device could be different hardwares, e.g. PCIe device, platform
+The FPGA DFL Device could be different hardware, e.g. PCIe device, platform
device and etc. Its driver module is always loaded first once the device is
created by the system. This driver plays an infrastructural role in the
driver architecture. It locates the DFLs in the device memory, handles them
diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
index 675ba8620d66..b421a3c1806e 100644
--- a/Documentation/gpu/rfc/i915_gem_lmem.rst
+++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
@@ -18,114 +18,5 @@ real, with all the uAPI bits is:
* Route shmem backend over to TTM SYSTEM for discrete
* TTM purgeable object support
* Move i915 buddy allocator over to TTM
- * MMAP ioctl mode(see `I915 MMAP`_)
- * SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
* Send RFC(with mesa-dev on cc) for final sign off on the uAPI
* Add pciid for DG1 and turn on uAPI for real
-
-New object placement and region query uAPI
-==========================================
-Starting from DG1 we need to give userspace the ability to allocate buffers from
-device local-memory. Currently the driver supports gem_create, which can place
-buffers in system memory via shmem, and the usual assortment of other
-interfaces, like dumb buffers and userptr.
-
-To support this new capability, while also providing a uAPI which will work
-beyond just DG1, we propose to offer three new bits of uAPI:
-
-DRM_I915_QUERY_MEMORY_REGIONS
------------------------------
-New query ID which allows userspace to discover the list of supported memory
-regions(like system-memory and local-memory) for a given device. We identify
-each region with a class and instance pair, which should be unique. The class
-here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
-like DG1.
-
-Side note: The class/instance design is borrowed from our existing engine uAPI,
-where we describe every physical engine in terms of its class, and the
-particular instance, since we can have more than one per class.
-
-In the future we also want to expose more information which can further
-describe the capabilities of a region.
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
- :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
-
-GEM_CREATE_EXT
---------------
-New ioctl which is basically just gem_create but now allows userspace to provide
-a chain of possible extensions. Note that if we don't provide any extensions and
-set flags=0 then we get the exact same behaviour as gem_create.
-
-Side note: We also need to support PXP[1] in the near future, which is also
-applicable to integrated platforms, and adds its own gem_create_ext extension,
-which basically lets userspace mark a buffer as "protected".
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
- :functions: drm_i915_gem_create_ext
-
-I915_GEM_CREATE_EXT_MEMORY_REGIONS
-----------------------------------
-Implemented as an extension for gem_create_ext, we would now allow userspace to
-optionally provide an immutable list of preferred placements at creation time,
-in priority order, for a given buffer object. For the placements we expect
-them each to use the class/instance encoding, as per the output of the regions
-query. Having the list in priority order will be useful in the future when
-placing an object, say during eviction.
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
- :functions: drm_i915_gem_create_ext_memory_regions
-
-One fair criticism here is that this seems a little over-engineered[2]. If we
-just consider DG1 then yes, a simple gem_create.flags or something is totally
-all that's needed to tell the kernel to allocate the buffer in local-memory or
-whatever. However looking to the future we need uAPI which can also support
-upcoming Xe HP multi-tile architecture in a sane way, where there can be
-multiple local-memory instances for a given device, and so using both class and
-instance in our uAPI to describe regions is desirable, although specifically
-for DG1 it's uninteresting, since we only have a single local-memory instance.
-
-Existing uAPI issues
-====================
-Some potential issues we still need to resolve.
-
-I915 MMAP
----------
-In i915 there are multiple ways to MMAP GEM object, including mapping the same
-object using different mapping types(WC vs WB), i.e multiple active mmaps per
-object. TTM expects one MMAP at most for the lifetime of the object. If it
-turns out that we have to backpedal here, there might be some potential
-userspace fallout.
-
-I915 SET/GET CACHING
---------------------
-In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but
-DG1 doesn't support non-snooped pcie transactions, so we can just always
-allocate as WB for smem-only buffers. If/when our hw gains support for
-non-snooped pcie transactions then we must fix this mode at allocation time as
-a new GEM extension.
-
-This is related to the mmap problem, because in general (meaning, when we're
-not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
-allocation mode.
-
-Possible idea is to let the kernel picks the mmap mode for userspace from the
-following table:
-
-smem-only: WB. Userspace does not need to call clflush.
-
-smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
-memory, and always give userspace a WC mapping. GPU still does snooped access
-here(assuming we can't turn it off like on DG1), which is a bit inefficient.
-
-lmem only: always WC
-
-This means on discrete you only get a single mmap mode, all others must be
-rejected. That's probably going to be a new default mode or something like
-that.
-
-Links
-=====
-[1] https://patchwork.freedesktop.org/series/86798/
-
-[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
diff --git a/Documentation/hwmon/aquacomputer_d5next.rst b/Documentation/hwmon/aquacomputer_d5next.rst
new file mode 100644
index 000000000000..1f4bb4ba2e4b
--- /dev/null
+++ b/Documentation/hwmon/aquacomputer_d5next.rst
@@ -0,0 +1,61 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Kernel driver aquacomputer-d5next
+=================================
+
+Supported devices:
+
+* Aquacomputer D5 Next watercooling pump
+
+Author: Aleksa Savic
+
+Description
+-----------
+
+This driver exposes hardware sensors of the Aquacomputer D5 Next watercooling
+pump, which communicates through a proprietary USB HID protocol.
+
+Available sensors are pump and fan speed, power, voltage and current, as
+well as coolant temperature. Also available through debugfs are the serial
+number, firmware version and power-on count.
+
+Attaching a fan is optional and allows it to be controlled using temperature
+curves directly from the pump. If it's not connected, the fan-related sensors
+will report zeroes.
+
+The pump can be configured either through software or via its physical
+interface. Configuring the pump through this driver is not implemented, as it
+seems to require sending it a complete configuration. That includes addressable
+RGB LEDs, for which there is no standard sysfs interface. Thus, that task is
+better suited for userspace tools.
+
+Usage notes
+-----------
+
+The pump communicates via HID reports. The driver is loaded automatically by
+the kernel and supports hotswapping.
+
+Sysfs entries
+-------------
+
+============ =============================================
+temp1_input Coolant temperature (in millidegrees Celsius)
+fan1_input Pump speed (in RPM)
+fan2_input Fan speed (in RPM)
+power1_input Pump power (in micro Watts)
+power2_input Fan power (in micro Watts)
+in0_input Pump voltage (in milli Volts)
+in1_input Fan voltage (in milli Volts)
+in2_input +5V rail voltage (in milli Volts)
+curr1_input Pump current (in milli Amperes)
+curr2_input Fan current (in milli Amperes)
+============ =============================================
+
+Debugfs entries
+---------------
+
+================ ===============================================
+serial_number Serial number of the pump
+firmware_version Version of installed firmware
+power_cycles Count of how many times the pump was powered on
+================ ===============================================
diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index bc01601ea81a..f790f1260c33 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -39,6 +39,7 @@ Hardware Monitoring Kernel Drivers
adt7475
aht10
amc6821
+ aquacomputer_d5next
asb100
asc7621
aspeed-pwm-tacho
@@ -160,6 +161,7 @@ Hardware Monitoring Kernel Drivers
pwm-fan
q54sj108a2
raspberrypi-hwmon
+ sbrmi
sbtsi_temp
sch5627
sch5636
diff --git a/Documentation/hwmon/sbrmi.rst b/Documentation/hwmon/sbrmi.rst
new file mode 100644
index 000000000000..296049e13ac9
--- /dev/null
+++ b/Documentation/hwmon/sbrmi.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+Kernel driver sbrmi
+===================
+
+Supported hardware:
+
+ * Sideband Remote Management Interface (SB-RMI) compliant AMD SoC
+ device connected to the BMC via the APML.
+
+ Prefix: 'sbrmi'
+
+ Addresses scanned: This driver doesn't support address scanning.
+
+ To instantiate this driver on an AMD CPU with SB-RMI
+ support, the i2c bus number would be the bus connected from the board
+ management controller (BMC) to the CPU.
+ The SMBus address is really 7 bits. Some vendors and the SMBus
+ specification show the address as 8 bits, left justified with the R/W
+ bit as a write (0) making bit 0. Some vendors use only the 7 bits
+ to describe the address.
+ As mentioned in AMD's APML specification, The SB-RMI address is
+ normally 78h(0111 100W) or 3Ch(011 1100) for socket 0 and 70h(0111 000W)
+ or 38h(011 1000) for socket 1, but it could vary based on hardware
+ address select pins.
+
+ Datasheet: The SB-RMI interface and protocol along with the Advanced
+ Platform Management Link (APML) Specification is available
+ as part of the open source SoC register reference at:
+
+ https://www.amd.com/en/support/tech-docs?keyword=55898
+
+Author: Akshay Gupta <akshay.gupta@amd.com>
+
+Description
+-----------
+
+The APML provides a way to communicate with the SB Remote Management interface
+(SB-RMI) module from the external SMBus master that can be used to report socket
+power on AMD platforms using mailbox command and resembles a typical 8-pin remote
+power sensor's I2C interface to BMC.
+
+This driver implements current power with power cap and power cap max.
+
+sysfs-Interface
+---------------
+Power sensors can be queried and set via the standard ``hwmon`` interface
+on ``sysfs``, under the directory ``/sys/class/hwmon/hwmonX`` for some value
+of ``X`` (search for the ``X`` such that ``/sys/class/hwmon/hwmonX/name`` has
+content ``sbrmi``)
+
+================ ===== ========================================================
+Name Perm Description
+================ ===== ========================================================
+power1_input RO Current Power consumed
+power1_cap RW Power limit can be set between 0 and power1_cap_max
+power1_cap_max RO Maximum powerlimit calculated and reported by the SMU FW
+================ ===== ========================================================
+
+The following example show how the 'Power' attribute from the i2c-addresses
+can be monitored using the userspace utilities like ``sensors`` binary::
+
+ # sensors
+ sbrmi-i2c-1-38
+ Adapter: bcm2835 I2C adapter
+ power1: 61.00 W (cap = 225.00 W)
+
+ sbrmi-i2c-1-3c
+ Adapter: bcm2835 I2C adapter
+ power1: 28.39 W (cap = 224.77 W)
+ #
+
+Also, Below shows how get and set the values from sysfs entries individually::
+ # cat /sys/class/hwmon/hwmon1/power1_cap_max
+ 225000000
+
+ # echo 180000000 > /sys/class/hwmon/hwmon1/power1_cap
+ # cat /sys/class/hwmon/hwmon1/power1_cap
+ 180000000
diff --git a/Documentation/hwmon/scpi-hwmon.rst b/Documentation/hwmon/scpi-hwmon.rst
index eee7022b44db..1e3f83ec0658 100644
--- a/Documentation/hwmon/scpi-hwmon.rst
+++ b/Documentation/hwmon/scpi-hwmon.rst
@@ -32,5 +32,5 @@ Usage Notes
The driver relies on device tree node to indicate the presence of SCPI
support in the kernel. See
-Documentation/devicetree/bindings/arm/arm,scpi.txt for details of the
+Documentation/devicetree/bindings/firmware/arm,scpi.yaml for details of the
devicetree node.
diff --git a/Documentation/hwmon/sht4x.rst b/Documentation/hwmon/sht4x.rst
index 3b37abcd4a46..c318e5582ead 100644
--- a/Documentation/hwmon/sht4x.rst
+++ b/Documentation/hwmon/sht4x.rst
@@ -42,4 +42,4 @@ humidity1_input Measured humidity in %H
update_interval The minimum interval for polling the sensor,
in milliseconds. Writable. Must be at least
2000.
-============== =============================================
+=============== ============================================
diff --git a/Documentation/i2c/index.rst b/Documentation/i2c/index.rst
index 8b76217e370a..6270f1fd7d4e 100644
--- a/Documentation/i2c/index.rst
+++ b/Documentation/i2c/index.rst
@@ -17,6 +17,7 @@ Introduction
busses/index
i2c-topology
muxes/i2c-mux-gpio
+ i2c-sysfs
Writing device drivers
======================
diff --git a/Documentation/leds/well-known-leds.txt b/Documentation/leds/well-known-leds.txt
new file mode 100644
index 000000000000..4a8b9dc4bf52
--- /dev/null
+++ b/Documentation/leds/well-known-leds.txt
@@ -0,0 +1,58 @@
+-*- org -*-
+
+It is somehow important to provide consistent interface to the
+userland. LED devices have one problem there, and that is naming of
+directories in /sys/class/leds. It would be nice if userland would
+just know right "name" for given LED function, but situation got more
+complex.
+
+Anyway, if backwards compatibility is not an issue, new code should
+use one of the "good" names from this list, and you should extend the
+list where applicable.
+
+Legacy names are listed, too; in case you are writing application that
+wants to use particular feature, you should probe for good name, first,
+but then try the legacy ones, too.
+
+Notice there's a list of functions in include/dt-bindings/leds/common.h .
+
+* Keyboards
+
+Good: "input*:*:capslock"
+Good: "input*:*:scrolllock"
+Good: "input*:*:numlock"
+Legacy: "shift-key-light" (Motorola Droid 4, capslock)
+
+Set of common keyboard LEDs, going back to PC AT or so.
+
+Legacy: "tpacpi::thinklight" (IBM/Lenovo Thinkpads)
+Legacy: "lp5523:kb{1,2,3,4,5,6}" (Nokia N900)
+
+Frontlight/backlight of main keyboard.
+
+Legacy: "button-backlight" (Motorola Droid 4)
+
+Some phones have touch buttons below screen; it is different from main
+keyboard. And this is their backlight.
+
+* Sound subsystem
+
+Good: "platform:*:mute"
+Good: "platform:*:micmute"
+
+LEDs on notebook body, indicating that sound input / output is muted.
+
+* System notification
+
+Legacy: "status-led:{red,green,blue}" (Motorola Droid 4)
+Legacy: "lp5523:{r,g,b}" (Nokia N900)
+
+Phones usually have multi-color status LED.
+
+* Power management
+
+Good: "platform:*:charging" (allwinner sun50i)
+
+* Screen
+
+Good: ":backlight" (Motorola Droid 4)
diff --git a/Documentation/networking/batman-adv.rst b/Documentation/networking/batman-adv.rst
index 74821d29a22f..b85563ea3682 100644
--- a/Documentation/networking/batman-adv.rst
+++ b/Documentation/networking/batman-adv.rst
@@ -157,7 +157,7 @@ Contact
Please send us comments, experiences, questions, anything :)
IRC:
- #batman on irc.freenode.org
+ #batadv on ircs://irc.hackint.org/
Mailing-list:
b.a.t.m.a.n@open-mesh.org (optional subscription at
https://lists.open-mesh.org/mailman3/postorius/lists/b.a.t.m.a.n.lists.open-mesh.org/)
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index 62f2aab8eaec..31cfd7d674a6 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -501,6 +501,18 @@ fail_over_mac
This option was added in bonding version 3.2.0. The "follow"
policy was added in bonding version 3.3.0.
+lacp_active
+ Option specifying whether to send LACPDU frames periodically.
+
+ off or 0
+ LACPDU frames acts as "speak when spoken to".
+
+ on or 1
+ LACPDU frames are sent along the configured links
+ periodically. See lacp_rate for more details.
+
+ The default is on.
+
lacp_rate
Option specifying the rate in which we'll ask our link partner
diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst
index ee40fcc5ddff..62f4a4aff6ec 100644
--- a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/index.rst
@@ -9,3 +9,4 @@ DPAA2 Documentation
dpio-driver
ethernet-driver
mac-phy-support
+ switch-driver
diff --git a/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst
new file mode 100644
index 000000000000..8bf411b857d4
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/freescale/dpaa2/switch-driver.rst
@@ -0,0 +1,217 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===================
+DPAA2 Switch driver
+===================
+
+:Copyright: |copy| 2021 NXP
+
+The DPAA2 Switch driver probes on the Datapath Switch (DPSW) object which can
+be instantiated on the following DPAA2 SoCs and their variants: LS2088A and
+LX2160A.
+
+The driver uses the switch device driver model and exposes each switch port as
+a network interface, which can be included in a bridge or used as a standalone
+interface. Traffic switched between ports is offloaded into the hardware.
+
+The DPSW can have ports connected to DPNIs or to DPMACs for external access.
+::
+
+ [ethA] [ethB] [ethC] [ethD] [ethE] [ethF]
+ : : : : : :
+ : : : : : :
+ [dpaa2-eth] [dpaa2-eth] [ dpaa2-switch ]
+ : : : : : : kernel
+ =============================================================================
+ : : : : : : hardware
+ [DPNI] [DPNI] [============= DPSW =================]
+ | | | | | |
+ | ---------- | [DPMAC] [DPMAC]
+ ------------------------------- | |
+ | |
+ [PHY] [PHY]
+
+Creating an Ethernet Switch
+===========================
+
+The dpaa2-switch driver probes on DPSW devices found on the fsl-mc bus. These
+devices can be either created statically through the boot time configuration
+file - DataPath Layout (DPL) - or at runtime using the DPAA2 object APIs
+(incorporated already into the restool userspace tool).
+
+At the moment, the dpaa2-switch driver imposes the following restrictions on
+the DPSW object that it will probe:
+
+ * The minimum number of FDBs should be at least equal to the number of switch
+ interfaces. This is necessary so that separation of switch ports can be
+ done, ie when not under a bridge, each switch port will have its own FDB.
+ ::
+
+ fsl_dpaa2_switch dpsw.0: The number of FDBs is lower than the number of ports, cannot probe
+
+ * Both the broadcast and flooding configuration should be per FDB. This
+ enables the driver to restrict the broadcast and flooding domains of each
+ FDB depending on the switch ports that are sharing it (aka are under the
+ same bridge).
+ ::
+
+ fsl_dpaa2_switch dpsw.0: Flooding domain is not per FDB, cannot probe
+ fsl_dpaa2_switch dpsw.0: Broadcast domain is not per FDB, cannot probe
+
+ * The control interface of the switch should not be disabled
+ (DPSW_OPT_CTRL_IF_DIS not passed as a create time option). Without the
+ control interface, the driver is not capable to provide proper Rx/Tx traffic
+ support on the switch port netdevices.
+ ::
+
+ fsl_dpaa2_switch dpsw.0: Control Interface is disabled, cannot probe
+
+Besides the configuration of the actual DPSW object, the dpaa2-switch driver
+will need the following DPAA2 objects:
+
+ * 1 DPMCP - A Management Command Portal object is needed for any interraction
+ with the MC firmware.
+
+ * 1 DPBP - A Buffer Pool is used for seeding buffers intended for the Rx path
+ on the control interface.
+
+ * Access to at least one DPIO object (Software Portal) is needed for any
+ enqueue/dequeue operation to be performed on the control interface queues.
+ The DPIO object will be shared, no need for a private one.
+
+Switching features
+==================
+
+The driver supports the configuration of L2 forwarding rules in hardware for
+port bridging as well as standalone usage of the independent switch interfaces.
+
+The hardware is not configurable with respect to VLAN awareness, thus any DPAA2
+switch port should be used only in usecases with a VLAN aware bridge::
+
+ $ ip link add dev br0 type bridge vlan_filtering 1
+
+ $ ip link add dev br1 type bridge
+ $ ip link set dev ethX master br1
+ Error: fsl_dpaa2_switch: Cannot join a VLAN-unaware bridge
+
+Topology and loop detection through STP is supported when ``stp_state 1`` is
+used at bridge create ::
+
+ $ ip link add dev br0 type bridge vlan_filtering 1 stp_state 1
+
+L2 FDB manipulation (add/delete/dump) is supported.
+
+HW FDB learning can be configured on each switch port independently through
+bridge commands. When the HW learning is disabled, a fast age procedure will be
+run and any previously learnt addresses will be removed.
+::
+
+ $ bridge link set dev ethX learning off
+ $ bridge link set dev ethX learning on
+
+Restricting the unknown unicast and multicast flooding domain is supported, but
+not independently of each other::
+
+ $ ip link set dev ethX type bridge_slave flood off mcast_flood off
+ $ ip link set dev ethX type bridge_slave flood off mcast_flood on
+ Error: fsl_dpaa2_switch: Cannot configure multicast flooding independently of unicast.
+
+Broadcast flooding on a switch port can be disabled/enabled through the brport sysfs::
+
+ $ echo 0 > /sys/bus/fsl-mc/devices/dpsw.Y/net/ethX/brport/broadcast_flood
+
+Offloads
+========
+
+Routing actions (redirect, trap, drop)
+--------------------------------------
+
+The DPAA2 switch is able to offload flow-based redirection of packets making
+use of ACL tables. Shared filter blocks are supported by sharing a single ACL
+table between multiple ports.
+
+The following flow keys are supported:
+
+ * Ethernet: dst_mac/src_mac
+ * IPv4: dst_ip/src_ip/ip_proto/tos
+ * VLAN: vlan_id/vlan_prio/vlan_tpid/vlan_dei
+ * L4: dst_port/src_port
+
+Also, the matchall filter can be used to redirect the entire traffic received
+on a port.
+
+As per flow actions, the following are supported:
+
+ * drop
+ * mirred egress redirect
+ * trap
+
+Each ACL entry (filter) can be setup with only one of the listed
+actions.
+
+Example 1: send frames received on eth4 with a SA of 00:01:02:03:04:05 to the
+CPU::
+
+ $ tc qdisc add dev eth4 clsact
+ $ tc filter add dev eth4 ingress flower src_mac 00:01:02:03:04:05 skip_sw action trap
+
+Example 2: drop frames received on eth4 with VID 100 and PCP of 3::
+
+ $ tc filter add dev eth4 ingress protocol 802.1q flower skip_sw vlan_id 100 vlan_prio 3 action drop
+
+Example 3: redirect all frames received on eth4 to eth1::
+
+ $ tc filter add dev eth4 ingress matchall action mirred egress redirect dev eth1
+
+Example 4: Use a single shared filter block on both eth5 and eth6::
+
+ $ tc qdisc add dev eth5 ingress_block 1 clsact
+ $ tc qdisc add dev eth6 ingress_block 1 clsact
+ $ tc filter add block 1 ingress flower dst_mac 00:01:02:03:04:04 skip_sw \
+ action trap
+ $ tc filter add block 1 ingress protocol ipv4 flower src_ip 192.168.1.1 skip_sw \
+ action mirred egress redirect dev eth3
+
+Mirroring
+~~~~~~~~~
+
+The DPAA2 switch supports only per port mirroring and per VLAN mirroring.
+Adding mirroring filters in shared blocks is also supported.
+
+When using the tc-flower classifier with the 802.1q protocol, only the
+''vlan_id'' key will be accepted. Mirroring based on any other fields from the
+802.1q protocol will be rejected::
+
+ $ tc qdisc add dev eth8 ingress_block 1 clsact
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_prio 3 action mirred egress mirror dev eth6
+ Error: fsl_dpaa2_switch: Only matching on VLAN ID supported.
+ We have an error talking to the kernel
+
+If a mirroring VLAN filter is requested on a port, the VLAN must to be
+installed on the switch port in question either using ''bridge'' or by creating
+a VLAN upper device if the switch port is used as a standalone interface::
+
+ $ tc qdisc add dev eth8 ingress_block 1 clsact
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6
+ Error: VLAN must be installed on the switch port.
+ We have an error talking to the kernel
+
+ $ bridge vlan add vid 200 dev eth8
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6
+
+ $ ip link add link eth8 name eth8.200 type vlan id 200
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6
+
+Also, it should be noted that the mirrored traffic will be subject to the same
+egress restrictions as any other traffic. This means that when a mirrored
+packet will reach the mirror port, if the VLAN found in the packet is not
+installed on the port it will get dropped.
+
+The DPAA2 switch supports only a single mirroring destination, thus multiple
+mirror rules can be installed but their ''to'' port has to be the same::
+
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 200 action mirred egress mirror dev eth6
+ $ tc filter add block 1 ingress protocol 802.1q flower skip_sw vlan_id 100 action mirred egress mirror dev eth7
+ Error: fsl_dpaa2_switch: Multiple mirror ports not supported.
+ We have an error talking to the kernel
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index ef8cb62e82a1..4b59cf2c599f 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -656,3 +656,47 @@ Bridge offloads tracepoints:
$ cat /sys/kernel/debug/tracing/trace
...
ip-5387 [000] ...1 573713: mlx5_esw_bridge_vport_cleanup: vport_num=1
+
+Eswitch QoS tracepoints:
+
+- mlx5_esw_vport_qos_create: trace creation of transmit scheduler arbiter for vport::
+
+ $ echo mlx5:mlx5_esw_vport_qos_create >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-23496 [018] .... 73136.838831: mlx5_esw_vport_qos_create: (0000:82:00.0) vport=2 tsar_ix=4 bw_share=0, max_rate=0 group=000000007b576bb3
+
+- mlx5_esw_vport_qos_config: trace configuration of transmit scheduler arbiter for vport::
+
+ $ echo mlx5:mlx5_esw_vport_qos_config >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-26548 [023] .... 75754.223823: mlx5_esw_vport_qos_config: (0000:82:00.0) vport=1 tsar_ix=3 bw_share=34, max_rate=10000 group=000000007b576bb3
+
+- mlx5_esw_vport_qos_destroy: trace deletion of transmit scheduler arbiter for vport::
+
+ $ echo mlx5:mlx5_esw_vport_qos_destroy >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-27418 [004] .... 76546.680901: mlx5_esw_vport_qos_destroy: (0000:82:00.0) vport=1 tsar_ix=3
+
+- mlx5_esw_group_qos_create: trace creation of transmit scheduler arbiter for rate group::
+
+ $ echo mlx5:mlx5_esw_group_qos_create >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-26578 [008] .... 75776.022112: mlx5_esw_group_qos_create: (0000:82:00.0) group=000000008dac63ea tsar_ix=5
+
+- mlx5_esw_group_qos_config: trace configuration of transmit scheduler arbiter for rate group::
+
+ $ echo mlx5:mlx5_esw_group_qos_config >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-27303 [020] .... 76461.455356: mlx5_esw_group_qos_config: (0000:82:00.0) group=000000008dac63ea tsar_ix=5 bw_share=100 max_rate=20000
+
+- mlx5_esw_group_qos_destroy: trace deletion of transmit scheduler arbiter for group::
+
+ $ echo mlx5:mlx5_esw_group_qos_destroy >> /sys/kernel/debug/tracing/set_event
+ $ cat /sys/kernel/debug/tracing/trace
+ ...
+ <...>-27418 [006] .... 76547.187258: mlx5_esw_group_qos_destroy: (0000:82:00.0) group=000000007b576bb3 tsar_ix=1
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index 54c9f107c4b0..4878907e9232 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -97,6 +97,18 @@ own name.
* - ``enable_roce``
- Boolean
- Enable handling of RoCE traffic in the device.
+ * - ``enable_eth``
+ - Boolean
+ - When enabled, the device driver will instantiate Ethernet specific
+ auxiliary device of the devlink device.
+ * - ``enable_rdma``
+ - Boolean
+ - When enabled, the device driver will instantiate RDMA specific
+ auxiliary device of the devlink device.
+ * - ``enable_vnet``
+ - Boolean
+ - When enabled, the device driver will instantiate VDPA networking
+ specific auxiliary device of the devlink device.
* - ``internal_err_reset``
- Boolean
- When enabled, the device driver will reset the device on internal
diff --git a/Documentation/networking/devlink/hns3.rst b/Documentation/networking/devlink/hns3.rst
new file mode 100644
index 000000000000..4562a6e4782f
--- /dev/null
+++ b/Documentation/networking/devlink/hns3.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+hns3 devlink support
+====================
+
+This document describes the devlink features implemented by the ``hns3``
+device driver.
+
+The ``hns3`` driver supports reloading via ``DEVLINK_CMD_RELOAD``.
+
+Info versions
+=============
+
+The ``hns3`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 10 10 80
+
+ * - Name
+ - Type
+ - Description
+ * - ``fw``
+ - running
+ - Used to represent the firmware version.
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index b3b9e0692088..45b5f8b341df 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -34,6 +34,7 @@ parameters, info versions, and other features it supports.
:maxdepth: 1
bnxt
+ hns3
ionic
ice
mlx4
@@ -42,7 +43,6 @@ parameters, info versions, and other features it supports.
mv88e6xxx
netdevsim
nfp
- sja1105
qed
ti-cpsw-switch
am65-nuss-cpsw-switch
diff --git a/Documentation/networking/devlink/sja1105.rst b/Documentation/networking/devlink/sja1105.rst
deleted file mode 100644
index e2679c274085..000000000000
--- a/Documentation/networking/devlink/sja1105.rst
+++ /dev/null
@@ -1,49 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=======================
-sja1105 devlink support
-=======================
-
-This document describes the devlink features implemented
-by the ``sja1105`` device driver.
-
-Parameters
-==========
-
-.. list-table:: Driver-specific parameters implemented
- :widths: 5 5 5 85
-
- * - Name
- - Type
- - Mode
- - Description
- * - ``best_effort_vlan_filtering``
- - Boolean
- - runtime
- - Allow plain ETH_P_8021Q headers to be used as DSA tags.
-
- Benefits:
-
- - Can terminate untagged traffic over switch net
- devices even when enslaved to a bridge with
- vlan_filtering=1.
- - Can terminate VLAN-tagged traffic over switch net
- devices even when enslaved to a bridge with
- vlan_filtering=1, with some constraints (no more than
- 7 non-pvid VLANs per user port).
- - Can do QoS based on VLAN PCP and VLAN membership
- admission control for autonomously forwarded frames
- (regardless of whether they can be terminated on the
- CPU or not).
-
- Drawbacks:
-
- - User cannot use VLANs in range 1024-3071. If the
- switch receives frames with such VIDs, it will
- misinterpret them as DSA tags.
- - Switch uses Shared VLAN Learning (FDB lookup uses
- only DMAC as key).
- - When VLANs span cross-chip topologies, the total
- number of permitted VLANs may be less than 7 per
- port, due to a maximum number of 32 VLAN retagging
- rules per switch.
diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index 20baacf2bc5c..89bb4fa4c362 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -200,19 +200,6 @@ receive all frames regardless of the value of the MAC DA. This can be done by
setting the ``promisc_on_master`` property of the ``struct dsa_device_ops``.
Note that this assumes a DSA-unaware master driver, which is the norm.
-Hardware manufacturers are strongly discouraged to do this, but some tagging
-protocols might not provide source port information on RX for all packets, but
-e.g. only for control traffic (link-local PDUs). In this case, by implementing
-the ``filter`` method of ``struct dsa_device_ops``, the tagger might select
-which packets are to be redirected on RX towards the virtual DSA user network
-interfaces, and which are to be left in the DSA master's RX data path.
-
-It might also happen (although silicon vendors are strongly discouraged to
-produce hardware like this) that a tagging protocol splits the switch-specific
-information into a header portion and a tail portion, therefore not falling
-cleanly into any of the above 3 categories. DSA does not support this
-configuration.
-
Master network devices
----------------------
@@ -663,6 +650,22 @@ Bridge layer
CPU port, and flooding towards the CPU port should also be enabled, due to a
lack of an explicit address filtering mechanism in the DSA core.
+- ``port_bridge_tx_fwd_offload``: bridge layer function invoked after
+ ``port_bridge_join`` when a driver sets ``ds->num_fwd_offloading_bridges`` to
+ a non-zero value. Returning success in this function activates the TX
+ forwarding offload bridge feature for this port, which enables the tagging
+ protocol driver to inject data plane packets towards the bridging domain that
+ the port is a part of. Data plane packets are subject to FDB lookup, hardware
+ learning on the CPU port, and do not override the port STP state.
+ Additionally, replication of data plane packets (multicast, flooding) is
+ handled in hardware and the bridge driver will transmit a single skb for each
+ packet that needs replication. The method is provided as a configuration
+ point for drivers that need to configure the hardware for enabling this
+ feature.
+
+- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoken when a driver
+ leaves a bridge port which had the TX forwarding offload feature enabled.
+
Bridge VLAN filtering
---------------------
diff --git a/Documentation/networking/dsa/sja1105.rst b/Documentation/networking/dsa/sja1105.rst
index da4057ba37f1..564caeebe2b2 100644
--- a/Documentation/networking/dsa/sja1105.rst
+++ b/Documentation/networking/dsa/sja1105.rst
@@ -65,199 +65,6 @@ If that changed setting can be transmitted to the switch through the dynamic
reconfiguration interface, it is; otherwise the switch is reset and
reprogrammed with the updated static configuration.
-Traffic support
-===============
-
-The switches do not have hardware support for DSA tags, except for "slow
-protocols" for switch control as STP and PTP. For these, the switches have two
-programmable filters for link-local destination MACs.
-These are used to trap BPDUs and PTP traffic to the master netdevice, and are
-further used to support STP and 1588 ordinary clock/boundary clock
-functionality. For frames trapped to the CPU, source port and switch ID
-information is encoded by the hardware into the frames.
-
-But by leveraging ``CONFIG_NET_DSA_TAG_8021Q`` (a software-defined DSA tagging
-format based on VLANs), general-purpose traffic termination through the network
-stack can be supported under certain circumstances.
-
-Depending on VLAN awareness state, the following operating modes are possible
-with the switch:
-
-- Mode 1 (VLAN-unaware): a port is in this mode when it is used as a standalone
- net device, or when it is enslaved to a bridge with ``vlan_filtering=0``.
-- Mode 2 (fully VLAN-aware): a port is in this mode when it is enslaved to a
- bridge with ``vlan_filtering=1``. Access to the entire VLAN range is given to
- the user through ``bridge vlan`` commands, but general-purpose (anything
- other than STP, PTP etc) traffic termination is not possible through the
- switch net devices. The other packets can be still by user space processed
- through the DSA master interface (similar to ``DSA_TAG_PROTO_NONE``).
-- Mode 3 (best-effort VLAN-aware): a port is in this mode when enslaved to a
- bridge with ``vlan_filtering=1``, and the devlink property of its parent
- switch named ``best_effort_vlan_filtering`` is set to ``true``. When
- configured like this, the range of usable VIDs is reduced (0 to 1023 and 3072
- to 4094), so is the number of usable VIDs (maximum of 7 non-pvid VLANs per
- port*), and shared VLAN learning is performed (FDB lookup is done only by
- DMAC, not also by VID).
-
-To summarize, in each mode, the following types of traffic are supported over
-the switch net devices:
-
-+-------------+-----------+--------------+------------+
-| | Mode 1 | Mode 2 | Mode 3 |
-+=============+===========+==============+============+
-| Regular | Yes | No | Yes |
-| traffic | | (use master) | |
-+-------------+-----------+--------------+------------+
-| Management | Yes | Yes | Yes |
-| traffic | | | |
-| (BPDU, PTP) | | | |
-+-------------+-----------+--------------+------------+
-
-To configure the switch to operate in Mode 3, the following steps can be
-followed::
-
- ip link add dev br0 type bridge
- # swp2 operates in Mode 1 now
- ip link set dev swp2 master br0
- # swp2 temporarily moves to Mode 2
- ip link set dev br0 type bridge vlan_filtering 1
- [ 61.204770] sja1105 spi0.1: Reset switch and programmed static config. Reason: VLAN filtering
- [ 61.239944] sja1105 spi0.1: Disabled switch tagging
- # swp3 now operates in Mode 3
- devlink dev param set spi/spi0.1 name best_effort_vlan_filtering value true cmode runtime
- [ 64.682927] sja1105 spi0.1: Reset switch and programmed static config. Reason: VLAN filtering
- [ 64.711925] sja1105 spi0.1: Enabled switch tagging
- # Cannot use VLANs in range 1024-3071 while in Mode 3.
- bridge vlan add dev swp2 vid 1025 untagged pvid
- RTNETLINK answers: Operation not permitted
- bridge vlan add dev swp2 vid 100
- bridge vlan add dev swp2 vid 101 untagged
- bridge vlan
- port vlan ids
- swp5 1 PVID Egress Untagged
-
- swp2 1 PVID Egress Untagged
- 100
- 101 Egress Untagged
-
- swp3 1 PVID Egress Untagged
-
- swp4 1 PVID Egress Untagged
-
- br0 1 PVID Egress Untagged
- bridge vlan add dev swp2 vid 102
- bridge vlan add dev swp2 vid 103
- bridge vlan add dev swp2 vid 104
- bridge vlan add dev swp2 vid 105
- bridge vlan add dev swp2 vid 106
- bridge vlan add dev swp2 vid 107
- # Cannot use mode than 7 VLANs per port while in Mode 3.
- [ 3885.216832] sja1105 spi0.1: No more free subvlans
-
-\* "maximum of 7 non-pvid VLANs per port": Decoding VLAN-tagged packets on the
-CPU in mode 3 is possible through VLAN retagging of packets that go from the
-switch to the CPU. In cross-chip topologies, the port that goes to the CPU
-might also go to other switches. In that case, those other switches will see
-only a retagged packet (which only has meaning for the CPU). So if they are
-interested in this VLAN, they need to apply retagging in the reverse direction,
-to recover the original value from it. This consumes extra hardware resources
-for this switch. There is a maximum of 32 entries in the Retagging Table of
-each switch device.
-
-As an example, consider this cross-chip topology::
-
- +-------------------------------------------------+
- | Host SoC |
- | +-------------------------+ |
- | | DSA master for embedded | |
- | | switch (non-sja1105) | |
- | +--------+-------------------------+--------+ |
- | | embedded L2 switch | |
- | | | |
- | | +--------------+ +--------------+ | |
- | | |DSA master for| |DSA master for| | |
- | | | SJA1105 1 | | SJA1105 2 | | |
- +--+---+--------------+-----+--------------+---+--+
-
- +-----------------------+ +-----------------------+
- | SJA1105 switch 1 | | SJA1105 switch 2 |
- +-----+-----+-----+-----+ +-----+-----+-----+-----+
- |sw1p0|sw1p1|sw1p2|sw1p3| |sw2p0|sw2p1|sw2p2|sw2p3|
- +-----+-----+-----+-----+ +-----+-----+-----+-----+
-
-To reach the CPU, SJA1105 switch 1 (spi/spi2.1) uses the same port as is uses
-to reach SJA1105 switch 2 (spi/spi2.2), which would be port 4 (not drawn).
-Similarly for SJA1105 switch 2.
-
-Also consider the following commands, that add VLAN 100 to every sja1105 user
-port::
-
- devlink dev param set spi/spi2.1 name best_effort_vlan_filtering value true cmode runtime
- devlink dev param set spi/spi2.2 name best_effort_vlan_filtering value true cmode runtime
- ip link add dev br0 type bridge
- for port in sw1p0 sw1p1 sw1p2 sw1p3 \
- sw2p0 sw2p1 sw2p2 sw2p3; do
- ip link set dev $port master br0
- done
- ip link set dev br0 type bridge vlan_filtering 1
- for port in sw1p0 sw1p1 sw1p2 sw1p3 \
- sw2p0 sw2p1 sw2p2; do
- bridge vlan add dev $port vid 100
- done
- ip link add link br0 name br0.100 type vlan id 100 && ip link set dev br0.100 up
- ip addr add 192.168.100.3/24 dev br0.100
- bridge vlan add dev br0 vid 100 self
-
- bridge vlan
- port vlan ids
- sw1p0 1 PVID Egress Untagged
- 100
-
- sw1p1 1 PVID Egress Untagged
- 100
-
- sw1p2 1 PVID Egress Untagged
- 100
-
- sw1p3 1 PVID Egress Untagged
- 100
-
- sw2p0 1 PVID Egress Untagged
- 100
-
- sw2p1 1 PVID Egress Untagged
- 100
-
- sw2p2 1 PVID Egress Untagged
- 100
-
- sw2p3 1 PVID Egress Untagged
-
- br0 1 PVID Egress Untagged
- 100
-
-SJA1105 switch 1 consumes 1 retagging entry for each VLAN on each user port
-towards the CPU. It also consumes 1 retagging entry for each non-pvid VLAN that
-it is also interested in, which is configured on any port of any neighbor
-switch.
-
-In this case, SJA1105 switch 1 consumes a total of 11 retagging entries, as
-follows:
-
-- 8 retagging entries for VLANs 1 and 100 installed on its user ports
- (``sw1p0`` - ``sw1p3``)
-- 3 retagging entries for VLAN 100 installed on the user ports of SJA1105
- switch 2 (``sw2p0`` - ``sw2p2``), because it also has ports that are
- interested in it. The VLAN 1 is a pvid on SJA1105 switch 2 and does not need
- reverse retagging.
-
-SJA1105 switch 2 also consumes 11 retagging entries, but organized as follows:
-
-- 7 retagging entries for the bridge VLANs on its user ports (``sw2p0`` -
- ``sw2p3``).
-- 4 retagging entries for VLAN 100 installed on the user ports of SJA1105
- switch 1 (``sw1p0`` - ``sw1p3``).
-
Switching features
==================
@@ -282,33 +89,10 @@ untagged), and therefore this mode is also supported.
Segregating the switch ports in multiple bridges is supported (e.g. 2 + 2), but
all bridges should have the same level of VLAN awareness (either both have
-``vlan_filtering`` 0, or both 1). Also an inevitable limitation of the fact
-that VLAN awareness is global at the switch level is that once a bridge with
-``vlan_filtering`` enslaves at least one switch port, the other un-bridged
-ports are no longer available for standalone traffic termination.
+``vlan_filtering`` 0, or both 1).
Topology and loop detection through STP is supported.
-L2 FDB manipulation (add/delete/dump) is currently possible for the first
-generation devices. Aging time of FDB entries, as well as enabling fully static
-management (no address learning and no flooding of unknown traffic) is not yet
-configurable in the driver.
-
-A special comment about bridging with other netdevices (illustrated with an
-example):
-
-A board has eth0, eth1, swp0@eth1, swp1@eth1, swp2@eth1, swp3@eth1.
-The switch ports (swp0-3) are under br0.
-It is desired that eth0 is turned into another switched port that communicates
-with swp0-3.
-
-If br0 has vlan_filtering 0, then eth0 can simply be added to br0 with the
-intended results.
-If br0 has vlan_filtering 1, then a new br1 interface needs to be created that
-enslaves eth0 and eth1 (the DSA master of the switch ports). This is because in
-this mode, the switch ports beneath br0 are not capable of regular traffic, and
-are only used as a conduit for switchdev operations.
-
Offloads
========
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index c86628e6a235..d9b55b7a1a4d 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -595,6 +595,14 @@ Link extended substates:
that is not formally
supported, which led to
signal integrity issues
+
+ ``ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_REFERENCE_CLOCK_LOST`` The external clock signal for
+ SerDes is too weak or
+ unavailable.
+
+ ``ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_ALOS`` The received signal for
+ SerDes is too weak because
+ analog loss of signal.
================================================================= =============================
Cable issue substates:
@@ -939,12 +947,25 @@ Kernel response contents:
``ETHTOOL_A_COALESCE_TX_USECS_HIGH`` u32 delay (us), high Tx
``ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH`` u32 max packets, high Tx
``ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL`` u32 rate sampling interval
+ ``ETHTOOL_A_COALESCE_USE_CQE_TX`` bool timer reset mode, Tx
+ ``ETHTOOL_A_COALESCE_USE_CQE_RX`` bool timer reset mode, Rx
=========================================== ====== =======================
Attributes are only included in reply if their value is not zero or the
corresponding bit in ``ethtool_ops::supported_coalesce_params`` is set (i.e.
they are declared as supported by driver).
+Timer reset mode (``ETHTOOL_A_COALESCE_USE_CQE_TX`` and
+``ETHTOOL_A_COALESCE_USE_CQE_RX``) controls the interaction between packet
+arrival and the various time based delay parameters. By default timers are
+expected to limit the max delay between any packet arrival/departure and a
+corresponding interrupt. In this mode timer should be started by packet
+arrival (sometimes delivery of previous interrupt) and reset when interrupt
+is delivered.
+Setting the appropriate attribute to 1 will enable ``CQE`` mode, where
+each packet event resets the timer. In this mode timer is used to force
+the interrupt if queue goes idle, while busy queues depend on the packet
+limit to trigger interrupts.
COALESCE_SET
============
@@ -977,6 +998,8 @@ Request contents:
``ETHTOOL_A_COALESCE_TX_USECS_HIGH`` u32 delay (us), high Tx
``ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH`` u32 max packets, high Tx
``ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL`` u32 rate sampling interval
+ ``ETHTOOL_A_COALESCE_USE_CQE_TX`` bool timer reset mode, Tx
+ ``ETHTOOL_A_COALESCE_USE_CQE_RX`` bool timer reset mode, Rx
=========================================== ====== =======================
Request is rejected if it attributes declared as unsupported by driver (i.e.
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index 3e2221f4abe4..ce2b8e8bb9ab 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -320,13 +320,6 @@ Examples for low-level BPF:
ret #-1
drop: ret #0
-**(Accelerated) VLAN w/ id 10**::
-
- ld vlan_tci
- jneq #10, drop
- ret #-1
- drop: ret #0
-
**icmp random packet sampling, 1 in 4**::
ldh [12]
@@ -358,6 +351,22 @@ Examples for low-level BPF:
bad: ret #0 /* SECCOMP_RET_KILL_THREAD */
good: ret #0x7fff0000 /* SECCOMP_RET_ALLOW */
+Examples for low-level BPF extension:
+
+**Packet for interface index 13**::
+
+ ld ifidx
+ jneq #13, drop
+ ret #-1
+ drop: ret #0
+
+**(Accelerated) VLAN w/ id 10**::
+
+ ld vlan_tci
+ jneq #10, drop
+ ret #-1
+ drop: ret #0
+
The above example code can be placed into a file (here called "foo"), and
then be passed to the bpf_asm tool for generating opcodes, output that xt_bpf
and cls_bpf understands and can directly be loaded with. Example with above
@@ -629,8 +638,8 @@ extension, PTP dissector/classifier, and much more. They are all internally
converted by the kernel into the new instruction set representation and run
in the eBPF interpreter. For in-kernel handlers, this all works transparently
by using bpf_prog_create() for setting up the filter, resp.
-bpf_prog_destroy() for destroying it. The macro
-BPF_PROG_RUN(filter, ctx) transparently invokes eBPF interpreter or JITed
+bpf_prog_destroy() for destroying it. The function
+bpf_prog_run(filter, ctx) transparently invokes eBPF interpreter or JITed
code to run the filter. 'filter' is a pointer to struct bpf_prog that we
got from bpf_prog_create(), and 'ctx' the given context (e.g.
skb pointer). All constraints and restrictions from bpf_check_classic() apply
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index e9ce55992aa9..58bc8cd367c6 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -57,6 +57,7 @@ Contents:
gen_stats
gtp
ila
+ ioam6-sysctl
ipddp
ip_dynaddr
ipsec
@@ -68,6 +69,7 @@ Contents:
l2tp
lapb-module
mac80211-injection
+ mctp
mpls-sysctl
mptcp-sysctl
multiqueue
diff --git a/Documentation/networking/ioam6-sysctl.rst b/Documentation/networking/ioam6-sysctl.rst
new file mode 100644
index 000000000000..c18cab2c481a
--- /dev/null
+++ b/Documentation/networking/ioam6-sysctl.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+IOAM6 Sysfs variables
+=====================
+
+
+/proc/sys/net/conf/<iface>/ioam6_* variables:
+=============================================
+
+ioam6_enabled - BOOL
+ Accept (= enabled) or ignore (= disabled) IPv6 IOAM options on ingress
+ for this interface.
+
+ * 0 - disabled (default)
+ * 1 - enabled
+
+ioam6_id - SHORT INTEGER
+ Define the IOAM id of this interface.
+
+ Default is ~0.
+
+ioam6_id_wide - INTEGER
+ Define the wide IOAM id of this interface.
+
+ Default is ~0.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 316c7dfa9693..d91ab28718d4 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1926,6 +1926,23 @@ fib_notify_on_flag_change - INTEGER
- 1 - Emit notifications.
- 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.
+ioam6_id - INTEGER
+ Define the IOAM id of this node. Uses only 24 bits out of 32 in total.
+
+ Min: 0
+ Max: 0xFFFFFF
+
+ Default: 0xFFFFFF
+
+ioam6_id_wide - LONG INTEGER
+ Define the wide IOAM id of this node. Uses only 56 bits out of 64 in
+ total. Can be different from ioam6_id.
+
+ Min: 0
+ Max: 0xFFFFFFFFFFFFFF
+
+ Default: 0xFFFFFFFFFFFFFF
+
IPv6 Fragmentation:
ip6frag_high_thresh - INTEGER
diff --git a/Documentation/networking/mctp.rst b/Documentation/networking/mctp.rst
new file mode 100644
index 000000000000..6100cdc220f6
--- /dev/null
+++ b/Documentation/networking/mctp.rst
@@ -0,0 +1,213 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================
+Management Component Transport Protocol (MCTP)
+==============================================
+
+net/mctp/ contains protocol support for MCTP, as defined by DMTF standard
+DSP0236. Physical interface drivers ("bindings" in the specification) are
+provided in drivers/net/mctp/.
+
+The core code provides a socket-based interface to send and receive MCTP
+messages, through an AF_MCTP, SOCK_DGRAM socket.
+
+Structure: interfaces & networks
+================================
+
+The kernel models the local MCTP topology through two items: interfaces and
+networks.
+
+An interface (or "link") is an instance of an MCTP physical transport binding
+(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
+device. This is represented as a ``struct netdevice``.
+
+A network defines a unique address space for MCTP endpoints by endpoint-ID
+(described by DSP0236, section 3.2.31). A network has a user-visible identifier
+to allow references from userspace. Route definitions are specific to one
+network.
+
+Interfaces are associated with one network. A network may be associated with one
+or more interfaces.
+
+If multiple networks are present, each may contain endpoint IDs (EIDs) that are
+also present on other networks.
+
+Sockets API
+===========
+
+Protocol definitions
+--------------------
+
+MCTP uses ``AF_MCTP`` / ``PF_MCTP`` for the address- and protocol- families.
+Since MCTP is message-based, only ``SOCK_DGRAM`` sockets are supported.
+
+.. code-block:: C
+
+ int sd = socket(AF_MCTP, SOCK_DGRAM, 0);
+
+The only (current) value for the ``protocol`` argument is 0.
+
+As with all socket address families, source and destination addresses are
+specified with a ``sockaddr`` type, with a single-byte endpoint address:
+
+.. code-block:: C
+
+ typedef __u8 mctp_eid_t;
+
+ struct mctp_addr {
+ mctp_eid_t s_addr;
+ };
+
+ struct sockaddr_mctp {
+ unsigned short int smctp_family;
+ int smctp_network;
+ struct mctp_addr smctp_addr;
+ __u8 smctp_type;
+ __u8 smctp_tag;
+ };
+
+ #define MCTP_NET_ANY 0x0
+ #define MCTP_ADDR_ANY 0xff
+
+
+Syscall behaviour
+-----------------
+
+The following sections describe the MCTP-specific behaviours of the standard
+socket system calls. These behaviours have been chosen to map closely to the
+existing sockets APIs.
+
+``bind()`` : set local socket address
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Sockets that receive incoming request packets will bind to a local address,
+using the ``bind()`` syscall.
+
+.. code-block:: C
+
+ struct sockaddr_mctp addr;
+
+ addr.smctp_family = AF_MCTP;
+ addr.smctp_network = MCTP_NET_ANY;
+ addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
+ addr.smctp_type = MCTP_TYPE_PLDM;
+ addr.smctp_tag = MCTP_TAG_OWNER;
+
+ int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));
+
+This establishes the local address of the socket. Incoming MCTP messages that
+match the network, address, and message type will be received by this socket.
+The reference to 'incoming' is important here; a bound socket will only receive
+messages with the TO bit set, to indicate an incoming request message, rather
+than a response.
+
+The ``smctp_tag`` value will configure the tags accepted from the remote side of
+this socket. Given the above, the only valid value is ``MCTP_TAG_OWNER``, which
+will result in remotely "owned" tags being routed to this socket. Since
+``MCTP_TAG_OWNER`` is set, the 3 least-significant bits of ``smctp_tag`` are not
+used; callers must set them to zero.
+
+A ``smctp_network`` value of ``MCTP_NET_ANY`` will configure the socket to
+receive incoming packets from any locally-connected network. A specific network
+value will cause the socket to only receive incoming messages from that network.
+
+The ``smctp_addr`` field specifies a local address to bind to. A value of
+``MCTP_ADDR_ANY`` configures the socket to receive messages addressed to any
+local destination EID.
+
+The ``smctp_type`` field specifies which message types to receive. Only the
+lower 7 bits of the type is matched on incoming messages (ie., the
+most-significant IC bit is not part of the match). This results in the socket
+receiving packets with and without a message integrity check footer.
+
+``sendto()``, ``sendmsg()``, ``send()`` : transmit an MCTP message
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An MCTP message is transmitted using one of the ``sendto()``, ``sendmsg()`` or
+``send()`` syscalls. Using ``sendto()`` as the primary example:
+
+.. code-block:: C
+
+ struct sockaddr_mctp addr;
+ char buf[14];
+ ssize_t len;
+
+ /* set message destination */
+ addr.smctp_family = AF_MCTP;
+ addr.smctp_network = 0;
+ addr.smctp_addr.s_addr = 8;
+ addr.smctp_tag = MCTP_TAG_OWNER;
+ addr.smctp_type = MCTP_TYPE_ECHO;
+
+ /* arbitrary message to send, with message-type header */
+ buf[0] = MCTP_TYPE_ECHO;
+ memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);
+
+ len = sendto(sd, buf, sizeof(buf), 0,
+ (struct sockaddr_mctp *)&addr, sizeof(addr));
+
+The network and address fields of ``addr`` define the remote address to send to.
+If ``smctp_tag`` has the ``MCTP_TAG_OWNER``, the kernel will ignore any bits set
+in ``MCTP_TAG_VALUE``, and generate a tag value suitable for the destination
+EID. If ``MCTP_TAG_OWNER`` is not set, the message will be sent with the tag
+value as specified. If a tag value cannot be allocated, the system call will
+report an errno of ``EAGAIN``.
+
+The application must provide the message type byte as the first byte of the
+message buffer passed to ``sendto()``. If a message integrity check is to be
+included in the transmitted message, it must also be provided in the message
+buffer, and the most-significant bit of the message type byte must be 1.
+
+The ``sendmsg()`` system call allows a more compact argument interface, and the
+message buffer to be specified as a scatter-gather list. At present no ancillary
+message types (used for the ``msg_control`` data passed to ``sendmsg()``) are
+defined.
+
+Transmitting a message on an unconnected socket with ``MCTP_TAG_OWNER``
+specified will cause an allocation of a tag, if no valid tag is already
+allocated for that destination. The (destination-eid,tag) tuple acts as an
+implicit local socket address, to allow the socket to receive responses to this
+outgoing message. If any previous allocation has been performed (to for a
+different remote EID), that allocation is lost.
+
+Sockets will only receive responses to requests they have sent (with TO=1) and
+may only respond (with TO=0) to requests they have received.
+
+``recvfrom()``, ``recvmsg()``, ``recv()`` : receive an MCTP message
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An MCTP message can be received by an application using one of the
+``recvfrom()``, ``recvmsg()``, or ``recv()`` system calls. Using ``recvfrom()``
+as the primary example:
+
+.. code-block:: C
+
+ struct sockaddr_mctp addr;
+ socklen_t addrlen;
+ char buf[14];
+ ssize_t len;
+
+ addrlen = sizeof(addr);
+
+ len = recvfrom(sd, buf, sizeof(buf), 0,
+ (struct sockaddr_mctp *)&addr, &addrlen);
+
+ /* We can expect addr to describe an MCTP address */
+ assert(addrlen >= sizeof(buf));
+ assert(addr.smctp_family == AF_MCTP);
+
+ printf("received %zd bytes from remote EID %d\n", rc, addr.smctp_addr);
+
+The address argument to ``recvfrom`` and ``recvmsg`` is populated with the
+remote address of the incoming message, including tag value (this will be needed
+in order to reply to the message).
+
+The first byte of the message buffer will contain the message type byte. If an
+integrity check follows the message, it will be included in the received buffer.
+
+The ``recv()`` system call behaves in a similar way, but does not provide a
+remote address to the application. Therefore, these are only useful if the
+remote address is already known, or the message does not require a reply.
+
+Like the send calls, sockets will only receive responses to requests they have
+sent (TO=1) and may only respond (TO=0) to requests they have received.
diff --git a/Documentation/networking/mptcp-sysctl.rst b/Documentation/networking/mptcp-sysctl.rst
index 76d939e688b8..b0d4da71e68e 100644
--- a/Documentation/networking/mptcp-sysctl.rst
+++ b/Documentation/networking/mptcp-sysctl.rst
@@ -45,3 +45,15 @@ allow_join_initial_addr_port - BOOLEAN
This is a per-namespace sysctl.
Default: 1
+
+stale_loss_cnt - INTEGER
+ The number of MPTCP-level retransmission intervals with no traffic and
+ pending outstanding data on a given subflow required to declare it stale.
+ The packet scheduler ignores stale subflows.
+ A low stale_loss_cnt value allows for fast active-backup switch-over,
+ an high value maximize links utilization on edge scenarios e.g. lossy
+ link with high BER or peer pausing the data processing.
+
+ This is a per-namespace sysctl.
+
+ Default: 4
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 17bdcb746dcf..9e4cccb90b87 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -222,6 +222,35 @@ ndo_do_ioctl:
Synchronization: rtnl_lock() semaphore.
Context: process
+ This is only called by network subsystems internally,
+ not by user space calling ioctl as it was in before
+ linux-5.14.
+
+ndo_siocbond:
+ Synchronization: rtnl_lock() semaphore.
+ Context: process
+
+ Used by the bonding driver for the SIOCBOND family of
+ ioctl commands.
+
+ndo_siocwandev:
+ Synchronization: rtnl_lock() semaphore.
+ Context: process
+
+ Used by the drivers/net/wan framework to handle
+ the SIOCWANDEV ioctl with the if_settings structure.
+
+ndo_siocdevprivate:
+ Synchronization: rtnl_lock() semaphore.
+ Context: process
+
+ This is used to implement SIOCDEVPRIVATE ioctl helpers.
+ These should not be added to new drivers, so don't use.
+
+ndo_eth_ioctl:
+ Synchronization: rtnl_lock() semaphore.
+ Context: process
+
ndo_get_stats:
Synchronization: rtnl_lock() semaphore, dev_base_lock rwlock, or RCU.
Context: atomic (can't sleep under rwlock or RCU)
diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index d31ed6c1cb0d..34ca762ea56f 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -184,6 +184,13 @@ nf_conntrack_gre_timeout_stream - INTEGER (seconds)
This extended timeout will be used in case there is an GRE stream
detected.
+nf_hooks_lwtunnel - BOOLEAN
+ - 0 - disabled (default)
+ - not 0 - enabled
+
+ If this option is enabled, the lightweight tunnel netfilter hooks are
+ enabled. This option cannot be disabled once it is enabled.
+
nf_flowtable_tcp_timeout - INTEGER (seconds)
default 30
@@ -191,19 +198,9 @@ nf_flowtable_tcp_timeout - INTEGER (seconds)
TCP connections may be offloaded from nf conntrack to nf flow table.
Once aged, the connection is returned to nf conntrack with tcp pickup timeout.
-nf_flowtable_tcp_pickup - INTEGER (seconds)
- default 120
-
- TCP connection timeout after being aged from nf flow table offload.
-
nf_flowtable_udp_timeout - INTEGER (seconds)
default 30
Control offload timeout for udp connections.
UDP connections may be offloaded from nf conntrack to nf flow table.
Once aged, the connection is returned to nf conntrack with udp pickup timeout.
-
-nf_flowtable_udp_pickup - INTEGER (seconds)
- default 30
-
- UDP connection timeout after being aged from nf flow table offload.
diff --git a/Documentation/networking/pktgen.rst b/Documentation/networking/pktgen.rst
index 7afa1c9f1183..1225f0f63ff0 100644
--- a/Documentation/networking/pktgen.rst
+++ b/Documentation/networking/pktgen.rst
@@ -248,26 +248,24 @@ Usage:::
-i : ($DEV) output interface/device (required)
-s : ($PKT_SIZE) packet size
- -d : ($DEST_IP) destination IP
+ -d : ($DEST_IP) destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed
-m : ($DST_MAC) destination MAC-addr
+ -p : ($DST_PORT) destination PORT range (e.g. 433-444) is also allowed
-t : ($THREADS) threads to start
+ -f : ($F_THREAD) index of first thread (zero indexed CPU number)
-c : ($SKB_CLONE) SKB clones send before alloc new SKB
+ -n : ($COUNT) num messages to send per thread, 0 means indefinitely
-b : ($BURST) HW level bursting of SKBs
-v : ($VERBOSE) verbose
-x : ($DEBUG) debug
+ -6 : ($IP6) IPv6
+ -w : ($DELAY) Tx Delay value (ns)
+ -a : ($APPEND) Script will not reset generator's state, but will append its config
The global variables being set are also listed. E.g. the required
interface/device parameter "-i" sets variable $DEV. Copy the
pktgen_sampleXX scripts and modify them to fit your own needs.
-The old scripts::
-
- pktgen.conf-1-2 # 1 CPU 2 dev
- pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS
- pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6
- pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS
- pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows.
-
Interrupt affinity
===================
@@ -398,7 +396,7 @@ Current commands and configuration options
References:
- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
-- tp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
+- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
Paper from Linux-Kongress in Erlangen 2004.
- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 7db3985359bc..a722eb30e014 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -625,7 +625,7 @@ interfaces of a DSA switch to share the same PHC.
By design, PTP timestamping with a DSA switch does not need any special
handling in the driver for the host port it is attached to. However, when the
host port also supports PTP timestamping, DSA will take care of intercepting
-the ``.ndo_do_ioctl`` calls towards the host port, and block attempts to enable
+the ``.ndo_eth_ioctl`` calls towards the host port, and block attempts to enable
hardware timestamping on it. This is because the SO_TIMESTAMPING API does not
allow the delivery of multiple hardware timestamps for the same packet, so
anybody else except for the DSA switch port must be prevented from doing so.
@@ -688,7 +688,7 @@ ethtool ioctl operations for them need to be mediated by their respective MAC
driver. Therefore, as opposed to DSA switches, modifications need to be done
to each individual MAC driver for PHY timestamping support. This entails:
-- Checking, in ``.ndo_do_ioctl``, whether ``phy_has_hwtstamp(netdev->phydev)``
+- Checking, in ``.ndo_eth_ioctl``, whether ``phy_has_hwtstamp(netdev->phydev)``
is true or not. If it is, then the MAC driver should not process this request
but instead pass it on to the PHY using ``phy_mii_ioctl()``.
@@ -747,7 +747,7 @@ For example, a typical driver design for TX timestamping might be to split the
transmission part into 2 portions:
1. "TX": checks whether PTP timestamping has been previously enabled through
- the ``.ndo_do_ioctl`` ("``priv->hwtstamp_tx_enabled == true``") and the
+ the ``.ndo_eth_ioctl`` ("``priv->hwtstamp_tx_enabled == true``") and the
current skb requires a TX timestamp ("``skb_shinfo(skb)->tx_flags &
SKBTX_HW_TSTAMP``"). If this is true, it sets the
"``skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS``" flag. Note: as
diff --git a/Documentation/networking/vrf.rst b/Documentation/networking/vrf.rst
index 0dde145043bc..0a9a6f968cb9 100644
--- a/Documentation/networking/vrf.rst
+++ b/Documentation/networking/vrf.rst
@@ -144,6 +144,19 @@ default VRF are only handled by a socket not bound to any VRF::
netfilter rules on the VRF device can be used to limit access to services
running in the default VRF context as well.
+Using VRF-aware applications (applications which simultaneously create sockets
+outside and inside VRFs) in conjunction with ``net.ipv4.tcp_l3mdev_accept=1``
+is possible but may lead to problems in some situations. With that sysctl
+value, it is unspecified which listening socket will be selected to handle
+connections for VRF traffic; ie. either a socket bound to the VRF or an unbound
+socket may be used to accept new connections from a VRF. This somewhat
+unexpected behavior can lead to problems if sockets are configured with extra
+options (ex. TCP MD5 keys) with the expectation that VRF traffic will
+exclusively be handled by sockets bound to VRFs, as would be the case with
+``net.ipv4.tcp_l3mdev_accept=0``. Finally and as a reminder, regardless of
+which listening socket is selected, established sockets will be created in the
+VRF based on the ingress interface, as documented earlier.
+
--------------------------------------------------------------------------------
Using iproute2 for VRFs
diff --git a/Documentation/trace/coresight/coresight-config.rst b/Documentation/trace/coresight/coresight-config.rst
new file mode 100644
index 000000000000..a4e3ef295240
--- /dev/null
+++ b/Documentation/trace/coresight/coresight-config.rst
@@ -0,0 +1,244 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+CoreSight System Configuration Manager
+======================================
+
+ :Author: Mike Leach <mike.leach@linaro.org>
+ :Date: October 2020
+
+Introduction
+============
+
+The CoreSight System Configuration manager is an API that allows the
+programming of the CoreSight system with pre-defined configurations that
+can then be easily enabled from sysfs or perf.
+
+Many CoreSight components can be programmed in complex ways - especially ETMs.
+In addition, components can interact across the CoreSight system, often via
+the cross trigger components such as CTI and CTM. These system settings can
+be defined and enabled as named configurations.
+
+
+Basic Concepts
+==============
+
+This section introduces the basic concepts of a CoreSight system configuration.
+
+
+Features
+--------
+
+A feature is a named set of programming for a CoreSight device. The programming
+is device dependent, and can be defined in terms of absolute register values,
+resource usage and parameter values.
+
+The feature is defined using a descriptor. This descriptor is used to load onto
+a matching device, either when the feature is loaded into the system, or when the
+CoreSight device is registered with the configuration manager.
+
+The load process involves interpreting the descriptor into a set of register
+accesses in the driver - the resource usage and parameter descriptions
+translated into appropriate register accesses. This interpretation makes it easy
+and efficient for the feature to be programmed onto the device when required.
+
+The feature will not be active on the device until the feature is enabled, and
+the device itself is enabled. When the device is enabled then enabled features
+will be programmed into the device hardware.
+
+A feature is enabled as part of a configuration being enabled on the system.
+
+
+Parameter Value
+~~~~~~~~~~~~~~~
+
+A parameter value is a named value that may be set by the user prior to the
+feature being enabled that can adjust the behaviour of the operation programmed
+by the feature.
+
+For example, this could be a count value in a programmed operation that repeats
+at a given rate. When the feature is enabled then the current value of the
+parameter is used in programming the device.
+
+The feature descriptor defines a default value for a parameter, which is used
+if the user does not supply a new value.
+
+Users can update parameter values using the configfs API for the CoreSight
+system - which is described below.
+
+The current value of the parameter is loaded into the device when the feature
+is enabled on that device.
+
+
+Configurations
+--------------
+
+A configuration defines a set of features that are to be used in a trace
+session where the configuration is selected. For any trace session only one
+configuration may be selected.
+
+The features defined may be on any type of device that is registered
+to support system configuration. A configuration may select features to be
+enabled on a class of devices - i.e. any ETMv4, or specific devices, e.g. a
+specific CTI on the system.
+
+As with the feature, a descriptor is used to define the configuration.
+This will define the features that must be enabled as part of the configuration
+as well as any preset values that can be used to override default parameter
+values.
+
+
+Preset Values
+~~~~~~~~~~~~~
+
+Preset values are easily selectable sets of parameter values for the features
+that the configuration uses. The number of values in a single preset set, equals
+the sum of parameter values in the features used by the configuration.
+
+e.g. a configuration consists of 3 features, one has 2 parameters, one has
+a single parameter, and another has no parameters. A single preset set will
+therefore have 3 values.
+
+Presets are optionally defined by the configuration, up to 15 can be defined.
+If no preset is selected, then the parameter values defined in the feature
+are used as normal.
+
+
+Operation
+~~~~~~~~~
+
+The following steps take place in the operation of a configuration.
+
+1) In this example, the configuration is 'autofdo', which has an
+ associated feature 'strobing' that works on ETMv4 CoreSight Devices.
+
+2) The configuration is enabled. For example 'perf' may select the
+ configuration as part of its command line::
+
+ perf record -e cs_etm/autofdo/ myapp
+
+ which will enable the 'autofdo' configuration.
+
+3) perf starts tracing on the system. As each ETMv4 that perf uses for
+ trace is enabled, the configuration manager will check if the ETMv4
+ has a feature that relates to the currently active configuration.
+ In this case 'strobing' is enabled & programmed into the ETMv4.
+
+4) When the ETMv4 is disabled, any registers marked as needing to be
+ saved will be read back.
+
+5) At the end of the perf session, the configuration will be disabled.
+
+
+Viewing Configurations and Features
+===================================
+
+The set of configurations and features that are currently loaded into the
+system can be viewed using the configfs API.
+
+Mount configfs as normal and the 'cs-syscfg' subsystem will appear::
+
+ $ ls /config
+ cs-syscfg stp-policy
+
+This has two sub-directories::
+
+ $ cd cs-syscfg/
+ $ ls
+ configurations features
+
+The system has the configuration 'autofdo' built in. It may be examined as
+follows::
+
+ $ cd configurations/
+ $ ls
+ autofdo
+ $ cd autofdo/
+ $ ls
+ description preset1 preset3 preset5 preset7 preset9
+ feature_refs preset2 preset4 preset6 preset8
+ $ cat description
+ Setup ETMs with strobing for autofdo
+ $ cat feature_refs
+ strobing
+
+Each preset declared has a preset<n> subdirectory declared. The values for
+the preset can be examined::
+
+ $ cat preset1/values
+ strobing.window = 0x1388 strobing.period = 0x2
+ $ cat preset2/values
+ strobing.window = 0x1388 strobing.period = 0x4
+
+The features referenced by the configuration can be examined in the features
+directory::
+
+ $ cd ../../features/strobing/
+ $ ls
+ description matches nr_params params
+ $ cat description
+ Generate periodic trace capture windows.
+ parameter 'window': a number of CPU cycles (W)
+ parameter 'period': trace enabled for W cycles every period x W cycles
+ $ cat matches
+ SRC_ETMV4
+ $ cat nr_params
+ 2
+
+Move to the params directory to examine and adjust parameters::
+
+ cd params
+ $ ls
+ period window
+ $ cd period
+ $ ls
+ value
+ $ cat value
+ 0x2710
+ # echo 15000 > value
+ # cat value
+ 0x3a98
+
+Parameters adjusted in this way are reflected in all device instances that have
+loaded the feature.
+
+
+Using Configurations in perf
+============================
+
+The configurations loaded into the CoreSight configuration management are
+also declared in the perf 'cs_etm' event infrastructure so that they can
+be selected when running trace under perf::
+
+ $ ls /sys/devices/cs_etm
+ configurations format perf_event_mux_interval_ms sinks type
+ events nr_addr_filters power
+
+Key directories here are 'configurations' - which lists the loaded
+configurations, and 'events' - a generic perf directory which allows
+selection on the perf command line.::
+
+ $ ls configurations/
+ autofdo
+ $ cat configurations/autofdo
+ 0xa7c3dddd
+
+As with the sinks entries, this provides a hash of the configuration name.
+The entry in the 'events' directory uses perfs built in syntax generator
+to substitute the syntax for the name when evaluating the command::
+
+ $ ls events/
+ autofdo
+ $ cat events/autofdo
+ configid=0xa7c3dddd
+
+The 'autofdo' configuration may be selected on the perf command line::
+
+ $ perf record -e cs_etm/autofdo/u --per-thread <application>
+
+A preset to override the current parameter values can also be selected::
+
+ $ perf record -e cs_etm/autofdo,preset=1/u --per-thread <application>
+
+When configurations are selected in this way, then the trace sink used is
+automatically selected.
diff --git a/Documentation/trace/coresight/coresight.rst b/Documentation/trace/coresight/coresight.rst
index 1ec8dc35b1d8..a15571d96cc8 100644
--- a/Documentation/trace/coresight/coresight.rst
+++ b/Documentation/trace/coresight/coresight.rst
@@ -620,6 +620,19 @@ channels on the CTM (Cross Trigger Matrix).
A separate documentation file is provided to explain the use of these devices.
(Documentation/trace/coresight/coresight-ect.rst) [#fourth]_.
+CoreSight System Configuration
+------------------------------
+
+CoreSight components can be complex devices with many programming options.
+Furthermore, components can be programmed to interact with each other across the
+complete system.
+
+A CoreSight System Configuration manager is provided to allow these complex programming
+configurations to be selected and used easily from perf and sysfs.
+
+See the separate document for further information.
+(Documentation/trace/coresight/coresight-config.rst) [#fifth]_.
+
.. [#first] Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
@@ -628,3 +641,5 @@ A separate documentation file is provided to explain the use of these devices.
.. [#third] https://github.com/Linaro/perf-opencsd
.. [#fourth] Documentation/trace/coresight/coresight-ect.rst
+
+.. [#fifth] Documentation/trace/coresight/coresight-config.rst
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index cfc81e98e0b8..4e5b26f03d5b 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -2762,7 +2762,7 @@ listed in:
put_prev_task_idle
kmem_cache_create
pick_next_task_rt
- get_online_cpus
+ cpus_read_lock
pick_next_task_fair
mutex_lock
[...]
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 1409e40e6345..b7070d76f076 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -160,7 +160,6 @@ Code Seq# Include File Comments
'K' all linux/kd.h
'L' 00-1F linux/loop.h conflict!
'L' 10-1F drivers/scsi/mpt3sas/mpt3sas_ctl.h conflict!
-'L' 20-2F linux/lightnvm.h
'L' E0-FF linux/ppdd.h encrypted disk device driver
<http://linux01.gwdg.de/~alatham/ppdd.html>
'M' all linux/soundcard.h conflict!
diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
index d61219889e49..539e9d4a4860 100644
--- a/Documentation/userspace-api/seccomp_filter.rst
+++ b/Documentation/userspace-api/seccomp_filter.rst
@@ -263,7 +263,7 @@ Userspace can also add file descriptors to the notifying process via
``ioctl(SECCOMP_IOCTL_NOTIF_ADDFD)``. The ``id`` member of
``struct seccomp_notif_addfd`` should be the same ``id`` as in
``struct seccomp_notif``. The ``newfd_flags`` flag may be used to set flags
-like O_EXEC on the file descriptor in the notifying process. If the supervisor
+like O_CLOEXEC on the file descriptor in the notifying process. If the supervisor
wants to inject the file descriptor with a specific number, the
``SECCOMP_ADDFD_FLAG_SETFD`` flag can be used, and set the ``newfd`` member to
the specific number to use. If that file descriptor is already open in the
diff --git a/Documentation/userspace-api/spec_ctrl.rst b/Documentation/userspace-api/spec_ctrl.rst
index 7ddd8f667459..5e8ed9eef9aa 100644
--- a/Documentation/userspace-api/spec_ctrl.rst
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -106,3 +106,11 @@ Speculation misfeature controls
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_L1D_FLUSH: Flush L1D Cache on context switch out of the task
+ (works only when tasks run on non SMT cores)
+
+ Invocations:
+ * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, PR_SPEC_ENABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, PR_SPEC_DISABLE, 0, 0);
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 35eca377543d..88fa495abbac 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -25,10 +25,10 @@ On x86:
- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
-- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is
- taken inside kvm->arch.mmu_lock, and cannot be taken without already
- holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
- there's no need to take kvm->arch.tdp_mmu_pages_lock at all).
+- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock and
+ kvm->arch.mmu_unsync_pages_lock are taken inside kvm->arch.mmu_lock, and
+ cannot be taken without already holding kvm->arch.mmu_lock (typically with
+ ``read_lock`` for the TDP MMU, thus the need for additional spinlocks).
Everything else is a leaf: no other lock is taken inside the critical
sections.
diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
index 5f62b3b86357..ccb7e86bf8d9 100644
--- a/Documentation/x86/x86_64/boot-options.rst
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -126,7 +126,7 @@ Idle loop
Rebooting
=========
- reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
+ reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] | p[ci] [, [w]arm | [c]old]
bios
Use the CPU reboot vector for warm reset
warm
@@ -145,6 +145,8 @@ Rebooting
Use efi reset_system runtime service. If EFI is not configured or
the EFI reset does not work, the reboot path attempts the reset using
the keyboard controller.
+ pci
+ Use a write to the PCI config space register 0xcf9 to trigger reboot.
Using warm reset will be much faster especially on big memory
systems because the BIOS will not go through the memory check.
@@ -155,6 +157,13 @@ Rebooting
Don't stop other CPUs on reboot. This can make reboot more reliable
in some cases.
+ reboot=default
+ There are some built-in platform specific "quirks" - you may see:
+ "reboot: <name> series board detected. Selecting <type> for reboots."
+ In the case where you think the quirk is in error (e.g. you have
+ newer BIOS, or newer board) using this option will ignore the built-in
+ quirk table, and use the generic default reboot actions.
+
Non Executable Mappings
=======================