Diffstat (limited to 'Documentation')
-rw-r--r--  Documentation/ABI/testing/debugfs-cxl | 87
-rw-r--r--  Documentation/ABI/testing/debugfs-driver-qat_telemetry | 27
-rw-r--r--  Documentation/RCU/Design/Requirements/Requirements.rst | 52
-rw-r--r--  Documentation/RCU/RTFP.txt | 6
-rw-r--r--  Documentation/RCU/checklist.rst | 27
-rw-r--r--  Documentation/RCU/index.rst | 6
-rw-r--r--  Documentation/RCU/torture.rst | 4
-rw-r--r--  Documentation/RCU/whatisRCU.rst | 150
-rw-r--r--  Documentation/arch/x86/tdx.rst | 14
-rw-r--r--  Documentation/crypto/api-aead.rst | 3
-rw-r--r--  Documentation/crypto/api-akcipher.rst | 3
-rw-r--r--  Documentation/crypto/api-digest.rst | 3
-rw-r--r--  Documentation/crypto/api-kpp.rst | 3
-rw-r--r--  Documentation/crypto/api-rng.rst | 3
-rw-r--r--  Documentation/crypto/api-sig.rst | 3
-rw-r--r--  Documentation/crypto/api-skcipher.rst | 3
-rw-r--r--  Documentation/devicetree/bindings/clock/riscv,rpmi-clock.yaml | 64
-rw-r--r--  Documentation/devicetree/bindings/clock/riscv,rpmi-mpxy-clock.yaml | 64
-rw-r--r--  Documentation/devicetree/bindings/crypto/ti,am62l-dthev2.yaml | 50
-rw-r--r--  Documentation/devicetree/bindings/crypto/xlnx,versal-trng.yaml | 35
-rw-r--r--  Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-mpxy-system-msi.yaml | 67
-rw-r--r--  Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-system-msi.yaml | 74
-rw-r--r--  Documentation/devicetree/bindings/mailbox/riscv,rpmi-shmem-mbox.yaml | 124
-rw-r--r--  Documentation/devicetree/bindings/mailbox/riscv,sbi-mpxy-mbox.yaml | 51
-rw-r--r--  Documentation/devicetree/bindings/rng/hisi-rng.txt | 12
-rw-r--r--  Documentation/devicetree/bindings/rng/hisi-rng.yaml | 32
-rw-r--r--  Documentation/driver-api/cxl/conventions.rst | 135
-rw-r--r--  Documentation/driver-api/cxl/maturity-map.rst | 2
-rw-r--r--  Documentation/driver-api/cxl/platform/bios-and-efi.rst | 2
-rw-r--r--  Documentation/filesystems/proc.rst | 14
-rw-r--r--  Documentation/virt/kvm/api.rst | 9
31 files changed, 1032 insertions, 97 deletions
diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
index e95e21f131e9..2989d4da96c1 100644
--- a/Documentation/ABI/testing/debugfs-cxl
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -19,6 +19,20 @@ Description:
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+ poison injection can result in permanent data loss. Injected
+ poison may render data permanently inaccessible even after
+ clearing, as the clear operation writes zeros and does not
+ recover original data.
+
+ SYSTEM STABILITY RISK: For volatile memory, poison injection
+ can cause kernel crashes, system instability, or unpredictable
+ behavior if the poisoned addresses are accessed by running code
+ or critical kernel structures.
What: /sys/kernel/debug/cxl/memX/clear_poison
Date: April, 2023
@@ -35,6 +49,79 @@ Description:
The clear_poison attribute is only visible for devices
supporting the capability.
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+ specified address range and removes the address from the poison
+ list. It does NOT recover or restore original data that may have
+ been present before poison injection. Any original data at the
+ cleared address is permanently lost and replaced with zeros.
+
+ CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+ purposes only and should not be used as a data repair tool.
+ Clearing poison is fundamentally different from data recovery
+ or error correction.
+
+What: /sys/kernel/debug/cxl/regionX/inject_poison
+Date: August, 2025
+Contact: linux-cxl@vger.kernel.org
+Description:
+ (WO) When a Host Physical Address (HPA) is written to this
+ attribute, the region driver translates it to a Device
+ Physical Address (DPA) and identifies the corresponding
+ memdev. It then sends an inject poison command to that memdev
+ at the translated DPA. Refer to the memdev ABI entry at:
+ /sys/kernel/debug/cxl/memX/inject_poison for the detailed
+ behavior. This attribute is only visible if all memdevs
+ participating in the region support both inject and clear
+ poison commands.
+
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
+ poison injection can result in permanent data loss. Injected
+ poison may render data permanently inaccessible even after
+ clearing, as the clear operation writes zeros and does not
+ recover original data.
+
+ SYSTEM STABILITY RISK: For volatile memory, poison injection
+ can cause kernel crashes, system instability, or unpredictable
+ behavior if the poisoned addresses are accessed by running code
+ or critical kernel structures.
+
+What: /sys/kernel/debug/cxl/regionX/clear_poison
+Date: August, 2025
+Contact: linux-cxl@vger.kernel.org
+Description:
+ (WO) When a Host Physical Address (HPA) is written to this
+ attribute, the region driver translates it to a Device
+ Physical Address (DPA) and identifies the corresponding
+ memdev. It then sends a clear poison command to that memdev
+ at the translated DPA. Refer to the memdev ABI entry at:
+ /sys/kernel/debug/cxl/memX/clear_poison for the detailed
+ behavior. This attribute is only visible if all memdevs
+ participating in the region support both inject and clear
+ poison commands.
+
+ TEST-ONLY INTERFACE: This interface is intended for testing
+ and validation purposes only. It is not a data repair mechanism
+ and should never be used on production systems or live data.
+
+ CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
+ specified address range and removes the address from the poison
+ list. It does NOT recover or restore original data that may have
+ been present before poison injection. Any original data at the
+ cleared address is permanently lost and replaced with zeros.
+
+ CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
+ purposes only and should not be used as a data repair tool.
+ Clearing poison is fundamentally different from data recovery
+ or error correction.
+
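
For testing only, a hedged C sketch of driving the regionX interface described
above; the region name and HPA value below are placeholders, not values taken
from this patch::

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            /* Hypothetical region and HPA; substitute values valid for your system. */
            const char *path = "/sys/kernel/debug/cxl/region0/inject_poison";
            unsigned long long hpa = 0x2080000000ULL;
            int fd = open(path, O_WRONLY);

            if (fd < 0)
                    return 1;
            dprintf(fd, "%#llx\n", hpa);    /* write the HPA as a hex string */
            close(fd);
            return 0;
    }
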
What: /sys/kernel/debug/cxl/einj_types
Date: January, 2024
KernelVersion: v6.9
diff --git a/Documentation/ABI/testing/debugfs-driver-qat_telemetry b/Documentation/ABI/testing/debugfs-driver-qat_telemetry
index 0dfd8d97e169..06097ee0f154 100644
--- a/Documentation/ABI/testing/debugfs-driver-qat_telemetry
+++ b/Documentation/ABI/testing/debugfs-driver-qat_telemetry
@@ -57,6 +57,7 @@ Description: (RO) Reports device telemetry counters.
gp_lat_acc_avg average get to put latency [ns]
bw_in PCIe, write bandwidth [Mbps]
bw_out PCIe, read bandwidth [Mbps]
+ re_acc_avg average ring empty time [ns]
at_page_req_lat_avg Address Translator(AT), average page
request latency [ns]
at_trans_lat_avg AT, average page translation latency [ns]
@@ -85,6 +86,32 @@ Description: (RO) Reports device telemetry counters.
exec_cph<N> execution count of Cipher slice N
util_ath<N> utilization of Authentication slice N [%]
exec_ath<N> execution count of Authentication slice N
+ cmdq_wait_cnv<N> wait time for cmdq N to get Compression and verify
+ slice ownership
+ cmdq_exec_cnv<N> Compression and verify slice execution time while
+ owned by cmdq N
+ cmdq_drain_cnv<N> time taken for cmdq N to release Compression and
+ verify slice ownership
+ cmdq_wait_dcprz<N> wait time for cmdq N to get Decompression
+ slice N ownership
+ cmdq_exec_dcprz<N> Decompression slice execution time while
+ owned by cmdq N
+ cmdq_drain_dcprz<N> time taken for cmdq N to release Decompression
+ slice ownership
+ cmdq_wait_pke<N> wait time for cmdq N to get PKE slice ownership
+ cmdq_exec_pke<N> PKE slice execution time while owned by cmdq N
+ cmdq_drain_pke<N> time taken for cmdq N to release PKE slice
+ ownership
+ cmdq_wait_ucs<N> wait time for cmdq N to get UCS slice ownership
+ cmdq_exec_ucs<N> UCS slice execution time while owned by cmdq N
+ cmdq_drain_ucs<N> time taken for cmdq N to release UCS slice
+ ownership
+ cmdq_wait_ath<N> wait time for cmdq N to get Authentication slice
+ ownership
+ cmdq_exec_ath<N> Authentication slice execution time while owned
+ by cmdq N
+ cmdq_drain_ath<N> time taken for cmdq N to release Authentication
+ slice ownership
======================= ========================================
The telemetry report file can be read with the following command::
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index b0395540296b..f24b3c0b9b0d 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1973,9 +1973,7 @@ code, and the FQS loop, all of which refer to or modify this bookkeeping.
Note that grace period initialization (rcu_gp_init()) must carefully sequence
CPU hotplug scanning with grace period state changes. For example, the
following race could occur in rcu_gp_init() if rcu_seq_start() were to happen
-after the CPU hotplug scanning.
-
-.. code-block:: none
+after the CPU hotplug scanning::
CPU0 (rcu_gp_init) CPU1 CPU2
--------------------- ---- ----
@@ -2008,22 +2006,22 @@ after the CPU hotplug scanning.
kfree(r1);
r2 = *r0; // USE-AFTER-FREE!
-By incrementing gp_seq first, CPU1's RCU read-side critical section
+By incrementing ``gp_seq`` first, CPU1's RCU read-side critical section
is guaranteed to not be missed by CPU2.
-**Concurrent Quiescent State Reporting for Offline CPUs**
+Concurrent Quiescent State Reporting for Offline CPUs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RCU must ensure that CPUs going offline report quiescent states to avoid
blocking grace periods. This requires careful synchronization to handle
race conditions.
-**Race condition causing Offline CPU to hang GP**
-
-A race between CPU offlining and new GP initialization (gp_init) may occur
-because `rcu_report_qs_rnp()` in `rcutree_report_cpu_dead()` must temporarily
-release the `rcu_node` lock to wake the RCU grace-period kthread:
+Race condition causing Offline CPU to hang GP
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-.. code-block:: none
+A race between CPU offlining and new GP initialization (gp_init()) may occur
+because rcu_report_qs_rnp() in rcutree_report_cpu_dead() must temporarily
+release the ``rcu_node`` lock to wake the RCU grace-period kthread::
CPU1 (going offline) CPU0 (GP kthread)
-------------------- -----------------
@@ -2044,15 +2042,14 @@ release the `rcu_node` lock to wake the RCU grace-period kthread:
// Reacquire lock (but too late)
rnp->qsmaskinitnext &= ~mask // Finally clears bit
-Without `ofl_lock`, the new grace period includes the offline CPU and waits
+Without ``ofl_lock``, the new grace period includes the offline CPU and waits
forever for its quiescent state causing a GP hang.
-**A solution with ofl_lock**
+A solution with ofl_lock
+^^^^^^^^^^^^^^^^^^^^^^^^
-The `ofl_lock` (offline lock) prevents `rcu_gp_init()` from running during
-the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
-
-.. code-block:: none
+The ``ofl_lock`` (offline lock) prevents rcu_gp_init() from running during
+the vulnerable window when rcu_report_qs_rnp() has released ``rnp->lock``::
CPU0 (rcu_gp_init) CPU1 (rcutree_report_cpu_dead)
------------------ ------------------------------
@@ -2065,21 +2062,20 @@ the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
arch_spin_unlock(&ofl_lock) ---> // Now CPU1 can proceed
} // But snapshot already taken
-**Another race causing GP hangs in rcu_gpu_init(): Reporting QS for Now-offline CPUs**
+Another race causing GP hangs in rcu_gp_init(): Reporting QS for Now-offline CPUs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
After the first loop takes an atomic snapshot of online CPUs, as shown above,
-the second loop in `rcu_gp_init()` detects CPUs that went offline between
-releasing `ofl_lock` and acquiring the per-node `rnp->lock`. This detection is
-crucial because:
+the second loop in rcu_gp_init() detects CPUs that went offline between
+releasing ``ofl_lock`` and acquiring the per-node ``rnp->lock``.
+This detection is crucial because:
1. The CPU might have gone offline after the snapshot but before the second loop
2. The offline CPU cannot report its own QS if it's already dead
3. Without this detection, the grace period would wait forever for CPUs that
are now offline.
-The second loop performs this detection safely:
-
-.. code-block:: none
+The second loop performs this detection safely::
rcu_for_each_node_breadth_first(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -2093,10 +2089,10 @@ The second loop performs this detection safely:
}
This approach ensures atomicity: quiescent state reporting for offline CPUs
-happens either in `rcu_gp_init()` (second loop) or in `rcutree_report_cpu_dead()`,
-never both and never neither. The `rnp->lock` held throughout the sequence
-prevents races - `rcutree_report_cpu_dead()` also acquires this lock when
-clearing `qsmaskinitnext`, ensuring mutual exclusion.
+happens either in rcu_gp_init() (second loop) or in rcutree_report_cpu_dead(),
+never both and never neither. The ``rnp->lock`` held throughout the sequence
+prevents races - rcutree_report_cpu_dead() also acquires this lock when
+clearing ``qsmaskinitnext``, ensuring mutual exclusion.
Scheduler and RCU
~~~~~~~~~~~~~~~~~
diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt
index db8f16b392aa..8d4e8de4c460 100644
--- a/Documentation/RCU/RTFP.txt
+++ b/Documentation/RCU/RTFP.txt
@@ -641,7 +641,7 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,Month="July"
,Year="2001"
,note="Available:
-\url{http://www.linuxsymposium.org/2001/abstracts/readcopy.php}
+\url{https://kernel.org/doc/ols/2001/read-copy.pdf}
\url{http://www.rdrop.com/users/paulmck/RCU/rclock_OLS.2001.05.01c.pdf}
[Viewed June 23, 2004]"
,annotation={
@@ -1480,7 +1480,7 @@ Suparna Bhattacharya"
,Year="2006"
,pages="v2 123-138"
,note="Available:
-\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
+\url{https://kernel.org/doc/ols/2006/ols2006v2-pages-131-146.pdf}
\url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf}
[Viewed January 1, 2007]"
,annotation={
@@ -1511,7 +1511,7 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen"
,Year="2006"
,pages="v2 249-254"
,note="Available:
-\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
+\url{https://kernel.org/doc/ols/2006/ols2006v2-pages-249-262.pdf}
[Viewed January 11, 2009]"
,annotation={
Uses RCU-protected radix tree for a lockless page cache.
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index 7de3e308f330..c9bfb2b218e5 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -69,7 +69,13 @@ over a rather long period of time, but improvements are always welcome!
Explicit disabling of preemption (preempt_disable(), for example)
can serve as rcu_read_lock_sched(), but is less readable and
prevents lockdep from detecting locking issues. Acquiring a
- spinlock also enters an RCU read-side critical section.
+ raw spinlock also enters an RCU read-side critical section.
+
+ The guard(rcu)() and scoped_guard(rcu) primitives designate
+ the remainder of the current scope or the next statement,
+ respectively, as the RCU read-side critical section. Use of
+ these guards can be less error-prone than rcu_read_lock(),
+ rcu_read_unlock(), and friends.
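
As a minimal sketch (not taken from the kernel sources) of the two styles,
assuming a hypothetical RCU-protected pointer ``gp``::

    struct foo { int a; };
    struct foo __rcu *gp;

    /* Classic style: explicit lock/unlock pairing. */
    static int read_a_classic(void)
    {
            struct foo *p;
            int ret = -1;

            rcu_read_lock();
            p = rcu_dereference(gp);
            if (p)
                    ret = p->a;
            rcu_read_unlock();
            return ret;
    }

    /* Guard style: the read-side critical section ends with the scope. */
    static int read_a_guarded(void)
    {
            guard(rcu)();
            struct foo *p = rcu_dereference(gp);

            return p ? p->a : -1;
    }
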
Please note that you *cannot* rely on code known to be built
only in non-preemptible kernels. Such code can and will break,
@@ -405,9 +411,11 @@ over a rather long period of time, but improvements are always welcome!
13. Unlike most flavors of RCU, it *is* permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
- Please note that if you don't need to sleep in read-side critical
- sections, you should be using RCU rather than SRCU, because RCU
- is almost always faster and easier to use than is SRCU.
+ As with RCU, guard(srcu)() and scoped_guard(srcu) forms are
+ available, and often provide greater ease of use. Please note
+ that if you don't need to sleep in read-side critical sections,
+ you should be using RCU rather than SRCU, because RCU is almost
+ always faster and easier to use than is SRCU.
Also unlike other forms of RCU, explicit initialization and
cleanup is required either at build time via DEFINE_SRCU()
@@ -443,10 +451,13 @@ over a rather long period of time, but improvements are always welcome!
real-time workloads than is synchronize_rcu_expedited().
It is also permissible to sleep in RCU Tasks Trace read-side
- critical section, which are delimited by rcu_read_lock_trace() and
- rcu_read_unlock_trace(). However, this is a specialized flavor
- of RCU, and you should not use it without first checking with
- its current users. In most cases, you should instead use SRCU.
+ critical section, which are delimited by rcu_read_lock_trace()
+ and rcu_read_unlock_trace(). However, this is a specialized
+ flavor of RCU, and you should not use it without first checking
+ with its current users. In most cases, you should instead
+ use SRCU. As with RCU and SRCU, guard(rcu_tasks_trace)() and
+ scoped_guard(rcu_tasks_trace) are available, and often provide
+ greater ease of use.
Note that rcu_assign_pointer() relates to SRCU just as it does to
other forms of RCU, but instead of rcu_dereference() you should
diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst
index 84a79903f6a8..ef26c78507d3 100644
--- a/Documentation/RCU/index.rst
+++ b/Documentation/RCU/index.rst
@@ -1,13 +1,13 @@
.. SPDX-License-Identifier: GPL-2.0
-.. _rcu_concepts:
+.. _rcu_handbook:
============
-RCU concepts
+RCU Handbook
============
.. toctree::
- :maxdepth: 3
+ :maxdepth: 2
checklist
lockdep
diff --git a/Documentation/RCU/torture.rst b/Documentation/RCU/torture.rst
index 4b1f99c4181f..1ad5cc793811 100644
--- a/Documentation/RCU/torture.rst
+++ b/Documentation/RCU/torture.rst
@@ -344,7 +344,7 @@ painstaking and error-prone.
And this is why the kvm-remote.sh script exists.
-If you the following command works::
+If the following command works::
ssh system0 date
@@ -364,7 +364,7 @@ systems must come first.
The kvm.sh ``--dryrun scenarios`` argument is useful for working out
how many scenarios may be run in one batch across a group of systems.
-You can also re-run a previous remote run in a manner similar to kvm.sh:
+You can also re-run a previous remote run in a manner similar to kvm.sh::
kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28-remote \
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index be2eb6be16ec..cf0b0ac9f463 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -1021,32 +1021,41 @@ RCU list traversal::
list_entry_rcu
list_entry_lockless
list_first_entry_rcu
+ list_first_or_null_rcu
+ list_tail_rcu
list_next_rcu
+ list_next_or_null_rcu
list_for_each_entry_rcu
list_for_each_entry_continue_rcu
list_for_each_entry_from_rcu
- list_first_or_null_rcu
- list_next_or_null_rcu
+ list_for_each_entry_lockless
hlist_first_rcu
hlist_next_rcu
hlist_pprev_rcu
hlist_for_each_entry_rcu
+ hlist_for_each_entry_rcu_notrace
hlist_for_each_entry_rcu_bh
hlist_for_each_entry_from_rcu
hlist_for_each_entry_continue_rcu
hlist_for_each_entry_continue_rcu_bh
hlist_nulls_first_rcu
+ hlist_nulls_next_rcu
hlist_nulls_for_each_entry_rcu
+ hlist_nulls_for_each_entry_safe
hlist_bl_first_rcu
hlist_bl_for_each_entry_rcu
RCU pointer/list update::
rcu_assign_pointer
+ rcu_replace_pointer
+ INIT_LIST_HEAD_RCU
list_add_rcu
list_add_tail_rcu
list_del_rcu
list_replace_rcu
+ list_splice_init_rcu
+ list_splice_tail_init_rcu
hlist_add_behind_rcu
hlist_add_before_rcu
hlist_add_head_rcu
@@ -1054,34 +1063,53 @@ RCU pointer/list update::
hlist_del_rcu
hlist_del_init_rcu
hlist_replace_rcu
- list_splice_init_rcu
- list_splice_tail_init_rcu
hlist_nulls_del_init_rcu
hlist_nulls_del_rcu
hlist_nulls_add_head_rcu
+ hlist_nulls_add_tail_rcu
+ hlist_nulls_add_fake
+ hlists_swap_heads_rcu
hlist_bl_add_head_rcu
- hlist_bl_del_init_rcu
hlist_bl_del_rcu
hlist_bl_set_first_rcu
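
A sketch, for illustration only, combining a few of the above primitives on a
hypothetical list ``head`` whose updaters serialize on ``list_lock``::

    struct entry {
            int key;
            struct list_head node;
    };

    static LIST_HEAD(head);
    static DEFINE_SPINLOCK(list_lock);

    /* Updater: publish a new entry under the lock. */
    static void add_entry(struct entry *e)
    {
            spin_lock(&list_lock);
            list_add_rcu(&e->node, &head);
            spin_unlock(&list_lock);
    }

    /* Reader: traverse locklessly within an RCU read-side critical section. */
    static bool key_present(int key)
    {
            struct entry *e;

            guard(rcu)();
            list_for_each_entry_rcu(e, &head, node)
                    if (e->key == key)
                            return true;
            return false;
    }
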
RCU::
- Critical sections Grace period Barrier
-
- rcu_read_lock synchronize_net rcu_barrier
- rcu_read_unlock synchronize_rcu
- rcu_dereference synchronize_rcu_expedited
- rcu_read_lock_held call_rcu
- rcu_dereference_check kfree_rcu
- rcu_dereference_protected
+ Critical sections Grace period Barrier
+
+ rcu_read_lock synchronize_net rcu_barrier
+ rcu_read_unlock synchronize_rcu
+ guard(rcu)() synchronize_rcu_expedited
+ scoped_guard(rcu) synchronize_rcu_mult
+ rcu_dereference call_rcu
+ rcu_dereference_check call_rcu_hurry
+ rcu_dereference_protected kfree_rcu
+ rcu_read_lock_held kvfree_rcu
+ rcu_read_lock_any_held kfree_rcu_mightsleep
+ rcu_pointer_handoff cond_synchronize_rcu
+ unrcu_pointer cond_synchronize_rcu_full
+ cond_synchronize_rcu_expedited
+ cond_synchronize_rcu_expedited_full
+ get_completed_synchronize_rcu
+ get_completed_synchronize_rcu_full
+ get_state_synchronize_rcu
+ get_state_synchronize_rcu_full
+ poll_state_synchronize_rcu
+ poll_state_synchronize_rcu_full
+ same_state_synchronize_rcu
+ same_state_synchronize_rcu_full
+ start_poll_synchronize_rcu
+ start_poll_synchronize_rcu_full
+ start_poll_synchronize_rcu_expedited
+ start_poll_synchronize_rcu_expedited_full
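
One hedged sketch of the polled grace-period interfaces listed above, assuming
a hypothetical ``struct item`` reachable from an RCU-protected list::

    struct item {
            struct list_head node;
            /* ... payload ... */
    };

    static void retire_item(struct item *it)
    {
            unsigned long cookie;

            list_del_rcu(&it->node);                /* unpublish the item */
            cookie = start_poll_synchronize_rcu();  /* start a grace period */

            /* ... other work may overlap with the grace period ... */

            if (!poll_state_synchronize_rcu(cookie))
                    synchronize_rcu();              /* not yet elapsed: wait */
            kfree(it);                              /* no readers can still hold it */
    }
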
bh::
Critical sections Grace period Barrier
- rcu_read_lock_bh call_rcu rcu_barrier
- rcu_read_unlock_bh synchronize_rcu
- [local_bh_disable] synchronize_rcu_expedited
+ rcu_read_lock_bh [Same as RCU] [Same as RCU]
+ rcu_read_unlock_bh
+ [local_bh_disable]
[and friends]
rcu_dereference_bh
rcu_dereference_bh_check
@@ -1092,9 +1120,9 @@ sched::
Critical sections Grace period Barrier
- rcu_read_lock_sched call_rcu rcu_barrier
- rcu_read_unlock_sched synchronize_rcu
- [preempt_disable] synchronize_rcu_expedited
+ rcu_read_lock_sched [Same as RCU] [Same as RCU]
+ rcu_read_unlock_sched
+ [preempt_disable]
[and friends]
rcu_read_lock_sched_notrace
rcu_read_unlock_sched_notrace
@@ -1104,46 +1132,104 @@ sched::
rcu_read_lock_sched_held
+RCU: Initialization/cleanup/ordering::
+
+ RCU_INIT_POINTER
+ RCU_INITIALIZER
+ RCU_POINTER_INITIALIZER
+ init_rcu_head
+ destroy_rcu_head
+ init_rcu_head_on_stack
+ destroy_rcu_head_on_stack
+ SLAB_TYPESAFE_BY_RCU
+
+
+RCU: Quiescent states and control::
+
+ cond_resched_tasks_rcu_qs
+ rcu_all_qs
+ rcu_softirq_qs_periodic
+ rcu_end_inkernel_boot
+ rcu_expedite_gp
+ rcu_gp_is_expedited
+ rcu_unexpedite_gp
+ rcu_cpu_stall_reset
+ rcu_head_after_call_rcu
+ rcu_is_watching
+
+
+RCU-sync primitive::
+
+ rcu_sync_is_idle
+ rcu_sync_init
+ rcu_sync_enter
+ rcu_sync_exit
+ rcu_sync_dtor
+
+
RCU-Tasks::
- Critical sections Grace period Barrier
+ Critical sections Grace period Barrier
- N/A call_rcu_tasks rcu_barrier_tasks
+ N/A call_rcu_tasks rcu_barrier_tasks
synchronize_rcu_tasks
RCU-Tasks-Rude::
- Critical sections Grace period Barrier
+ Critical sections Grace period Barrier
- N/A N/A
- synchronize_rcu_tasks_rude
+ N/A synchronize_rcu_tasks_rude rcu_barrier_tasks_rude
+ call_rcu_tasks_rude
RCU-Tasks-Trace::
- Critical sections Grace period Barrier
+ Critical sections Grace period Barrier
- rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
+ rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
rcu_read_unlock_trace synchronize_rcu_tasks_trace
+ guard(rcu_tasks_trace)()
+ scoped_guard(rcu_tasks_trace)
-SRCU::
+SRCU list traversal::
+ list_for_each_entry_srcu
+ hlist_for_each_entry_srcu
- Critical sections Grace period Barrier
- srcu_read_lock call_srcu srcu_barrier
- srcu_read_unlock synchronize_srcu
- srcu_dereference synchronize_srcu_expedited
+SRCU::
+
+ Critical sections Grace period Barrier
+
+ srcu_read_lock call_srcu srcu_barrier
+ srcu_read_unlock synchronize_srcu
+ srcu_read_lock_fast synchronize_srcu_expedited
+ srcu_read_unlock_fast get_state_synchronize_srcu
+ srcu_read_lock_nmisafe start_poll_synchronize_srcu
+ srcu_read_unlock_nmisafe start_poll_synchronize_srcu_expedited
+ srcu_read_lock_notrace poll_state_synchronize_srcu
+ srcu_read_unlock_notrace
+ srcu_down_read
+ srcu_up_read
+ srcu_down_read_fast
+ srcu_up_read_fast
+ guard(srcu)()
+ scoped_guard(srcu)
+ srcu_read_lock_held
+ srcu_dereference
srcu_dereference_check
+ srcu_dereference_notrace
srcu_read_lock_held
-SRCU: Initialization/cleanup::
+
+SRCU: Initialization/cleanup/ordering::
DEFINE_SRCU
DEFINE_STATIC_SRCU
init_srcu_struct
cleanup_srcu_struct
+ smp_mb__after_srcu_read_unlock
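
A brief sketch of the core SRCU calls above, using a hypothetical ``my_srcu``
domain and ``cur_config`` pointer, with updaters assumed to be serialized
externally::

    DEFINE_STATIC_SRCU(my_srcu);

    struct config { int value; };
    static struct config __rcu *cur_config;

    static int read_value(void)
    {
            struct config *c;
            int idx, v = -1;

            idx = srcu_read_lock(&my_srcu);         /* sleeping is permitted here */
            c = srcu_dereference(cur_config, &my_srcu);
            if (c)
                    v = c->value;
            srcu_read_unlock(&my_srcu, idx);
            return v;
    }

    static void update_value(struct config *newc)
    {
            struct config *old;

            /* true: updaters are serialized by some external mechanism. */
            old = rcu_replace_pointer(cur_config, newc, true);
            synchronize_srcu(&my_srcu);             /* wait for pre-existing readers */
            kfree(old);
    }
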
All: lockdep-checked RCU utility APIs::
diff --git a/Documentation/arch/x86/tdx.rst b/Documentation/arch/x86/tdx.rst
index 719043cd8b46..61670e7df2f7 100644
--- a/Documentation/arch/x86/tdx.rst
+++ b/Documentation/arch/x86/tdx.rst
@@ -142,13 +142,6 @@ but depends on the BIOS to behave correctly.
Note TDX works with CPU logical online/offline, thus the kernel still
allows to offline logical CPU and online it again.
-Kexec()
-~~~~~~~
-
-TDX host support currently lacks the ability to handle kexec. For
-simplicity only one of them can be enabled in the Kconfig. This will be
-fixed in the future.
-
Erratum
~~~~~~~
@@ -171,6 +164,13 @@ If the platform has such erratum, the kernel prints additional message in
machine check handler to tell user the machine check may be caused by
kernel bug on TDX private memory.
+Kexec
+~~~~~
+
+Currently kexec doesn't work on the TDX platforms with the aforementioned
+erratum. It fails when loading the kexec kernel image. Otherwise it
+works normally.
+
Interaction vs S3 and deeper states
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/Documentation/crypto/api-aead.rst b/Documentation/crypto/api-aead.rst
index d15256f1ae36..78d073319f96 100644
--- a/Documentation/crypto/api-aead.rst
+++ b/Documentation/crypto/api-aead.rst
@@ -1,3 +1,6 @@
+Authenticated Encryption With Associated Data (AEAD)
+====================================================
+
Authenticated Encryption With Associated Data (AEAD) Algorithm Definitions
--------------------------------------------------------------------------
diff --git a/Documentation/crypto/api-akcipher.rst b/Documentation/crypto/api-akcipher.rst
index ca1ecdd4a7d3..a31f5aef7667 100644
--- a/Documentation/crypto/api-akcipher.rst
+++ b/Documentation/crypto/api-akcipher.rst
@@ -1,3 +1,6 @@
+Asymmetric Cipher
+=================
+
Asymmetric Cipher Algorithm Definitions
---------------------------------------
diff --git a/Documentation/crypto/api-digest.rst b/Documentation/crypto/api-digest.rst
index 7a1e670d6ce1..02a2bcc26a64 100644
--- a/Documentation/crypto/api-digest.rst
+++ b/Documentation/crypto/api-digest.rst
@@ -1,3 +1,6 @@
+Message Digest
+==============
+
Message Digest Algorithm Definitions
------------------------------------
diff --git a/Documentation/crypto/api-kpp.rst b/Documentation/crypto/api-kpp.rst
index 7d86ab906bdf..5794e2d10c95 100644
--- a/Documentation/crypto/api-kpp.rst
+++ b/Documentation/crypto/api-kpp.rst
@@ -1,3 +1,6 @@
+Key-agreement Protocol Primitives (KPP)
+=======================================
+
Key-agreement Protocol Primitives (KPP) Cipher Algorithm Definitions
--------------------------------------------------------------------
diff --git a/Documentation/crypto/api-rng.rst b/Documentation/crypto/api-rng.rst
index 10ba7436cee4..23a94c0b272e 100644
--- a/Documentation/crypto/api-rng.rst
+++ b/Documentation/crypto/api-rng.rst
@@ -1,3 +1,6 @@
+Random Number Generator (RNG)
+=============================
+
Random Number Algorithm Definitions
-----------------------------------
diff --git a/Documentation/crypto/api-sig.rst b/Documentation/crypto/api-sig.rst
index aaec18e26d54..4d8aba8aee8e 100644
--- a/Documentation/crypto/api-sig.rst
+++ b/Documentation/crypto/api-sig.rst
@@ -1,3 +1,6 @@
+Asymmetric Signature
+====================
+
Asymmetric Signature Algorithm Definitions
------------------------------------------
diff --git a/Documentation/crypto/api-skcipher.rst b/Documentation/crypto/api-skcipher.rst
index 04d6cc5357c8..4b7c8160790a 100644
--- a/Documentation/crypto/api-skcipher.rst
+++ b/Documentation/crypto/api-skcipher.rst
@@ -1,3 +1,6 @@
+Symmetric Key Cipher
+====================
+
Block Cipher Algorithm Definitions
----------------------------------
diff --git a/Documentation/devicetree/bindings/clock/riscv,rpmi-clock.yaml b/Documentation/devicetree/bindings/clock/riscv,rpmi-clock.yaml
new file mode 100644
index 000000000000..5d62bf8215c8
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/riscv,rpmi-clock.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/riscv,rpmi-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V RPMI clock service group based clock controller
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V Platform Management Interface (RPMI) [1] defines a
+ messaging protocol which is modular and extensible. The supervisor
+ software can send/receive RPMI messages via SBI MPXY extension [2]
+ or some dedicated supervisor-mode RPMI transport.
+
+ The RPMI specification [1] defines a clock service group for accessing
+ system clocks managed by a platform microcontroller. The supervisor
+ software can access RPMI clock service group via SBI MPXY channel or
+ some dedicated supervisor-mode RPMI transport.
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+ [2] RISC-V Supervisor Binary Interface (SBI) v3.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-sbi-doc/releases
+
+properties:
+ compatible:
+ description:
+ Intended for use by the supervisor software.
+ const: riscv,rpmi-clock
+
+ mboxes:
+ maxItems: 1
+ description:
+ Mailbox channel of the underlying RPMI transport or SBI message proxy channel.
+
+ "#clock-cells":
+ const: 1
+ description:
+ Platform specific CLOCK_ID as defined by the RISC-V Platform Management
+ Interface (RPMI) specification.
+
+required:
+ - compatible
+ - mboxes
+ - "#clock-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-controller {
+ compatible = "riscv,rpmi-clock";
+ mboxes = <&mpxy_mbox 0x1000 0x0>;
+ #clock-cells = <1>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/riscv,rpmi-mpxy-clock.yaml b/Documentation/devicetree/bindings/clock/riscv,rpmi-mpxy-clock.yaml
new file mode 100644
index 000000000000..76f2a1b3d30d
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/riscv,rpmi-mpxy-clock.yaml
@@ -0,0 +1,64 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/riscv,rpmi-mpxy-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V RPMI clock service group based message proxy
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V Platform Management Interface (RPMI) [1] defines a
+ messaging protocol which is modular and extensible. The supervisor
+ software can send/receive RPMI messages via SBI MPXY extension [2]
+ or some dedicated supervisor-mode RPMI transport.
+
+ The RPMI specification [1] defines a clock service group for accessing
+ system clocks managed by a platform microcontroller. The SBI implementation
+ (machine mode firmware or hypervisor) can implement an SBI MPXY channel
+ to allow RPMI clock service group access to the supervisor software.
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+ [2] RISC-V Supervisor Binary Interface (SBI) v3.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-sbi-doc/releases
+
+properties:
+ compatible:
+ description:
+ Intended for use by the SBI implementation.
+ const: riscv,rpmi-mpxy-clock
+
+ mboxes:
+ maxItems: 1
+ description:
+ Mailbox channel of the underlying RPMI transport.
+
+ riscv,sbi-mpxy-channel-id:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The SBI MPXY channel id to be used for providing RPMI access to
+ the supervisor software.
+
+required:
+ - compatible
+ - mboxes
+ - riscv,sbi-mpxy-channel-id
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-service {
+ compatible = "riscv,rpmi-mpxy-clock";
+ mboxes = <&rpmi_shmem_mbox 0x8>;
+ riscv,sbi-mpxy-channel-id = <0x1000>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/crypto/ti,am62l-dthev2.yaml b/Documentation/devicetree/bindings/crypto/ti,am62l-dthev2.yaml
new file mode 100644
index 000000000000..5486bfeb2fe8
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/ti,am62l-dthev2.yaml
@@ -0,0 +1,50 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/crypto/ti,am62l-dthev2.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: K3 SoC DTHE V2 crypto module
+
+maintainers:
+ - T Pratham <t-pratham@ti.com>
+
+properties:
+ compatible:
+ enum:
+ - ti,am62l-dthev2
+
+ reg:
+ maxItems: 1
+
+ dmas:
+ items:
+ - description: AES Engine RX DMA Channel
+ - description: AES Engine TX DMA Channel
+ - description: SHA Engine TX DMA Channel
+
+ dma-names:
+ items:
+ - const: rx
+ - const: tx1
+ - const: tx2
+
+required:
+ - compatible
+ - reg
+ - dmas
+ - dma-names
+
+additionalProperties: false
+
+examples:
+ - |
+ crypto@40800000 {
+ compatible = "ti,am62l-dthev2";
+ reg = <0x40800000 0x10000>;
+
+ dmas = <&main_bcdma 0 0 0x4700 0>,
+ <&main_bcdma 0 0 0xc701 0>,
+ <&main_bcdma 0 0 0xc700 0>;
+ dma-names = "rx", "tx1", "tx2";
+ };
diff --git a/Documentation/devicetree/bindings/crypto/xlnx,versal-trng.yaml b/Documentation/devicetree/bindings/crypto/xlnx,versal-trng.yaml
new file mode 100644
index 000000000000..9dfb0b0ab5c8
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/xlnx,versal-trng.yaml
@@ -0,0 +1,35 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/crypto/xlnx,versal-trng.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx Versal True Random Number Generator Hardware Accelerator
+
+maintainers:
+ - Harsh Jain <h.jain@amd.com>
+ - Mounika Botcha <mounika.botcha@amd.com>
+
+description:
+ The Versal True Random Number Generator consists of Ring Oscillators as
+ entropy source and a deterministic CTR_DRBG random bit generator (DRBG).
+
+properties:
+ compatible:
+ const: xlnx,versal-trng
+
+ reg:
+ maxItems: 1
+
+required:
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ rng@f1230000 {
+ compatible = "xlnx,versal-trng";
+ reg = <0xf1230000 0x1000>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-mpxy-system-msi.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-mpxy-system-msi.yaml
new file mode 100644
index 000000000000..1991f5c7446a
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-mpxy-system-msi.yaml
@@ -0,0 +1,67 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,rpmi-mpxy-system-msi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V RPMI system MSI service group based message proxy
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V Platform Management Interface (RPMI) [1] defines a
+ messaging protocol which is modular and extensible. The supervisor
+ software can send/receive RPMI messages via SBI MPXY extension [2]
+ or some dedicated supervisor-mode RPMI transport.
+
+ The RPMI specification [1] defines the system MSI service group, which
+ allows application processors to receive MSIs upon system events
+ such as P2A doorbell, graceful shutdown/reboot request, CPU hotplug
+ event, memory hotplug event, etc. from the platform microcontroller.
+ The SBI implementation (machine mode firmware or hypervisor) can
+ implement an SBI MPXY channel to allow RPMI system MSI service
+ group access to the supervisor software.
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+ [2] RISC-V Supervisor Binary Interface (SBI) v3.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-sbi-doc/releases
+
+properties:
+ compatible:
+ description:
+ Intended for use by the SBI implementation.
+ const: riscv,rpmi-mpxy-system-msi
+
+ mboxes:
+ maxItems: 1
+ description:
+ Mailbox channel of the underlying RPMI transport.
+
+ riscv,sbi-mpxy-channel-id:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The SBI MPXY channel id to be used for providing RPMI access to
+ the supervisor software.
+
+required:
+ - compatible
+ - mboxes
+ - riscv,sbi-mpxy-channel-id
+
+additionalProperties: false
+
+examples:
+ - |
+ interrupt-controller {
+ compatible = "riscv,rpmi-mpxy-system-msi";
+ mboxes = <&rpmi_shmem_mbox 0x2>;
+ riscv,sbi-mpxy-channel-id = <0x2000>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-system-msi.yaml b/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-system-msi.yaml
new file mode 100644
index 000000000000..b10a0532e586
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/riscv,rpmi-system-msi.yaml
@@ -0,0 +1,74 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interrupt-controller/riscv,rpmi-system-msi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V RPMI system MSI service group based interrupt controller
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V Platform Management Interface (RPMI) [1] defines a
+ messaging protocol which is modular and extensible. The supervisor
+ software can send/receive RPMI messages via SBI MPXY extension [2]
+ or some dedicated supervisor-mode RPMI transport.
+
+ The RPMI specification [1] defines the system MSI service group, which
+ allows application processors to receive MSIs upon system events
+ such as P2A doorbell, graceful shutdown/reboot request, CPU hotplug
+ event, memory hotplug event, etc. from the platform microcontroller.
+ The supervisor software can access RPMI system MSI service group via
+ SBI MPXY channel or some dedicated supervisor-mode RPMI transport.
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+ [2] RISC-V Supervisor Binary Interface (SBI) v3.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-sbi-doc/releases
+
+allOf:
+ - $ref: /schemas/interrupt-controller.yaml#
+
+properties:
+ compatible:
+ description:
+ Intended for use by the supervisor software.
+ const: riscv,rpmi-system-msi
+
+ mboxes:
+ maxItems: 1
+ description:
+ Mailbox channel of the underlying RPMI transport or SBI message proxy channel.
+
+ msi-parent: true
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 1
+
+required:
+ - compatible
+ - mboxes
+ - msi-parent
+ - interrupt-controller
+ - "#interrupt-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ interrupt-controller {
+ compatible = "riscv,rpmi-system-msi";
+ mboxes = <&mpxy_mbox 0x2000 0x0>;
+ msi-parent = <&imsic_slevel>;
+ interrupt-controller;
+ #interrupt-cells = <1>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/mailbox/riscv,rpmi-shmem-mbox.yaml b/Documentation/devicetree/bindings/mailbox/riscv,rpmi-shmem-mbox.yaml
new file mode 100644
index 000000000000..3aabc52a0c03
--- /dev/null
+++ b/Documentation/devicetree/bindings/mailbox/riscv,rpmi-shmem-mbox.yaml
@@ -0,0 +1,124 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mailbox/riscv,rpmi-shmem-mbox.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V Platform Management Interface (RPMI) shared memory mailbox
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V Platform Management Interface (RPMI) [1] defines a common shared
+ memory based RPMI transport. This RPMI shared memory transport integrates as
+ a mailbox controller in the SBI implementation or supervisor software, whereas
+ each RPMI service group is a mailbox client in the SBI implementation and
+ supervisor software.
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+properties:
+ compatible:
+ const: riscv,rpmi-shmem-mbox
+
+ reg:
+ minItems: 2
+ items:
+ - description: A2P request queue base address
+ - description: P2A acknowledgment queue base address
+ - description: P2A request queue base address
+ - description: A2P acknowledgment queue base address
+ - description: A2P doorbell address
+
+ reg-names:
+ minItems: 2
+ items:
+ - const: a2p-req
+ - const: p2a-ack
+ - enum: [ p2a-req, a2p-doorbell ]
+ - const: a2p-ack
+ - const: a2p-doorbell
+
+ interrupts:
+ maxItems: 1
+ description:
+ The RPMI shared memory transport supports P2A doorbell as a wired
+ interrupt and this property specifies the interrupt source.
+
+ msi-parent:
+ description:
+ The RPMI shared memory transport supports P2A doorbell as a system MSI
+ and this property specifies the target MSI controller.
+
+ riscv,slot-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 64
+ description:
+ Power-of-2 RPMI slot size of the RPMI shared memory transport.
+
+ riscv,a2p-doorbell-value:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0x1
+ description:
+ Value written to the 32-bit A2P doorbell register.
+
+ riscv,p2a-doorbell-sysmsi-index:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ The RPMI shared memory transport supports P2A doorbell as a system MSI
+ and this property specifies system MSI index to be used for configuring
+ the P2A doorbell MSI.
+
+ "#mbox-cells":
+ const: 1
+ description:
+ The first cell specifies RPMI service group ID.
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - riscv,slot-size
+ - "#mbox-cells"
+
+anyOf:
+ - required:
+ - interrupts
+ - required:
+ - msi-parent
+
+additionalProperties: false
+
+examples:
+ - |
+ // Example 1 (RPMI shared memory with only 2 queues):
+ mailbox@10080000 {
+ compatible = "riscv,rpmi-shmem-mbox";
+ reg = <0x10080000 0x10000>,
+ <0x10090000 0x10000>;
+ reg-names = "a2p-req", "p2a-ack";
+ msi-parent = <&imsic_mlevel>;
+ riscv,slot-size = <64>;
+ #mbox-cells = <1>;
+ };
+ - |
+ // Example 2 (RPMI shared memory with only 4 queues):
+ mailbox@10001000 {
+ compatible = "riscv,rpmi-shmem-mbox";
+ reg = <0x10001000 0x800>,
+ <0x10001800 0x800>,
+ <0x10002000 0x800>,
+ <0x10002800 0x800>,
+ <0x10003000 0x4>;
+ reg-names = "a2p-req", "p2a-ack", "p2a-req", "a2p-ack", "a2p-doorbell";
+ msi-parent = <&imsic_mlevel>;
+ riscv,slot-size = <64>;
+ riscv,a2p-doorbell-value = <0x00008000>;
+ #mbox-cells = <1>;
+ };
diff --git a/Documentation/devicetree/bindings/mailbox/riscv,sbi-mpxy-mbox.yaml b/Documentation/devicetree/bindings/mailbox/riscv,sbi-mpxy-mbox.yaml
new file mode 100644
index 000000000000..061437a0b45a
--- /dev/null
+++ b/Documentation/devicetree/bindings/mailbox/riscv,sbi-mpxy-mbox.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mailbox/riscv,sbi-mpxy-mbox.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V SBI Message Proxy (MPXY) extension based mailbox
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |
+ The RISC-V SBI Message Proxy (MPXY) extension [1] allows supervisor
+ software to send messages through the SBI implementation (M-mode
+ firmware or HS-mode hypervisor). The underlying message protocol
+ and message format used by the supervisor software could be some
+ other standard protocol compatible with the SBI MPXY extension
+ (such as RISC-V Platform Management Interface (RPMI) [2]).
+
+ ===========================================
+ References
+ ===========================================
+
+ [1] RISC-V Supervisor Binary Interface (SBI) v3.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-sbi-doc/releases
+
+ [2] RISC-V Platform Management Interface (RPMI) v1.0 (or higher)
+ https://github.com/riscv-non-isa/riscv-rpmi/releases
+
+properties:
+ compatible:
+ const: riscv,sbi-mpxy-mbox
+
+ "#mbox-cells":
+ const: 2
+ description:
+ The first cell specifies channel_id of the SBI MPXY channel,
+ the second cell specifies MSG_PROT_ID of the SBI MPXY channel.
+
+required:
+ - compatible
+ - "#mbox-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ mailbox {
+ compatible = "riscv,sbi-mpxy-mbox";
+ #mbox-cells = <2>;
+ };
diff --git a/Documentation/devicetree/bindings/rng/hisi-rng.txt b/Documentation/devicetree/bindings/rng/hisi-rng.txt
deleted file mode 100644
index d04d55a6c2f5..000000000000
--- a/Documentation/devicetree/bindings/rng/hisi-rng.txt
+++ /dev/null
@@ -1,12 +0,0 @@
-Hisilicon Random Number Generator
-
-Required properties:
-- compatible : Should be "hisilicon,hip04-rng" or "hisilicon,hip05-rng"
-- reg : Offset and length of the register set of this block
-
-Example:
-
-rng@d1010000 {
- compatible = "hisilicon,hip05-rng";
- reg = <0xd1010000 0x100>;
-};
diff --git a/Documentation/devicetree/bindings/rng/hisi-rng.yaml b/Documentation/devicetree/bindings/rng/hisi-rng.yaml
new file mode 100644
index 000000000000..5406b2596f42
--- /dev/null
+++ b/Documentation/devicetree/bindings/rng/hisi-rng.yaml
@@ -0,0 +1,32 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/rng/hisi-rng.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Hisilicon Random Number Generator
+
+maintainers:
+ - Kefeng Wang <wangkefeng.wang@huawei>
+
+properties:
+ compatible:
+ enum:
+ - hisilicon,hip04-rng
+ - hisilicon,hip05-rng
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ rng@d1010000 {
+ compatible = "hisilicon,hip05-rng";
+ reg = <0xd1010000 0x100>;
+ };
diff --git a/Documentation/driver-api/cxl/conventions.rst b/Documentation/driver-api/cxl/conventions.rst
index da347a81a237..e37336d7b116 100644
--- a/Documentation/driver-api/cxl/conventions.rst
+++ b/Documentation/driver-api/cxl/conventions.rst
@@ -45,3 +45,138 @@ Detailed Description of the Change
----------------------------------
<Propose spec language that corrects the conflict.>
+
+
+Resolve conflict between CFMWS, Platform Memory Holes, and Endpoint Decoders
+============================================================================
+
+Document
+--------
+
+CXL Revision 3.2, Version 1.0
+
+License
+-------
+
+SPDX-License Identifier: CC-BY-4.0
+
+Creator/Contributors
+--------------------
+
+- Fabio M. De Francesco, Intel
+- Dan J. Williams, Intel
+- Mahesh Natu, Intel
+
+Summary of the Change
+---------------------
+
+According to the current Compute Express Link (CXL) Specifications (Revision
+3.2, Version 1.0), the CXL Fixed Memory Window Structure (CFMWS) describes zero
+or more Host Physical Address (HPA) windows associated with each CXL Host
+Bridge. Each window represents a contiguous HPA range that may be interleaved
+across one or more targets, including CXL Host Bridges. Each window has a set
+of restrictions that govern its usage. It is the responsibility of the Operating
+System-directed configuration and Power Management (OSPM) to utilize each window
+for the specified use.
+
+Table 9-22 of the current CXL Specifications states that the Window Size field
+contains the total number of consecutive bytes of HPA this window describes.
+This value must be a multiple of the Number of Interleave Ways (NIW) * 256 MB.
+
+Platform Firmware (BIOS) might reserve physical addresses below 4 GB where a
+memory gap such as the Low Memory Hole for PCIe MMIO may exist. In such cases,
+the CFMWS Range Size may not adhere to the NIW * 256 MB rule.
+
+The HPA represents the actual physical memory address space that the CXL devices
+can decode and respond to, while the System Physical Address (SPA), a related
+but distinct concept, represents the system-visible address space to which users
+can direct transactions, and so it excludes reserved regions.
+
+BIOS publishes CFMWS to communicate the active SPA ranges that, on platforms
+with LMHs, map to a strict subset of the HPA. The SPA range trims out the hole,
+resulting in lost capacity in the Endpoints because no SPA maps to the part of
+the HPA range that intersects the hole.
+
+E.g., an x86 platform with two CFMWS entries and an LMH starting at 2 GB:
+
+ +--------+------------+-------------------+------------------+-------------------+------+
+ | Window | CFMWS Base | CFMWS Size        | HDM Decoder Base | HDM Decoder Size  | Ways |
+ +========+============+===================+==================+===================+======+
+ | 0      | 0 GB       | 2 GB              | 0 GB             | 3 GB              | 12   |
+ +--------+------------+-------------------+------------------+-------------------+------+
+ | 1      | 4 GB       | NIW*256MB Aligned | 4 GB             | NIW*256MB Aligned | 12   |
+ +--------+------------+-------------------+------------------+-------------------+------+
+
+HDM decoder base and HDM decoder size represent all 12 Endpoint Decoders of
+a 12-way region and all the intermediate Switch Decoders. They are configured
+by the BIOS according to the NIW * 256MB rule, resulting in an HPA range size
+of 3GB. In contrast, the CFMWS Base and CFMWS Size are used to configure the
+Root Decoder HPA range, which ends up smaller (2GB) than that of the Switch
+and Endpoint Decoders in the hierarchy (3GB).
+
+This creates 2 issues which lead to a failure to construct a region:
+
+1) A mismatch in region size between root and any HDM decoder. The root decoders
+ will always be smaller due to the trim.
+
+2) The trim causes the root decoder to violate the (NIW * 256MB) rule.
+
+This change allows a region with a base address of 0GB to bypass these checks to
+allow for region creation with the trimmed root decoder address range.
+
+This change does not allow for any other arbitrary region to violate these
+checks - it is intended exclusively to enable x86 platforms which map CXL memory
+under 4GB.
+
+Despite the HDM decoders covering the PCIe hole HPA region, it is expected that
+the platform will never route accesses in that range to the CXL complex because
+the root decoder only covers the trimmed region (which excludes it). This is
+outside the ability of Linux to enforce.
+
+On the example platform, only the first 2GB will be potentially usable, but
+Linux, aiming to adhere to the current specifications, fails to construct
+Regions and attach Endpoint and intermediate Switch Decoders to them.
+
+There are several points of failure that stem from the expectation that the Root
+Decoder HPA size, which is equal to that of the CFMWS from which it is configured,
+has to be greater than or equal to that of the matching Switch and Endpoint HDM
+Decoders.
+
+In order to succeed with construction and attachment, Linux must construct a
+Region with Root Decoder HPA range size, and then attach to that all the
+intermediate Switch Decoders and Endpoint Decoders that belong to the hierarchy
+regardless of their range sizes.
+
+Benefits of the Change
+----------------------
+
+Without the change, the OSPM wouldn't match intermediate Switch and Endpoint
+Decoders with Root Decoders configured with CFMWS HPA sizes that don't align
+with the NIW * 256MB constraint, which leads to lost memdev capacity.
+
+This change allows the OSPM to construct Regions and attach intermediate Switch
+and Endpoint Decoders to them, so that the addressable part of the memory
+devices total capacity is made available to the users.
+
+References
+----------
+
+Compute Express Link Specification Revision 3.2, Version 1.0
+<https://www.computeexpresslink.org/>
+
+Detailed Description of the Change
+----------------------------------
+
+The description of the Window Size field in table 9-22 needs to account for
+platforms with Low Memory Holes, where SPA ranges might be subsets of the
+endpoints' HPA. Therefore, it has to be changed to the following:
+
+"The total number of consecutive bytes of HPA this window represents. This value
+shall be a multiple of NIW * 256 MB.
+
+On platforms that reserve physical addresses below 4 GB, such as the Low Memory
+Hole for PCIe MMIO on x86, an instance of CFMWS whose Base HPA range is 0 might
+have a size that doesn't align with the NIW * 256 MB constraint.
+
+Note that the matching intermediate Switch Decoders and the Endpoint Decoders
+HPA range sizes must still align to the above-mentioned rule, but the memory
+capacity that exceeds the CFMWS window size won't be accessible.".
diff --git a/Documentation/driver-api/cxl/maturity-map.rst b/Documentation/driver-api/cxl/maturity-map.rst
index 1330f3f52129..282c1102dd81 100644
--- a/Documentation/driver-api/cxl/maturity-map.rst
+++ b/Documentation/driver-api/cxl/maturity-map.rst
@@ -173,7 +173,7 @@ Accelerator
User Flow Support
-----------------
-* [0] Inject & clear poison by HPA
+* [2] Inject & clear poison by region offset
Details
=======
diff --git a/Documentation/driver-api/cxl/platform/bios-and-efi.rst b/Documentation/driver-api/cxl/platform/bios-and-efi.rst
index 645322632cc9..a9aa0ccd92af 100644
--- a/Documentation/driver-api/cxl/platform/bios-and-efi.rst
+++ b/Documentation/driver-api/cxl/platform/bios-and-efi.rst
@@ -202,7 +202,7 @@ future and such a configuration should be avoided.
Memory Holes
------------
-If your platform includes memory holes intersparsed between your CXL memory, it
+If your platform includes memory holes interspersed between your CXL memory, it
is recommended to utilize multiple decoders to cover these regions of memory,
rather than try to program the decoders to accept the entire range and expect
Linux to manage the overlap.
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 3002258c9c7f..0b86a8022fa1 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -2159,6 +2159,20 @@ DMA Buffer files
where 'size' is the size of the DMA buffer in bytes. 'count' is the file count of
the DMA buffer file. 'exp_name' is the name of the DMA buffer exporter.
+VFIO Device files
+~~~~~~~~~~~~~~~~~
+
+::
+
+ pos: 0
+ flags: 02000002
+ mnt_id: 17
+ ino: 5122
+ vfio-device-syspath: /sys/devices/pci0000:e0/0000:e0:01.1/0000:e1:00.0/0000:e2:05.0/0000:e8:00.0
+
+where 'vfio-device-syspath' is the sysfs path corresponding to the VFIO device
+file.
+
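
A hedged sketch of consuming this field from userspace; the file descriptor is
assumed to already refer to an open VFIO device file::

    #include <stdio.h>
    #include <string.h>

    /* Print the sysfs path recorded in the fdinfo of a VFIO device fd. */
    static void print_vfio_syspath(int fd)
    {
            char path[64], line[512];
            FILE *f;

            snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
            f = fopen(path, "r");
            if (!f)
                    return;
            while (fgets(line, sizeof(line), f))
                    if (!strncmp(line, "vfio-device-syspath:", 20))
                            fputs(line, stdout);    /* the sysfs path line */
            fclose(f);
    }
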
3.9 /proc/<pid>/map_files - Information about memory mapped files
---------------------------------------------------------------------
This directory contains symbolic links which represent memory mapped files
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 6aa40ee05a4a..c17a87a0a5ac 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6414,6 +6414,15 @@ most one mapping per page, i.e. binding multiple memory regions to a single
guest_memfd range is not allowed (any number of memory regions can be bound to
a single guest_memfd file, but the bound ranges must not overlap).
+When the capability KVM_CAP_GUEST_MEMFD_MMAP is supported, the 'flags' field
+supports GUEST_MEMFD_FLAG_MMAP. Setting this flag on guest_memfd creation
+enables mmap() and faulting of guest_memfd memory to host userspace.
+
+When the KVM MMU performs a PFN lookup to service a guest fault and the backing
+guest_memfd has the GUEST_MEMFD_FLAG_MMAP set, then the fault will always be
+consumed from guest_memfd, regardless of whether it is a shared or a private
+fault.
+
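
A hedged userspace sketch, assuming the uapi headers in use already define
GUEST_MEMFD_FLAG_MMAP and that KVM_CAP_GUEST_MEMFD_MMAP has already been
checked on ``vm_fd``::

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>

    static void *map_guest_memfd(int vm_fd, uint64_t size)
    {
            struct kvm_create_guest_memfd gmem = {
                    .size  = size,
                    .flags = GUEST_MEMFD_FLAG_MMAP,  /* enable mmap()/faulting */
            };
            int gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);

            if (gmem_fd < 0)
                    return NULL;
            /* Fault guest_memfd memory into host userspace via a shared mapping. */
            return mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, gmem_fd, 0);
    }
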
See KVM_SET_USER_MEMORY_REGION2 for additional details.
4.143 KVM_PRE_FAULT_MEMORY