Diffstat (limited to 'Documentation/x86')
-rw-r--r--  Documentation/x86/amd-memory-encryption.rst   36
-rw-r--r--  Documentation/x86/resctrl.rst                 165
-rw-r--r--  Documentation/x86/x86_64/mm.rst                 2
3 files changed, 191 insertions, 12 deletions
diff --git a/Documentation/x86/amd-memory-encryption.rst b/Documentation/x86/amd-memory-encryption.rst
index a1940ebe7be5..934310ce7258 100644
--- a/Documentation/x86/amd-memory-encryption.rst
+++ b/Documentation/x86/amd-memory-encryption.rst
@@ -95,3 +95,39 @@ by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even
if configured to do so by default or the mem_encrypt=on command line parameter
is specified.
+
+Secure Nested Paging (SNP)
+==========================
+
+SEV-SNP introduces new features (SEV_FEATURES[1:63]) which can be enabled
+by the hypervisor for security enhancements. Some of these features need
+guest-side implementation to function correctly. The table below lists the
+expected guest behavior for the various possible combinations of guest and
+hypervisor SNP feature support.
+
++-----------------+---------------+---------------+------------------+
+| Feature Enabled | Guest needs | Guest has | Guest boot |
+| by the HV | implementation| implementation| behaviour |
++=================+===============+===============+==================+
+| No | No | No | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| No | Yes | No | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| No | Yes | Yes | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| Yes | No | No | Boot with |
+| | | | feature enabled |
++-----------------+---------------+---------------+------------------+
+| Yes | Yes | No | Graceful boot |
+| | | | failure |
++-----------------+---------------+---------------+------------------+
+| Yes | Yes | Yes | Boot with |
+| | | | feature enabled |
++-----------------+---------------+---------------+------------------+
+
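+The SEV_STATUS MSR reports which of these features the hypervisor has
+enabled for the guest. As an illustrative check from inside a guest (a
+sketch only: it assumes the msr kernel module and the rdmsr utility from
+msr-tools are available, and that SEV_STATUS is MSR 0xc0010131 per the
+APM reference below, with bit 0 = SEV, bit 1 = SEV-ES, bit 2 = SEV-SNP)::
+
+  # modprobe msr
+  # rdmsr 0xc0010131
+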
+More details can be found in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR.
+
+[1] https://www.amd.com/system/files/TechDocs/40332.pdf
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..387ccbcb558f 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -17,14 +17,21 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
flag bits:
-============================================= ================================
+=============================================== ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation) "mba"
-============================================= ================================
+SMBA (Slow Memory Bandwidth Allocation) ""
+BMEC (Bandwidth Monitoring Event Configuration) ""
+=============================================== ================================
+
+Historically, new features were made visible by default in /proc/cpuinfo. This
+resulted in the feature flags becoming hard to parse by humans. Adding a new
+flag to /proc/cpuinfo should be avoided if user space can obtain information
+about the feature from resctrl's info directory.
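+
+As a sketch (the exact flags and info directory entries depend on the CPU
+and kernel), feature support can be checked either via the flags above or,
+preferably, via resctrl's info directory once the file system is mounted
+as described below::
+
+  # grep -o 'rdt_a\|cat_l3\|mba' /proc/cpuinfo | sort -u
+  # ls /sys/fs/resctrl/info
+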
To use the feature mount the file system::
@@ -161,6 +168,83 @@ with the following files:
"mon_features":
Lists the monitoring events if
monitoring is enabled for the resource.
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_local_bytes
+
+ If the system supports Bandwidth Monitoring Event
+ Configuration (BMEC), then the bandwidth events will
+ be configurable. The output will be::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_total_bytes_config
+ mbm_local_bytes
+ mbm_local_bytes_config
+
+"mbm_total_bytes_config", "mbm_local_bytes_config":
+ Read/write files containing the configuration for the mbm_total_bytes
+ and mbm_local_bytes events, respectively, when the Bandwidth
+ Monitoring Event Configuration (BMEC) feature is supported.
+ The event configuration settings are domain specific and affect
+ all the CPUs in the domain. When either event configuration is
+ changed, the bandwidth counters for all RMIDs of both events
+ (mbm_total_bytes as well as mbm_local_bytes) are cleared for that
+ domain. The next read for every RMID will report "Unavailable"
+ and subsequent reads will report the valid value.
+
+ Following are the types of events supported:
+
+ ==== ========================================================
+ Bits Description
+ ==== ========================================================
+ 6 Dirty Victims from the QOS domain to all types of memory
+ 5 Reads to slow memory in the non-local NUMA domain
+ 4 Reads to slow memory in the local NUMA domain
+ 3 Non-temporal writes to non-local NUMA domain
+ 2 Non-temporal writes to local NUMA domain
+ 1 Reads to memory in the non-local NUMA domain
+ 0 Reads to memory in the local NUMA domain
+ ==== ========================================================
+
+ By default, the mbm_total_bytes configuration is set to 0x7f to count
+ all the event types and the mbm_local_bytes configuration is set to
+ 0x15 to count all the local memory events.
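+
+	In terms of the bit definitions above, 0x7f is 1111111b (bits 0-6,
+	i.e. all event types) and 0x15 is 0010101b (bits 0, 2 and 4, i.e.
+	the three local NUMA domain events).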
+
+ Examples:
+
+ * To view the current configuration::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x7f;1=0x7f;2=0x7f;3=0x7f
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x15;1=0x15;3=0x15;4=0x15
+
+	* To change mbm_total_bytes to count only reads on domain 0,
+	  bits 0, 1, 4 and 5 need to be set, which is 110011b in binary
+	  (0x33 in hexadecimal):
+ ::
+
+ # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x33;1=0x7f;2=0x7f;3=0x7f
+
+	* To change mbm_local_bytes to count all the slow memory reads on
+	  domains 0 and 1, bits 4 and 5 need to be set, which is 110000b
+	  in binary (0x30 in hexadecimal):
+ ::
+
+ # echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x30;1=0x30;3=0x15;4=0x15
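+
+	* To observe the counter reset that follows a configuration change
+	  (a sketch: the file shown belongs to the default monitoring group
+	  on domain 0)::
+
+	    # echo "0=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+	    # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
+	    Unavailable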
"max_threshold_occupancy":
Read/write file provides the largest value (in
@@ -464,6 +548,25 @@ Memory bandwidth domain is L3 cache.
MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
+Slow Memory Bandwidth Allocation (SMBA)
+---------------------------------------
+AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
+CXL.memory is the only supported "slow" memory device. With the
+support of SMBA, the hardware enables bandwidth allocation on
+the slow memory devices. If there are multiple such devices in
+the system, the throttling logic groups all the slow sources
+together and applies the limit on them as a whole.
+
+The presence of the SMBA feature (with CXL.memory) is independent of
+whether slow memory devices are actually present in the system. If there
+are no such devices, configuring SMBA has no impact on the performance
+of the system.
+
+The bandwidth domain for slow memory is L3 cache. Its schemata file
+is formatted as:
+::
+
+ SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
@@ -479,6 +582,46 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+Reading/writing the schemata file (on AMD systems)
+--------------------------------------------------
+Reading the schemata file will show the current bandwidth limit on all
+domains. The allocated resources are in multiples of one eighth GB/s.
+When writing to the file, you need to specify the cache id for which you
+wish to configure the bandwidth limit.
+
+For example, to allocate a 2GB/s limit on cache id 1 (since the unit is
+one eighth GB/s, 2 GB/s corresponds to a value of 16):
+
+::
+
+ # cat schemata
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "MB:1=16" > schemata
+ # cat schemata
+ MB:0=2048;1= 16;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with SMBA feature
+--------------------------------------------------------------------
+Reading and writing the schemata file is the same as described in the
+section above, with the addition of the SMBA resource line.
+
+For example, to allocate an 8GB/s limit on cache id 1 (a value of 64):
+
+::
+
+ # cat schemata
+ SMBA:0=2048;1=2048;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "SMBA:1=64" > schemata
+ # cat schemata
+ SMBA:0=2048;1= 64;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an
@@ -608,12 +751,12 @@ how we can measure the latency in cycles of reading from this region and
visualize this data with a histogram that is available if CONFIG_HIST_TRIGGERS
is set::
- # :> /sys/kernel/debug/tracing/trace
- # echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
- # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+ # :> /sys/kernel/tracing/trace
+ # echo 'hist:keys=latency' > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/trigger
+ # echo 1 > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/enable
# echo 1 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
- # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/enable
- # cat /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_mem_latency/hist
+ # echo 0 > /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/enable
+ # cat /sys/kernel/tracing/events/resctrl/pseudo_lock_mem_latency/hist
# event histogram
#
@@ -642,11 +785,11 @@ cache of a platform. Here is how we can obtain details of the cache hits
and misses using the platform's precision counters.
::
- # :> /sys/kernel/debug/tracing/trace
- # echo 1 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
+ # :> /sys/kernel/tracing/trace
+ # echo 1 > /sys/kernel/tracing/events/resctrl/pseudo_lock_l2/enable
# echo 2 > /sys/kernel/debug/resctrl/newlock/pseudo_lock_measure
- # echo 0 > /sys/kernel/debug/tracing/events/resctrl/pseudo_lock_l2/enable
- # cat /sys/kernel/debug/tracing/trace
+ # echo 0 > /sys/kernel/tracing/events/resctrl/pseudo_lock_l2/enable
+ # cat /sys/kernel/tracing/trace
# tracer: nop
#
diff --git a/Documentation/x86/x86_64/mm.rst b/Documentation/x86/x86_64/mm.rst
index 9798676bb0bf..35e5e18c83d0 100644
--- a/Documentation/x86/x86_64/mm.rst
+++ b/Documentation/x86/x86_64/mm.rst
@@ -140,7 +140,7 @@ The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes).
-We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual
+We map EFI runtime services in the 'efi_pgd' PGD in a 64GB large virtual
memory window (this size is arbitrary, it can be raised later if needed).
The mappings are not part of any other kernel PGD and are only available
during EFI runtime calls.