diff options
Diffstat (limited to 'Documentation/driver-api/cxl')
-rw-r--r-- | Documentation/driver-api/cxl/access-coordinates.rst | 91 | ||||
-rw-r--r-- | Documentation/driver-api/cxl/index.rst | 3 | ||||
-rw-r--r-- | Documentation/driver-api/cxl/maturity-map.rst | 202 | ||||
-rw-r--r-- | Documentation/driver-api/cxl/memory-devices.rst | 15 |
4 files changed, 311 insertions, 0 deletions
diff --git a/Documentation/driver-api/cxl/access-coordinates.rst b/Documentation/driver-api/cxl/access-coordinates.rst new file mode 100644 index 000000000000..b07950ea30c9 --- /dev/null +++ b/Documentation/driver-api/cxl/access-coordinates.rst @@ -0,0 +1,91 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. include:: <isonum.txt> + +================================== +CXL Access Coordinates Computation +================================== + +Shared Upstream Link Calculation +================================ +For certain CXL region construction with endpoints behind CXL switches (SW) or +Root Ports (RP), there is the possibility of the total bandwidth for all +the endpoints behind a switch being more than the switch upstream link. +A similar situation can occur within the host, upstream of the root ports. +The CXL driver performs an additional pass after all the targets have +arrived for a region in order to recalculate the bandwidths with possible +upstream link being a limiting factor in mind. + +The algorithm assumes the configuration is a symmetric topology as that +maximizes performance. When asymmetric topology is detected, the calculation +is aborted. An asymmetric topology is detected during topology walk where the +number of RPs detected as a grandparent is not equal to the number of devices +iterated in the same iteration loop. The assumption is made that subtle +asymmetry in properties does not happen and all paths to EPs are equal. + +There can be multiple switches under an RP. There can be multiple RPs under +a CXL Host Bridge (HB). There can be multiple HBs under a CXL Fixed Memory +Window Structure (CFMWS). + +An example hierarchy: + +> CFMWS 0 +> | +> _________|_________ +> | | +> ACPI0017-0 ACPI0017-1 +> GP0/HB0/ACPI0016-0 GP1/HB1/ACPI0016-1 +> | | | | +> RP0 RP1 RP2 RP3 +> | | | | +> SW 0 SW 1 SW 2 SW 3 +> | | | | | | | | +> EP0 EP1 EP2 EP3 EP4 EP5 EP6 EP7 + +Computation for the example hierarchy: + +Min (GP0 to CPU BW, + Min(SW 0 Upstream Link to RP0 BW, + Min(SW0SSLBIS for SW0DSP0 (EP0), EP0 DSLBIS, EP0 Upstream Link) + + Min(SW0SSLBIS for SW0DSP1 (EP1), EP1 DSLBIS, EP1 Upstream link)) + + Min(SW 1 Upstream Link to RP1 BW, + Min(SW1SSLBIS for SW1DSP0 (EP2), EP2 DSLBIS, EP2 Upstream Link) + + Min(SW1SSLBIS for SW1DSP1 (EP3), EP3 DSLBIS, EP3 Upstream link))) + +Min (GP1 to CPU BW, + Min(SW 2 Upstream Link to RP2 BW, + Min(SW2SSLBIS for SW2DSP0 (EP4), EP4 DSLBIS, EP4 Upstream Link) + + Min(SW2SSLBIS for SW2DSP1 (EP5), EP5 DSLBIS, EP5 Upstream link)) + + Min(SW 3 Upstream Link to RP3 BW, + Min(SW3SSLBIS for SW3DSP0 (EP6), EP6 DSLBIS, EP6 Upstream Link) + + Min(SW3SSLBIS for SW3DSP1 (EP7), EP7 DSLBIS, EP7 Upstream link)))) + +The calculation starts at cxl_region_shared_upstream_perf_update(). A xarray +is created to collect all the endpoint bandwidths via the +cxl_endpoint_gather_bandwidth() function. The min() of bandwidth from the +endpoint CDAT and the upstream link bandwidth is calculated. If the endpoint +has a CXL switch as a parent, then min() of calculated bandwidth and the +bandwidth from the SSLBIS for the switch downstream port that is associated +with the endpoint is calculated. The final bandwidth is stored in a +'struct cxl_perf_ctx' in the xarray indexed by a device pointer. If the +endpoint is direct attached to a root port (RP), the device pointer would be an +RP device. If the endpoint is behind a switch, the device pointer would be the +upstream device of the parent switch. + +At the next stage, the code walks through one or more switches if they exist +in the topology. For endpoints directly attached to RPs, this step is skipped. +If there is another switch upstream, the code takes the min() of the current +gathered bandwidth and the upstream link bandwidth. If there's a switch +upstream, then the SSLBIS of the upstream switch. + +Once the topology walk reaches the RP, whether it's direct attached endpoints +or walking through the switch(es), cxl_rp_gather_bandwidth() is called. At +this point all the bandwidths are aggregated per each host bridge, which is +also the index for the resulting xarray. + +The next step is to take the min() of the per host bridge bandwidth and the +bandwidth from the Generic Port (GP). The bandwidths for the GP is retrieved +via ACPI tables SRAT/HMAT. The min bandwidth are aggregated under the same +ACPI0017 device to form a new xarray. + +Finally, the cxl_region_update_bandwidth() is called and the aggregated +bandwidth from all the members of the last xarray is updated for the +access coordinates residing in the cxl region (cxlr) context. diff --git a/Documentation/driver-api/cxl/index.rst b/Documentation/driver-api/cxl/index.rst index 036e49553542..965ba90e8fb7 100644 --- a/Documentation/driver-api/cxl/index.rst +++ b/Documentation/driver-api/cxl/index.rst @@ -8,5 +8,8 @@ Compute Express Link :maxdepth: 1 memory-devices + access-coordinates + + maturity-map .. only:: subproject and html diff --git a/Documentation/driver-api/cxl/maturity-map.rst b/Documentation/driver-api/cxl/maturity-map.rst new file mode 100644 index 000000000000..a2288f9df658 --- /dev/null +++ b/Documentation/driver-api/cxl/maturity-map.rst @@ -0,0 +1,202 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. include:: <isonum.txt> + +=========================================== +Compute Express Link Subsystem Maturity Map +=========================================== + +The Linux CXL subsystem tracks the dynamic `CXL specification +<https://computeexpresslink.org/cxl-specification-landing-page>`_ that +continues to respond to new use cases with new features, capability +updates and fixes. At any given point some aspects of the subsystem are +more mature than others. While the periodic pull requests summarize the +`work being incorporated each merge window +<https://lore.kernel.org/linux-cxl/?q=s%3APULL+s%3ACXL+tc%3Atorvalds+NOT+s%3ARe>`_, +those do not always convey progress relative to a starting point and a +future end goal. + +What follows is a coarse breakdown of the subsystem's major +responsibilities along with a maturity score. The expectation is that +the change-history of this document provides an overview summary of the +subsystem maturation over time. + +The maturity scores are: + +- [3] Mature: Work in this area is complete and no changes on the horizon. + Note that this score can regress from one kernel release to the next + based on new test results or end user reports. + +- [2] Stabilizing: Major functionality operational, common cases are + mature, but known corner cases are still a work in progress. + +- [1] Initial: Capability that has exited the Proof of Concept phase, but + may still have significant gaps to close and fixes to apply as real + world testing occurs. + +- [0] Known gap: Feature is on a medium to long term horizon to + implement. If the specification has a feature that does not even have + a '0' score in this document, there is a good chance that no one in + the linux-cxl@vger.kernel.org community has started to look at it. + +- X: Out of scope for kernel enabling, or kernel enabling not required + +Feature and Capabilities +======================== + +Enumeration / Provisioning +-------------------------- +All of the fundamental enumeration an object model of the subsystem is +in place, but there are several corner cases that are pending closure. + + +* [2] CXL Window Enumeration + + * [0] :ref:`Extended-linear memory-side cache <extended-linear>` + * [0] Low Memory-hole + * [0] Hetero-interleave + +* [2] Switch Enumeration + + * [0] CXL register enumeration link-up dependency + +* [2] HDM Decoder Configuration + + * [0] Decoder target and granularity constraints + +* [2] Performance enumeration + + * [3] Endpoint CDAT + * [3] Switch CDAT + * [1] CDAT to Core-mm integration + + * [1] x86 + * [0] Arm64 + * [0] All other arch. + + * [0] Shared link + +* [2] Hotplug + (see CXL Window Enumeration) + + * [0] Handle Soft Reserved conflicts + +* [0] :ref:`RCH link status <rch-link-status>` +* [0] Fabrics / G-FAM (chapter 7) +* [0] Global Access Endpoint + + +RAS +--- +In many ways CXL can be seen as a standardization of what would normally +be handled by custom EDAC drivers. The open development here is +mainly caused by the enumeration corner cases above. + +* [3] Component events (OS) +* [2] Component events (FFM) +* [1] Endpoint protocol errors (OS) +* [1] Endpoint protocol errors (FFM) +* [0] Switch protocol errors (OS) +* [1] Switch protocol errors (FFM) +* [2] DPA->HPA Address translation + + * [1] XOR Interleave translation + (see CXL Window Enumeration) + +* [1] Memory Failure coordination +* [0] Scrub control +* [2] ACPI error injection EINJ + + * [0] EINJ v2 + * [X] Compliance DOE + +* [2] Native error injection +* [3] RCH error handling +* [1] VH error handling +* [0] PPR +* [0] Sparing +* [0] Device built in test + + +Mailbox commands +---------------- + +* [3] Firmware update +* [3] Health / Alerts +* [1] :ref:`Background commands <background-commands>` +* [3] Sanitization +* [3] Security commands +* [3] RAW Command Debug Passthrough +* [0] CEL-only-validation Passthrough +* [0] Switch CCI +* [3] Timestamp +* [1] PMEM labels +* [3] PMEM GPF / Dirty Shutdown +* [0] Scan Media + +PMU +--- +* [1] Type 3 PMU +* [0] Switch USP/ DSP, Root Port + +Security +-------- + +* [X] CXL Trusted Execution Environment Security Protocol (TSP) +* [X] CXL IDE (subsumed by TSP) + +Memory-pooling +-------------- + +* [1] Hotplug of LDs (via PCI hotplug) +* [0] Dynamic Capacity Device (DCD) Support + +Multi-host sharing +------------------ + +* [0] Hardware coherent shared memory +* [0] Software managed coherency shared memory + +Multi-host memory +----------------- + +* [0] Dynamic Capacity Device Support +* [0] Sharing + +Accelerator +----------- + +* [0] Accelerator memory enumeration HDM-D (CXL 1.1/2.0 Type-2) +* [0] Accelerator memory enumeration HDM-DB (CXL 3.0 Type-2) +* [0] CXL.cache 68b (CXL 2.0) +* [0] CXL.cache 256b Cache IDs (CXL 3.0) + +User Flow Support +----------------- + +* [0] HPA->DPA Address translation (need xormaps export solution) + +Details +======= + +.. _extended-linear: + +* **Extended-linear memory-side cache**: An HMAT proposal to enumerate the presence of a + memory-side cache where the cache capacity extends the SRAT address + range capacity. `See the ECN + <https://lore.kernel.org/linux-cxl/6650e4f835a0e_195e294a8@dwillia2-mobl3.amr.corp.intel.com.notmuch/>`_ + for more details: + +.. _rch-link-status: + +* **RCH Link Status**: RCH (Restricted CXL Host) topologies, end up + hiding some standard registers like PCIe Link Status / Capabilities in + the CXL RCRB (Root Complex Register Block). + +.. _background-commands: + +* **Background commands**: The CXL background command mechanism is + awkward as the single slot is monopolized potentially indefinitely by + various commands. A `cancel on conflict + <http://lore.kernel.org/r/66035c2e8ba17_770232948b@dwillia2-xfh.jf.intel.com.notmuch>`_ + facility is needed to make sure the kernel can ensure forward progress + of priority commands. diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 5149ecdc53c7..d732c42526df 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -328,6 +328,12 @@ CXL Memory Device .. kernel-doc:: drivers/cxl/mem.c :doc: cxl mem +.. kernel-doc:: drivers/cxl/cxlmem.h + :internal: + +.. kernel-doc:: drivers/cxl/core/memdev.c + :identifiers: + CXL Port -------- .. kernel-doc:: drivers/cxl/port.c @@ -341,6 +347,15 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal: +.. kernel-doc:: drivers/cxl/core/hdm.c + :doc: cxl core hdm + +.. kernel-doc:: drivers/cxl/core/hdm.c + :identifiers: + +.. kernel-doc:: drivers/cxl/core/cdat.c + :identifiers: + .. kernel-doc:: drivers/cxl/core/port.c :doc: cxl core |