Age | Commit message | Author |
|
Update Common Event Record to CXL r3.2 definition.
Add additional validity check for event records.
Add memory sparing event record tracing.
|
|
CXL rev 3.2 section 8.2.10.2.1.4 Table 8-60 defines the Memory Sparing
Event Record.
Determine if the event read is a memory sparing record and, if so, trace
the record.
A memory device shall produce a memory sparing event record:
1. After completion of a PPR maintenance operation if the memory sparing
event record enable bit is set (Field: sPPR/hPPR Operation Mode in
Table 8-128/Table 8-131).
2. In response to a query request by the host (see section 8.2.10.7.1.4)
to determine the availability of sparing resources.
The device shall report the resource availability by producing the Memory
Sparing Event Record (see Table 8-60) in which the channel, rank, nibble
mask, bank group, bank, row, column, sub-channel fields are a copy of the
values specified in the request. If the controller does not support
reporting whether a resource is available, and a perform maintenance
operation for memory sparing is issued with query resources set to 1, the
controller shall return invalid input.
Example trace log for a memory sparing event record produced on
completion of a soft PPR operation:
cxl_memory_sparing: memdev=mem1 host=0000:0f:00.0 serial=3
log=Informational : time=55045163029
uuid=e71f3a40-2d29-4092-8a39-4d1c966c7c65 len=128 flags='0x1' handle=1
related_handle=0 maint_op_class=2 maint_op_sub_class=1
ld_id=0 head_id=0 : flags='' result=0
validity_flags='CHANNEL|RANK|NIBBLE|BANK GROUP|BANK|ROW|COLUMN'
spare resource avail=1 channel=2 rank=5 nibble_mask=a59c bank_group=2
bank=4 row=13 column=23 sub_channel=0
comp_id=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
comp_id_pldm_valid_flags='' pldm_entity_id=0x00 pldm_resource_id=0x00
Note: For the memory sparing event record, the fields 'maintenance
operation class' and 'maintenance operation subclass' are defined twice,
first in the common event record (Table 8-55) and second in the memory
sparing event record (Table 8-60). Thus those in the sparing event
record are coded as reserved, to be removed when the spec is updated.
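As a rough sketch (not the exact upstream code), the dispatch keys off
the record UUID seen in the trace output above; the helper and union
member names here are illustrative:
static const uuid_t mem_sparing_event_uuid =
	UUID_INIT(0xe71f3a40, 0x2d29, 0x4092,
		  0x8a, 0x39, 0x4d, 0x1c, 0x96, 0x6c, 0x7c, 0x65);
static void cxl_event_trace_record(struct cxl_memdev *cxlmd,
				   enum cxl_event_log_type type,
				   const uuid_t *uuid, union cxl_event *evt)
{
	/* Memory sparing record (CXL r3.2 Table 8-60)? Then trace it. */
	if (uuid_equal(uuid, &mem_sparing_event_uuid))
		trace_cxl_memory_sparing(cxlmd, type, &evt->mem_sparing);
	/* ... fall through to the other known record types ... */
}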
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250717101817.2104-5-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
According to the CXL Specification Revision 3.2, Section 8.2.10.2.1.2,
Table 8-58 (DRAM Event Record), the CVME (Corrected Volatile Memory Error)
Count field is valid under the following conditions:
1. The Threshold Event bit is set in the Memory Event Descriptor field,
and
2. The CVME Count must be greater than 0 for events where the Advanced
Programmable Threshold Counter has expired.
Additionally, if the Advanced Programmable Corrected Memory Error Counter
Expire bit in the Memory Event Type field is set, then the Threshold Event
bit in the Memory Event Descriptor field shall also be set.
Add validity checks for the above conditions while reporting the event
to userspace.
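A minimal sketch of the two checks, with assumed bit positions and field
encodings (the real definitions live in the CXL event headers); the
General Media variant below follows the same shape:
#define EVT_DESC_THRESHOLD_EVENT	BIT(0)	/* assumed bit position */
#define EVT_TYPE_AP_CME_COUNTER_EXPIRE	0x04	/* assumed encoding */
static bool cvme_count_valid(u8 descriptor, u8 type, u32 cvme_count)
{
	if (type == EVT_TYPE_AP_CME_COUNTER_EXPIRE) {
		/* Counter expiry implies the Threshold Event bit is set */
		if (!(descriptor & EVT_DESC_THRESHOLD_EVENT))
			return false;
		/* The count must be non-zero when the counter expired */
		if (!cvme_count)
			return false;
	}
	return true;
}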
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250717101817.2104-4-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
According to the CXL Specification Revision 3.2, Section 8.2.10.2.1.1,
Table 8-57 (General Media Event Record), the Corrected Memory Error Count
field is valid under the following conditions:
1. The Threshold Event bit is set in the Memory Event Descriptor field,
and
2. The Corrected Memory Error Count must be greater than 0 for events
where the Advanced Programmable Threshold Counter has expired.
Additionally, if the Advanced Programmable Corrected Memory Error Counter
Expire bit in the Memory Event Type field is set, then the Threshold Event
bit in the Memory Event Descriptor field shall also be set.
Add validity checks for the above conditions while reporting the event
to userspace.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250717101817.2104-3-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Use ACQUIRE() to clean up conditional locking paths in the CXL driver
The ACQUIRE() macro and its associated ACQUIRE_ERR() helpers, like
scoped_cond_guard(), arrange for scoped-based conditional locking. Unlike
scoped_cond_guard(), these macros arrange for an ERR_PTR() to be retrieved
representing the state of the conditional lock.
The goal of this conversion is to complete the removal of all explicit
unlock calls in the subsystem. I.e. the methods to acquire a lock are
solely via guard(), scoped_guard() (for limited cases), or ACQUIRE(). All
unlocking is implicit / scope-based. In order to make sure all lock sites
are converted, the existing rwsems are consolidated and renamed in 'struct
cxl_rwsem'. While that makes the patch noisier, it gives a clean cut-off
between the old world (explicit unlock allowed) and the new world
(explicit unlock deleted).
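A sketch of the resulting pattern, assuming a conditional guard class
named rwsem_read_intr (an interruptible read lock) and the consolidated
'struct cxl_rwsem'; the function body is illustrative:
static int cxl_do_region_work(struct cxl_region *cxlr)
{
	ACQUIRE(rwsem_read_intr, rwsem)(&cxl_rwsem.region);
	int rc = ACQUIRE_ERR(rwsem_read_intr, &rwsem);

	if (rc)
		return rc;	/* interrupted while waiting for the lock */

	/* ... work under cxl_rwsem.region ... */
	return 0;
	/* lock dropped at scope exit, never by an explicit up_read() */
}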
Cc: David Lechner <dlechner@baylibre.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250711234932.671292-9-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Towards removing all explicit unlock calls in the CXL subsystem, convert
the conditional poison list mutex to use a conditional lock guard.
Rename the lock to have the compiler validate that all existing call sites
are converted.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250711234932.671292-3-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Certain operations on memory, such as memory repair, are permitted
only when the address and other attributes for the operation are
from the current boot. This is determined by checking whether the
memory attributes for the operation match those in the CXL gen_media
or CXL DRAM memory event records reported during the current boot.
The CXL event records must be backed up because they are cleared
in the hardware after being processed by the kernel.
Support is added for storing CXL gen_media or CXL DRAM memory event
records in xarrays. Old records are deleted when they expire or when
there is an overflow; this depends on the platform correctly reporting
the Event Record Timestamp field of CXL spec Table 8-55 Common Event
Record Format.
Additionally, helper functions are implemented to find a matching
record in the xarray storage based on the memory attributes and
repair type.
Add a validity check when matching attributes for sparing, using the
validity flags in the DRAM event record, to ensure that all required
attributes for a requested repair operation are valid and set.
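A sketch of the backup step described above; the xarray member and
helper names are illustrative, not the exact upstream identifiers:
static int cxl_store_rec_dram(struct cxl_memdev *cxlmd,
			      struct cxl_event_dram *rec)
{
	struct cxl_event_dram *new, *old;
	u64 dpa = le64_to_cpu(rec->media_hdr.phys_addr);

	new = kmemdup(rec, sizeof(*rec), GFP_KERNEL);
	if (!new)
		return -ENOMEM;

	/* newest record for a DPA wins; drop any stale entry */
	old = xa_store(&cxlmd->rec_dram, dpa, new, GFP_KERNEL);
	if (xa_is_err(old)) {
		kfree(new);
		return xa_err(old);
	}
	kfree(old);
	return 0;
}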
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-7-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add CXL Features support: setup code for enabling in-kernel usage of CXL
Features. EDAC/RAS is expected to utilize CXL Features in the kernel for
things such as memory sparing. This is also preparation for enabling CXL
FWCTL support to issue allowed Features from user space.
|
|
CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE. If CONFIG_CXL_MCE is
not set, devm_cxl_register_mce_notifier() will return -EOPNOTSUPP,
causing cxl_mem_state_create() to fail and cxl pci device probing to
fail in turn. This case should not break cxl pci device probing.
Add a check in cxl_mem_state_create() for whether the return value of
devm_cxl_register_mce_notifier() is -EOPNOTSUPP; if so, just emit a
warning log and do not let cxl_mem_state_create() return an error.
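The resulting check is roughly (surrounding context simplified):
rc = devm_cxl_register_mce_notifier(dev, mds);
if (rc == -EOPNOTSUPP)
	dev_warn(dev, "CXL MCE unsupported\n");
else if (rc)
	return ERR_PTR(rc);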
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250227101848.388595-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add support for Extended Linear Cache for CXL. Add enumeration support
of the cache. Add MCE notification of the aliased memory address.
|
|
Add support for Global Persistent Flush (GPF) and dirty shutdown
accounting.
|
|
A series of CXL refactorings using scope-based resource management to
remove goto patterns on the cleanup paths.
|
|
Similar to how the acpi_nfit driver exports Optane dirty shutdown count,
introduce:
/sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown
Under the conditions that 1) dirty shutdown can be set, 2) Device GPF
DVSEC exists, and 3) the count itself can be retrieved.
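A sketch of the attribute shape; the provider-data member holding the
count is assumed:
static ssize_t dirty_shutdown_show(struct device *dev,
				   struct device_attribute *attr, char *buf)
{
	struct nvdimm *nvdimm = to_nvdimm(dev);
	struct cxl_nvdimm *cxl_nvd = nvdimm_provider_data(nvdimm);

	/* cxl_nvd->dirty_shutdowns is an assumed member name */
	return sysfs_emit(buf, "%llu\n", cxl_nvd->dirty_shutdowns);
}
static DEVICE_ATTR_RO(dirty_shutdown);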
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250220220235.276831-4-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
... to a better-suited 'cxl_arm_dirty_shutdown()'.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250220220235.276831-3-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add support for GPF flows. The CXL specification around this turns out
to be a bit too involved from the driver side. And while this should
really all be handled by the hardware, this patch takes things with a
grain of salt.
Upon respective port enumeration, both phase timeouts are set to
a max of 20 seconds, which is the NMI watchdog default for lockup
detection. The premise is that the kernel does not have enough
information to set anything better than a max across the board
and hope devices finish their GPF flows within the platform energy
budget.
Timeout detection is based on Dirty Shutdown semantics. The driver will
mark it as dirty, expecting the device to clear it upon a successful GPF
event. The admin may consult the device Health and check the dirty
shutdown counter to see if there was a problem with data integrity.
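A sketch of arming the dirty state via the Set Shutdown State mailbox
command; the opcode constant and mailbox plumbing are assumed:
static int cxl_arm_dirty_shutdown(struct cxl_memdev_state *mds)
{
	struct cxl_mbox_set_shutdown_state_in in = { .state = 1 };
	struct cxl_mbox_cmd mbox_cmd = {
		.opcode = CXL_MBOX_OP_SET_SHUTDOWN_STATE,
		.size_in = sizeof(in),
		.payload_in = &in,
	};

	/* mark dirty now; a clean GPF flow is expected to clear it */
	return cxl_internal_send_cmd(&mds->cxlds.cxl_mbox, &mbox_cmd);
}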
[ davej: Explicitly set return to 0 in update_gpf_port_dvsec() ]
[ davej: Add spec reference for 'struct cxl_mbox_set_shutdown_state_in ]
[ davej: Fix 0-day reported issue ]
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250124233533.910535-1-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The next volatile and next persistent values are unused and are
cluttering the cxl_memdev_state.
Remove these values.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250206-cxl-cleanup-v1-1-9ddf26dd8433@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In cxl_mem_sanitize(), the down_read() and up_read() for
cxl_region_rwsem can be simply replaced by a guard(rwsem_read), and the
local variable 'rc' can be removed.
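The shape of the conversion (function body illustrative):
/* before */
down_read(&cxl_region_rwsem);
rc = do_sanitize(cxlmd);
up_read(&cxl_region_rwsem);
return rc;

/* after: the read lock drops automatically at scope exit, and the
 * local 'rc' goes away
 */
guard(rwsem_read)(&cxl_region_rwsem);
return do_sanitize(cxlmd);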
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250221012453.126366-3-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Below is a setup with an extended linear cache configuration, with an
example memory region layout shown below: a single memory region
consisting of 256G of memory, where 128G is DRAM and 128G is CXL memory.
The kernel sees the region as 256G of system memory in total.
128G DRAM 128G CXL memory
|-----------------------------------|-------------------------------------|
Data resides in either DRAM or far memory (FM) with no replication. Hot
data is swapped into DRAM by the hardware behind the scenes. When an
error is detected in one location, it is possible that the error also
resides in the aliased location. Therefore, when a memory location that
is flagged by MCE is part of the special region, the aliased memory
location needs to be offlined as well.
Add an mce notify callback to identify if the MCE address location is part
of an extended linear cache region and handle accordingly.
Add a symbol export for set_mce_nospec() in x86 code in order to call
set_mce_nospec() from the CXL MCE notify callback.
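A sketch of the notifier flow; the alias-lookup helper is an assumption,
not an upstream function:
static int cxl_mce_handler(struct notifier_block *nb, unsigned long val,
			   void *data)
{
	struct mce *mce = data;
	u64 alias;

	/* is the MCE address part of an extended linear cache region? */
	if (!cxl_extended_cache_alias(nb, mce->addr, &alias))
		return NOTIFY_DONE;

	/* unmap the alias to prevent speculative access, then offline it */
	set_mce_nospec(alias >> PAGE_SHIFT);
	memory_failure_queue(alias >> PAGE_SHIFT, 0);
	return NOTIFY_OK;
}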
Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add the aliased address of the extended linear cache when emitting event
traces for the poison, DRAM, and general media CXL events.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250226162224.3633792-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add feature commands enumeration code in order to detect and enumerate
the 3 feature-related commands "get supported features", "get feature",
and "set feature". The enumeration will help determine whether the driver
can issue any of the 3 commands to the device.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
With 'struct cxl_mailbox' context introduced, the helper functions
cxl_query_cmd() and cxl_send_cmd() can take a cxl_mailbox directly
rather than a cxl_memdev parameter. Refactor to use cxl_mailbox
directly.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250204220430.4146187-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The pending efforts to add CXL Accelerator (type-2) device [1], and
Dynamic Capacity (DCD) support [2], tripped on the
no-longer-fit-for-purpose design in the CXL subsystem for tracking
device-physical-address (DPA) metadata. Trip hazards include:
- CXL Memory Devices need to consider a PMEM partition, but Accelerator
devices with CXL.mem likely do not in the common case.
- CXL Memory Devices enumerate DPA through Memory Device mailbox
commands like Partition Info; Accelerator devices do not.
- CXL Memory Devices that support DCD support more than 2 partitions.
Some of the driver algorithms are awkward to expand to > 2 partition
cases.
- DPA performance data is a general capability that can be shared with
accelerators, so tracking it in 'struct cxl_memdev_state' is no longer
suitable.
- Hardcoded assumptions around the PMEM partition always being index-1
if RAM is zero-sized or PMEM is zero sized.
- 'enum cxl_decoder_mode' is sometimes a partition id and sometimes a
memory property; it should be phased out in favor of a partition id,
with the memory property coming from the partition info.
Towards cleaning up those issues and allowing a smoother landing for the
aforementioned pending efforts, introduce a 'struct cxl_dpa_partition'
array to 'struct cxl_dev_state', and 'struct cxl_range_info' as a shared
way for Memory Devices and Accelerators to initialize the DPA information
in 'struct cxl_dev_state'.
For now, split a new cxl_dpa_setup() from cxl_mem_create_range_info() to
get the new data structure initialized, and clean up some qos_class init.
Follow on patches will go further to use the new data structure to
cleanup algorithms that are better suited to loop over all possible
partitions.
cxl_dpa_setup() follows the locking expectations of mutating the device
DPA map, and is suitable for Accelerator drivers to use. Accelerators
likely only have one hardcoded 'ram' partition to convey to the
cxl_core.
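The direction of the data structure, as an illustrative sketch (field
layout not authoritative):
struct cxl_dpa_partition {
	struct range range;		/* DPA span of the partition */
	struct cxl_dpa_perf perf;	/* replaces ram_perf/pmem_perf */
	enum cxl_partition_mode mode;	/* ram, pmem, ... */
};

struct cxl_dev_state {
	/* ... */
	unsigned int nr_partitions;
	struct cxl_dpa_partition part[CXL_NR_PARTITIONS_MAX];
};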
Link: http://lore.kernel.org/20241230214445.27602-1-alejandro.lucero-palau@amd.com [1]
Link: http://lore.kernel.org/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com [2]
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864305827.668823.13978794102080021276.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In preparation for consolidating all DPA partition information into an
array of DPA metadata, introduce helpers that hide the layout of the
current data. I.e. make the eventual replacement of ->ram_res,
->pmem_res, ->ram_perf, and ->pmem_perf with a new DPA metadata array a
no-op for code paths that consume that information, and reduce the noise
of follow-on patches.
The end goal is to consolidate all DPA information in 'struct
cxl_dev_state', but for now the helpers just make it appear that all DPA
metadata is relative to @cxlds.
As the conversion to generic partition metadata walking is completed,
these helpers will naturally be eliminated, or reduced in scope.
Cc: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864305238.668823.16553986866633608541.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Clean up the existing export namespace code along the same lines of
commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.
Scripted using
git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
do
	awk -i inplace '
		/^#define EXPORT_SYMBOL_NS/ {
			gsub(/__stringify\(ns\)/, "ns");
			print;
			next;
		}
		/^#define MODULE_IMPORT_NS/ {
			gsub(/__stringify\(ns\)/, "ns");
			print;
			next;
		}
		/MODULE_IMPORT_NS/ {
			$0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
		}
		/EXPORT_SYMBOL_NS/ {
			if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
				if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
				    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
				    $0 !~ /^my/) {
					getline line;
					gsub(/[[:space:]]*\\$/, "");
					gsub(/[[:space:]]/, "", line);
					$0 = $0 " " line;
				}
				$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
					    "\\1(\\2, \"\\3\")", "g");
			}
		}
		{ print }' $file;
done
Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
|
The code returns the min of n_commands and the total number of commands,
but the comment incorrectly says it's the max(). Correct the comment to
min().
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20240913223216.3234173-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
With the CXL mailbox context split out, cxl_internal_send_cmd() can take
'struct cxl_mailbox' as an input parameter rather than
'struct cxl_memdev_state'. Change the input parameter for
cxl_internal_send_cmd() and fix up all impacted call sites.
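In short, the signature changes from:
int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
			  struct cxl_mbox_cmd *mbox_cmd);
to:
int cxl_internal_send_cmd(struct cxl_mailbox *cxl_mbox,
			  struct cxl_mbox_cmd *mbox_cmd);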
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Create a new 'struct cxl_mailbox' and move all mailbox related bits to
it. This allows isolation of all CXL mailbox data in order to export
some of the calls to external kernel callers and avoid exporting of CXL
driver-specific bits such as device states. The allocation of
'struct cxl_mailbox' is also split out with cxl_mailbox_init() so the
mailbox can be created independently.
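An illustrative shape of the split-out context (field selection
approximate, not the authoritative definition):
struct cxl_mailbox {
	struct device *host;		/* device backing the mailbox */
	size_t payload_size;		/* max command payload size */
	struct mutex mbox_mutex;	/* serializes mailbox access */
	struct rcuwait mbox_wait;
	int (*mbox_send)(struct cxl_mailbox *cxl_mbox,
			 struct cxl_mbox_cmd *cmd);
};

int cxl_mailbox_init(struct cxl_mailbox *cxl_mbox, struct device *host);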
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
A device_lock() and device_unlock() pair can be replaced by the cleanup
helpers scoped_guard() or guard(), which can enhance code readability. The
CXL subsystem still uses device_lock() and device_unlock() pairs for cxl
port resource protection; most of them can simply be replaced by a
scoped_guard() or a guard().
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240830013138.2256244-2-ming4.li@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Series to fix XOR math for DPA to SPA translation
- Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa()
- Complete DPA->HPA->SPA translation and correct XOR translation issue
- Add new method to verify a CXL target position
- Remove old method of CXL target position verification
|
|
Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA
addresses, the work it performs is a DPA to HPA translation, not a
trace. Tidy up this naming by moving the minimal work done in
cxl_trace_hpa() into cxl_dpa_to_hpa() and using cxl_dpa_to_hpa()
for trace event callbacks.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/452a9b0c525b774c72d9d5851515ffa928750132.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
cxl_event_common was an unfortunate naming choice and caused confusion with
the existing Common Event Record. Furthermore, its fields didn't map all
the common information between DRAM and General Media Events.
Remove cxl_event_common and introduce cxl_event_media_hdr to record common
information between DRAM and General Media events.
cxl_event_media_hdr, which is embedded in both cxl_event_gen_media and
cxl_event_dram, leverages the commonalities between the two events to
simplify their respective handling.
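Roughly, the shared header and its embedding look like this
(illustrative layout, not the authoritative struct definition):
struct cxl_event_media_hdr {
	struct cxl_event_record_hdr hdr;
	__le64 phys_addr;
	u8 descriptor;
	u8 type;
	u8 transaction_type;
	u8 validity_flags[2];
	u8 channel;
	u8 rank;
} __packed;

struct cxl_event_gen_media {
	struct cxl_event_media_hdr media_hdr;
	/* general-media-specific fields follow */
} __packed;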
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240607144423.48681-1-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Support for DPA to HPA translation for the CXL events cxl_dram and
cxl_general_media.
|
|
User space may need to know which region, if any, maps the DPAs
(device physical addresses) reported in a cxl_general_media or
cxl_dram event. Since the mapping can change, the kernel provides
this information at the time the event occurs. This informs user
space that at event <timestamp> this <region> mapped this <DPA>
to this <HPA>.
Add the same region info that is included in the cxl_poison trace
event: the DPA->HPA translation, region name, and region uuid.
The new fields are inserted in the trace event and no existing
fields are modified. If the DPA is not mapped, the user will see:
hpa=ULLONG_MAX, region="", and uuid=0
This work must be protected by dpa_rwsem & region_rwsem since
it is looking up region mappings.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/dd8d708b7a7ebfb64a27020a5eb338091336b34d.1714496730.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Adding UAPI support for the CXL r3.1 8.2.9.5.4 Clear Log command.
This patch is useful for clearing and populating the Vendor debug
log in certain scenarios, allowing for the aggregation of results
over time.
Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240313071218.729-3-sthanneeru.opensrc@micron.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Adding UAPI support for
1. CXL r3.1 8.2.9.5.3 Get Log Capabilities.
2. CXL r3.1 8.2.9.5.6 Get Supported Logs Sub-List.
Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240313071218.729-2-sthanneeru.opensrc@micron.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
A recent change to cxl_mem_get_records_log() [1] highlighted a subtle
nuance of looping calls to cxl_internal_send_cmd(), i.e. that
cxl_internal_send_cmd() modifies the 'size_out' member of the @mbox_cmd
argument. That mechanism is useful for communicating underflow, but it
is unwanted when reusing @mbox_cmd for a subsequent submission. It turns
out that cxl_xfer_log() avoids this scenario by always redefining
@mbox_cmd each iteration.
Update cxl_mem_get_records_log() and cxl_mem_get_poison() to follow the
same style as cxl_xfer_log(), i.e. re-define @mbox_cmd each iteration.
The cxl_mem_get_records_log() change is just a style fixup, but the
cxl_mem_get_poison() change is a potential fix, per Alison [2]:
Poison list retrieval can hit this case if the MORE flag is set and
a follow-on read of the list delivers more records than the previous
read, i.e. the device gives one record, sets the _MORE flag, then gives 5.
Not an urgent fix since this behavior has not been seen in the wild,
but worth tracking as a fix.
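The per-iteration style looks roughly like this (payload fields
illustrative):
do {
	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
		.payload_in = &log_type,
		.size_in = sizeof(log_type),
		.payload_out = payload,
		.size_out = mds->payload_size,	/* reset every pass */
		.min_out = struct_size(payload, records, 0),
	};

	rc = cxl_internal_send_cmd(mds, &mbox_cmd);
	if (rc)
		break;
	/* mbox_cmd.size_out now holds the actual output size */
} while (payload->flags & CXL_GET_EVENT_FLAG_MORE_RECORDS);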
Cc: Kwangjin Ko <kwangjin.ko@sk.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Fixes: ed83f7ca398b ("cxl/mbox: Add GET_POISON_LIST mailbox command")
Link: http://lore.kernel.org/r/20240402081404.1106-2-kwangjin.ko@sk.com [1]
Link: http://lore.kernel.org/r/ZhAhAL/GOaWFrauw@aschofie-mobl2 [2]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/171235441633.2716581.12330082428680958635.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Since mbox_cmd.size_out is overwritten with the actual output size in
the function below, it needs to be initialized every time.
cxl_internal_send_cmd -> __cxl_pci_mbox_send_cmd
Problem scenario:
1) The size_out variable is initially set to the size of the mailbox.
2) Read an event.
- size_out is set to 160 bytes (header 32B + one event 128B).
- Two events are created while reading.
3) Read the new *two* events.
- size_out is still set to 160 bytes.
- Although the value of out_len is 288 bytes, only 160 bytes are
copied from the mailbox register to the local variable.
- record_count is set to 2.
- Accessing records[1] will result in reading incorrect data.
Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The dev_dbg info for the Clear Event Records mailbox command would report
the handle of the next record to clear, not the current one.
This was because the index 'i' had been incremented before printing the
current handle value.
Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
In order to address the issue with being able to expose qos_class sysfs
attributes under 'ram' and 'pmem' sub-directories, the attributes must
be defined as static attributes rather than under driver->dev_groups.
To avoid implementing locking for accessing the 'struct cxl_dpa_perf'
lists, convert the list to a single 'struct cxl_dpa_perf' entry in
preparation for moving the attributes to static definitions.
While theoretically a partition may have multiple qos_class values via
CDAT, this has not been encountered in testing on available hardware. The
code is simplified for now to not support the complex case until a use
case needs it.
Link: https://lore.kernel.org/linux-cxl/65b200ba228f_2d43c29468@dwillia2-mobl3.amr.corp.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20240206190431.1810289-2-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Pick up the CPER to CXL driver integration work for v6.8. Some
additional cleanup of cper_estatus_print() messages is needed, but that
is to be handled incrementally.
|
|
If the firmware has configured CXL event support to be firmware first,
the OS can process those events through CPER records. The CXL layer has
unique DPA to HPA knowledge and standard event trace parsing in place.
CPER records contain Bus, Device, Function information which can be used
to identify the PCI device which is sending the event.
Change the PCI driver registration to include registration of a CXL
CPER callback to process events through the trace subsystem.
Use new scope-based management to simplify the handling of the PCI
device object.
Tested-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-9-1bb8a4ca2c7a@intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
[djbw: use new pci_dev guard, flip init order]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The CXL CPER and event log records share everything but a UUID/GUID in
their structures.
Define a cxl_event union without the UUID/GUID to be shared between the
CPER and event log record formats. Adjust the code to use this union.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-6-1bb8a4ca2c7a@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The UEFI CXL CPER structure does not include the UUID. Now that the
UUID is passed separately to the trace event, there is no need to have
the UUID in those structures.
Move the UUID from the event record header to the raw structures. Adjust
cxl-test to create dummy structures for creating test records.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-5-1bb8a4ca2c7a@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The UUID data is redundant in the known event trace types. The addition
of static defines allows the trace macros to create the UUID data inside
the trace, thus removing unnecessary code.
Have well-known trace events use static data to set the uuid field based
on the event type.
Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-4-1bb8a4ca2c7a@intel.com
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Dan points out in review that the cxl_test code could be made better
through the use of UUID defines rather than being open coded.[1]
Create UUID defines and use them rather than open coding them.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lore.kernel.org/r/65738d09e30e2_45e0129451@dwillia2-xfh.jf.intel.com.notmuch [1]
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-3-1bb8a4ca2c7a@intel.com
[djbw: clang-format uuid definitions]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL CPER events are identified by the CPER Section Type GUID. The GUID
correlates with the CXL UUID for the event record. It turns out that a
CXL CPER record is a strict subset of the CXL event record; only the
UUID header field is chopped.
In order to unify handling between native and CPER flavors of CXL
events, prepare the code for the UUID to be passed in rather than
inferred from the record itself.
Later patches update the passed in record to only refer to the common
data between the formats.
Pass the UUID explicitly to each trace event to be able to remove the
UUID from the event structures.
Originally it was desirable to remove the UUID from the well-known
events because the UUID value was redundant. However, the trace API was
already in place.[1]
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/all/36f2d12934d64a278f2c0313cbd01abc@huawei.com [1]
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-1-1bb8a4ca2c7a@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Pick up the CDAT parsing and QOS class infrastructure for v6.8.
|
|
Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from
the return package. Create a list of entries in the cxl_memdev context and
store the QTG ID as qos_class token and the associated DPA range. This
information can be exposed to user space via sysfs in order to help region
setup for hot-plugged CXL memory devices.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/170319625109.2212653.11872111896220384056.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add the call to the UAPI such that userspace may correlate the
timestamps from the device log with system wall time if, for
example, there's any sort of inaccuracy or skew in the device.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230829152014.15452-1-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|