summaryrefslogtreecommitdiff
path: root/drivers/cxl/core
AgeCommit message (Collapse)Author
2025-01-29Merge tag 'cxl-for-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: "A tweak to the HMAT output that was acked by Rafael, a prep patch for CXL type2 devices support that's coming soon, refactoring of the CXL regblock enumeration code, and a series of patches to update the event records to CXL spec r3.1: - Move HMAT printouts to pr_debug() - Add CXL type2 support to cxl_dvsec_rr_decode() in preparation for type2 support - A series that updates CXL event records to spec r3.1 and related changes - Refactoring of cxl_find_regblock_instance() to count regblocks" * tag 'cxl-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/regs: Refactor out functions to count regblocks of given type cxl/test: Update test code for event records to CXL spec rev 3.1 cxl/events: Update Memory Module Event Record to CXL spec rev 3.1 cxl/events: Update DRAM Event Record to CXL spec rev 3.1 cxl/events: Update General Media Event Record to CXL spec rev 3.1 cxl/events: Add Component Identifier formatting for CXL spec rev 3.1 cxl/events: Update Common Event Record to CXL spec rev 3.1 cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode() ACPI/HMAT: Move HMAT messages to pr_debug()
2025-01-22cxl/core/regs: Refactor out functions to count regblocks of given typeHuaisheng Ye
cxl_find_regblock_instance() counts the number of instances of a register block as a side effect of searching through all available register blocks. cxl_count_regblock() throws away that work and recounts all the register blocks by asking cxl_find_regblock_instance() to redo work it has already done until it finally returns an error, that is needlessly wasteful. Let cxl_count_regblock() leverage the counting that cxl_find_regblock_instance() already does by passing in a sentinel value (CXL_INSTANCES_COUNT) that triggers the count to be returned. [ davej: Updated to more concise commit log supplied by djbw ] [ davej: Fix up checkpatch formatting warnings ] Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250115152600.26482-2-huaisheng.ye@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Update Memory Module Event Record to CXL spec rev 3.1Shiju Jose
CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event Record has updated with following new fields and new info for Device Event Type and Device Health Information fields. 1. Validity Flags 2. Component Identifier 3. Device Event Sub-Type Update the Memory Module event record and Memory Module trace event for the above spec changes. The new fields are inserted in logical places. Example trace print of cxl_memory_module trace event, cxl_memory_module: memdev=mem3 host=0000:0f:00.0 serial=3 log=Fatal : \ time=371709344709 uuid=fe927475-dd59-4339-a586-79bab113b774 len=128 \ flags='0x1' handle=2 related_handle=0 maint_op_class=0 \ maint_op_sub_class=0 : event_type='Temperature Change' \ event_sub_type='Unsupported Config Data' \ health_status='MAINTENANCE_NEEDED|REPLACEMENT_NEEDED' \ media_status='All Data Loss in Event of Power Loss' as_life_used=0x3 \ as_dev_temp=Normal as_cor_vol_err_cnt=Normal as_cor_per_err_cnt=Normal \ life_used=8 device_temp=3 dirty_shutdown_cnt=33 cor_vol_err_cnt=25 \ cor_per_err_cnt=45 validity_flags='COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='Resource ID' \ pldm_entity_id=0x00 pldm_resource_id=fc d2 7e 2f Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250111091756.1682-6-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Update DRAM Event Record to CXL spec rev 3.1Shiju Jose
CXL spec 3.1 section 8.2.9.2.1.2 Table 8-46, DRAM Event Record has updated with following new fields and new types for Memory Event Type, Transaction Type and Validity Flags fields. 1. Component Identifier 2. Sub-channel 3. Advanced Programmable Corrected Memory Error Threshold Event Flags 4. Corrected Memory Error Count at Event 5. Memory Event Sub-Type Update DRAM events record and DRAM trace event for the above spec changes. The new fields are inserted in logical places. Includes trivial consistency of white space improvements. Example trace print of cxl_dram trace event, cxl_dram: memdev=mem0 host=0000:0f:00.0 serial=3 log=Informational : \ time=54799339519 uuid=601dcbb3-9c06-4eab-b8af-4e9bfb5c9624 len=128 \ flags='0x1' handle=1 related_handle=0 maint_op_class=1 \ maint_op_sub_class=3 : dpa=18680 dpa_flags='' \ descriptor='UNCORRECTABLE_EVENT|THRESHOLD_EVENT' type='Data Path Error' \ sub_type='Media Link CRC Error' transaction_type='Internal Media Scrub' \ channel=3 rank=17 nibble_mask=3b00b2 bank_group=7 bank=11 row=2 \ column=77 cor_mask=21 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00 37 00 \ 00 00 00 00 00 00 42 00 00 00 00 00 00 00 validity_flags='CHANNEL|RANK|NIBBLE|\ BANK GROUP|BANK|ROW|COLUMN|CORRECTION MASK|COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=01 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='PLDM Entity ID' pldm_entity_id=74 c5 08 9a 1a 0b \ pldm_resource_id=0x00 hpa=ffffffffffffffff region= \ region_uuid=00000000-0000-0000-0000-000000000000 sub_channel=5 \ cme_threshold_ev_flags='Corrected Memory Errors in Multiple Media Components|\ Exceeded Programmable Threshold' cvme_count=148 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250111091756.1682-5-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Update General Media Event Record to CXL spec rev 3.1Shiju Jose
CXL spec rev 3.1 section 8.2.9.2.1.1 Table 8-45, General Media Event Record has updated with following new fields and new types for Memory Event Type and Transaction Type fields. 1. Advanced Programmable Corrected Memory Error Threshold Event Flags 2. Corrected Memory Error Count at Event 3. Memory Event Sub-Type The format of component identifier has changed (CXL spec 3.1 section 8.2.9.2.1 Table 8-44). Update the general media event record and general media trace event for the above spec changes. The new fields are inserted in logical places. Example trace log of cxl_general_media trace event, cxl_general_media: memdev=mem0 host=0000:0f:00.0 serial=3 log=Fatal : \ time=156831237413 uuid=fbcd0a77-c260-417f-85a9-088b1621eba6 len=128 \ flags='0x1' handle=1 related_handle=0 maint_op_class=2 \ maint_op_sub_class=4 : dpa=30d40 dpa_flags='' \ descriptor='UNCORRECTABLE_EVENT|THRESHOLD_EVENT|POISON_LIST_OVERFLOW' \ type='TE State Violation' sub_type='Media Link Command Training Error' \ transaction_type='Host Inject Poison' channel=3 rank=33 device=5 \ validity_flags='CHANNEL|RANK|DEVICE|COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='PLDM Entity ID | Resource ID' \ pldm_entity_id=74 c5 08 9a 1a 0b pldm_resource_id=fc d2 7e 2f \ hpa=ffffffffffffffff region= \ region_uuid=00000000-0000-0000-0000-000000000000 \ cme_threshold_ev_flags='Corrected Memory Errors in Multiple Media \ Components|Exceeded Programmable Threshold' cme_count=120 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250111091756.1682-4-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Add Component Identifier formatting for CXL spec rev 3.1Shiju Jose
Add Component Identifier formatting for CXL spec rev 3.1, Section 8.2.9.2.1, Table 8-44. Examples for Component Identifier format in trace log, validity_flags='CHANNEL|RANK|DEVICE|COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='PLDM Entity ID | Resource ID' \ pldm_entity_id=74 c5 08 9a 1a 0b pldm_resource_id=fc d2 7e 2f \ validity_flags='COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=02 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='Resource ID' \ pldm_entity_id=0x00 pldm_resource_id=fc d2 7e 2f If the validity flags for component ID/component ID format or PLDM ID or resource ID are not set, then pldm_entity_id=0x00 or pldm_resource_id=0x00 would be printed. Component identifier formatting is used in the subsequent patches. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250111091756.1682-3-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13cxl/events: Update Common Event Record to CXL spec rev 3.1Shiju Jose
CXL spec 3.1 section 8.2.9.2.1 Table 8-42, Common Event Record format has updated with Maintenance Operation Subclass information. Add updates for the above spec change in the CXL events record and CXL common trace event implementations. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://patch.msgid.link/20250111091756.1682-2-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-01-13Merge 6.13-rc7 into driver-core-nextGreg Kroah-Hartman
We need the debugfs / driver-core fixes in here as well for testing and to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-10driver core: Correct API device_for_each_child_reverse_from() prototypeZijun Hu
For API device_for_each_child_reverse_from(..., const void *data, int (*fn)(struct device *dev, const void *data)) - Type of @data is const pointer, and means caller's data @*data is not allowed to be modified, but that usually is not proper for such non finding device iterating API. - Types for both @data and @fn are not consistent with all other for_each device iterating APIs device_for_each_child(_reverse)(), bus_for_each_dev() and (driver|class)_for_each_device(). Correct its prototype by removing const from parameter types, then adapt for various existing usages. An dedicated typedef device_iter_t will be introduced as @fn() type for various for_each device interating APIs later. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250105-class_fix-v6-6-3a2f1768d4d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03cxl/pmem: Remove is_cxl_nvdimm_bridge()Zijun Hu
Remove is_cxl_nvdimm_bridge() which has no caller now. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-11-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03cxl/pmem: Replace match_nvdimm_bridge() with API device_match_type()Zijun Hu
Static match_nvdimm_bridge(), as matching function of device_find_child() matches a device with device type @cxl_nvdimm_bridge_type, and its task can be simplified by the recently introduced API device_match_type(). Replace match_nvdimm_bridge() usage with device_match_type(). Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-10-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-03driver core: Constify API device_find_child() and adapt for various usagesZijun Hu
Constify the following API: struct device *device_find_child(struct device *dev, void *data, int (*match)(struct device *dev, void *data)); To : struct device *device_find_child(struct device *dev, const void *data, device_match_t match); typedef int (*device_match_t)(struct device *dev, const void *data); with the following reasons: - Protect caller's match data @*data which is for comparison and lookup and the API does not actually need to modify @*data. - Make the API's parameters (@match)() and @data have the same type as all of other device finding APIs (bus|class|driver)_find_device(). - All kinds of existing device match functions can be directly taken as the API's argument, they were exported by driver core. Constify the API and adapt for various existing usages. BTW, various subsystem changes are squashed into this commit to meet 'git bisect' requirement, and this commit has the minimal and simplest changes to complement squashing shortcoming, and that may bring extra code improvement. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Acked-by: Uwe Kleine-König <ukleinek@kernel.org> # for drivers/pwm Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20241224-const_dfc_done-v5-4-6623037414d4@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-02cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode()Alejandro Lucero
In cxl_dvsec_rr_decode() the pci driver expects to retrieve a cxlds, struct cxl_dev_state, from the driver_data field of struct device. While that works for Type 3, drivers for Type 1/2 devices may not put a cxlds in the driver_data field. In preparation for supporting Type 1/2 devices, replace parameter 'struct device' with 'struct cxl_dev_state' in cxl_dvsec_rr_decode(). Remove the unused parameter 'cxl_port' in cxl_dvsec_rr_decode(). Signed-off-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20241203162112.5088-1-alucerop@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-12-10cxl/region: Fix region creation for greater than x2 switchesHuaisheng Ye
The cxl_port_setup_targets() algorithm fails to identify valid target list ordering in the presence of 4-way and above switches resulting in 'cxl create-region' failures of the form: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 cxl region: create_region: region0: failed to set target7 to mem0 cxl region: cmd_create_region: created 0 regions [kernel debug message] check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2 bus_remove_device:574: bus: 'cxl': remove device region0 QEMU can create this failing topology: ACPI0017:00 [root0] | HB_0 [port1] / \ RP_0 RP_1 | | USP [port2] USP [port3] / / \ \ / / \ \ DSP DSP DSP DSP DSP DSP DSP DSP | | | | | | | | mem4 mem6 mem2 mem7 mem1 mem3 mem5 mem0 Pos: 0 2 4 6 1 3 5 7 HB: Host Bridge RP: Root Port USP: Upstream Port DSP: Downstream Port ...with the following command steps: $ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg \ -smp cpus=8 \ -m 8G \ -hda /home/work/vm-images/centos-stream8-02.qcow2 \ -object memory-backend-ram,size=4G,id=m0 \ -object memory-backend-ram,size=4G,id=m1 \ -object memory-backend-ram,size=2G,id=cxl-mem0 \ -object memory-backend-ram,size=2G,id=cxl-mem1 \ -object memory-backend-ram,size=2G,id=cxl-mem2 \ -object memory-backend-ram,size=2G,id=cxl-mem3 \ -object memory-backend-ram,size=2G,id=cxl-mem4 \ -object memory-backend-ram,size=2G,id=cxl-mem5 \ -object memory-backend-ram,size=2G,id=cxl-mem6 \ -object memory-backend-ram,size=2G,id=cxl-mem7 \ -numa node,memdev=m0,cpus=0-3,nodeid=0 \ -numa node,memdev=m1,cpus=4-7,nodeid=1 \ -netdev user,id=net0,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=net0 \ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \ -device cxl-upstream,bus=root_port0,id=us0 \ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \ -device cxl-upstream,bus=root_port1,id=us1 \ -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \ -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \ -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \ -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \ -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \ -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \ -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \ -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G & In Guest OS: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 Fix the method to calculate @distance by iterativeley multiplying the number of targets per switch port. This also follows the algorithm recommended here [1]. Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Link: http://lore.kernel.org/6538824b52349_7258329466@dwillia2-xfh.jf.intel.com.notmuch [1] Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> [djbw: add a comment explaining 'distance'] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-22Merge tag 'cxl-for-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dave Jiang: - Constify range_contains() input parameters to prevent changes - Add support for displaying RCD capabilities in sysfs to support lspci for CXL device - Downgrade warning message to debug in cxl_probe_component_regs() - Add support for adding a printf specifier '%pra' to emit 'struct range' content: - Add sanity tests for 'struct resource' - Add documentation for special case - Add %pra for 'struct range' - Add %pra usage in CXL code - Add preparation code for DCD support: - Add range_overlaps() - Add CDAT DSMAS table shared and read only flag in ACPICA - Add documentation to 'struct dev_dax_range' - Delay event buffer allocation in CXL PCI code until needed - Use guard() in cxl_dpa_set_mode() - Refactor create region code to consolidate common code * tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Refactor common create region code cxl/hdm: Use guard() in cxl_dpa_set_mode() cxl/pci: Delay event buffer allocation dax: Document struct dev_dax_range ACPI/CDAT: Add CDAT/DSMAS shared and read only flag values range: Add range_overlaps() cxl/cdat: Use %pra for dpa range outputs printf: Add print format (%pra) for struct range Documentation/printf: struct resource add start == end special case test printf: Add very basic struct resource tests cxl: downgrade a warning message to debug level in cxl_probe_component_regs() cxl/pci: Add sysfs attribute for CXL 1.1 device link status cxl/core/regs: Add rcd_pcie_cap initialization kernel/range: Const-ify range_contains parameters
2024-11-08Merge branch 'cxl/for-6.13/dcd-prep' into cxl-for-nextDave Jiang
Add preparation patches for coming soon DCD changes. - Add range_overlaps() - Add CDAT/DSMAS shared and read only flag in ACPICA - Add documentation to struct dev_dax_range - Delay event buffer allocation in CXL PCI - Use guard() in cxl_dpa_set_mode() - Refactor common create region code to reduce redudant code
2024-11-08cxl/region: Refactor common create region codeIra Weiny
create_pmem_region_store() and create_ram_region_store() are identical with the exception of the region mode. With the addition of DC region mode this would end up being 3 copies of the same code. Refactor create_pmem_region_store() and create_ram_region_store() to use a single common function to be used in subsequent DC code. Suggested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming4.li@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20241107-dcd-type2-upstream-v7-6-56a84e66bc36@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-11-08cxl/hdm: Use guard() in cxl_dpa_set_mode()Ira Weiny
Additional DCD functionality is being added to this call which will be simplified by the use of guard() with the cxl_dpa_rwsem. Convert the function to use guard() prior to adding DCD functionality. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20241107-dcd-type2-upstream-v7-5-56a84e66bc36@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28Merge branch 'cxl/for-6.12/printf' into cxl-for-nextDave Jiang
Add support for adding a printf specifier '$pra' to emit 'struct range' content.
2024-10-28cxl/cdat: Use %pra for dpa range outputsIra Weiny
Now that there is a printf specifier for struct range use it to enhance the debug output of CDAT data. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://patch.msgid.link/20241025-cxl-pra-v2-4-123a825daba2@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28cxl: downgrade a warning message to debug level in cxl_probe_component_regs()Coly Li
In cxl_probe_component_regs() the error message "Couldn't locate the CXL.cache and CXL.mem capability array header." is potentially a false positive error condition. Downgrade the message from error level to debug level by using dev_dbg() to print the message, and the end users won't worry about the message anymore. [djbw/iweiny: Fix up changelog] Reported-by: Kelvin Shieh <kshieh@lenovo.com> Signed-off-by: Coly Li <colyli@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Cc: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20241021050443.318712-1-colyli@suse.de Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-28cxl/core/regs: Add rcd_pcie_cap initializationKobayashi,Daisuke
Add rcd_pcie_cap and its initialization to cache the offset of cxl1.1 device link status information. By caching it, avoid the walking memory map area to find the offset when output the register value. Given that this solution involves port lookups via cxl_pci_find_port() and multiple exit paths where that reference needs to be dropped, introduce a new put_cxl_root() scope-based-free handler. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Kobayashi,Daisuke <kobayashi.da-06@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20241002011549.408412-2-kobayashi.da-06@fujitsu.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-10-25cxl/port: Prevent out-of-order decoder allocationDan Williams
With the recent change to allow out-of-order decoder de-commit it highlights a need to strengthen the in-order decoder commit guarantees. As it stands match_free_decoder() ensures that if 2 regions are racing decoder allocations the one that wins the race will get the lower id decoder, but that still leaves the race to *commit* the decoder. Rather than have this complicated case of "reserved in-order, but may still commit out-of-order", just arrange for the reservation order to match the commit-order. In other words, prevent subsequent allocations until the last reservation is committed. This precludes overlapping region creation events and requires the previous regionN to either move forward to the decoder commit stage or drop its reservation before regionN+1 can move forward. That is, provided that regionN and regionN+1 decode through the same switch port. As a side effect this allows match_free_decoder() to drop its dependency on needing write access to the device_find_child() @data parameter [1]. Reported-by: Zijun Hu <quic_zijuhu@quicinc.com> Closes: http://lore.kernel.org/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964783668.81806.14962699553881333486.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Fix use-after-free, permit out-of-order decoder shutdownDan Williams
In support of investigating an initialization failure report [1], cxl_test was updated to register mock memory-devices after the mock root-port/bus device had been registered. That led to cxl_test crashing with a use-after-free bug with the following signature: cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1 cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1 cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0 1) cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1 [..] cxld_unregister: cxl decoder14.0: cxl_region_decode_reset: cxl_region region3: mock_decoder_reset: cxl_port port3: decoder3.0 reset 2) mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1 cxl_endpoint_decoder_release: cxl decoder14.0: [..] cxld_unregister: cxl decoder7.0: 3) cxl_region_decode_reset: cxl_region region3: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI [..] RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core] [..] Call Trace: <TASK> cxl_region_decode_reset+0x69/0x190 [cxl_core] cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x5d/0x60 [cxl_core] At 1) a region has been established with 2 endpoint decoders (7.0 and 14.0). Those endpoints share a common switch-decoder in the topology (3.0). At teardown, 2), decoder14.0 is the first to be removed and hits the "out of order reset case" in the switch decoder. The effect though is that region3 cleanup is aborted leaving it in-tact and referencing decoder14.0. At 3) the second attempt to teardown region3 trips over the stale decoder14.0 object which has long since been deleted. The fix here is to recognize that the CXL specification places no mandate on in-order shutdown of switch-decoders, the driver enforces in-order allocation, and hardware enforces in-order commit. So, rather than fail and leave objects dangling, always remove them. In support of making cxl_region_decode_reset() always succeed, cxl_region_invalidate_memregion() failures are turned into warnings. Crashing the kernel is ok there since system integrity is at risk if caches cannot be managed around physical address mutation events like CXL region destruction. A new device_for_each_child_reverse_from() is added to cleanup port->commit_end after all dependent decoders have been disabled. In other words if decoders are allocated 0->1->2 and disabled 1->2->0 then port->commit_end only decrements from 2 after 2 has been disabled, and it decrements all the way to zero since 1 was disabled previously. Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Cc: stable@vger.kernel.org Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()Dan Williams
It turns out since its original introduction, pre-2.6.12, bus_rescan_devices() has skipped devices that might be in the process of attaching or detaching from their driver. For CXL this behavior is unwanted and expects that cxl_bus_rescan() is a probe barrier. That behavior is simple enough to achieve with bus_for_each_dev() paired with call to device_attach(), and it is unclear why bus_rescan_devices() took the position of lockless consumption of dev->driver which is racy. The "Fixes:" but no "Cc: stable" on this patch reflects that the issue is merely by inspection since the bug that triggered the discovery of this potential problem [1] is fixed by other means. However, a stable backport should do no harm. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25cxl/events: Fix Trace DRAM Event RecordShiju Jose
CXL spec rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record. Fix decode memory event type field of DRAM Event Record. For e.g. if value is 0x1 it will be reported as an Invalid Address (General Media Event Record - Memory Event Type) instead of Scrub Media ECC Error (DRAM Event Record - Memory Event Type) and so on. Fixes: 2d6c1e6d60ba ("cxl/mem: Trace DRAM Event Record") Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20241014143003.1170-1-shiju.jose@huawei.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-24cxl/core: Return error when cxl_endpoint_gather_bandwidth() handles a ↵Li Zhijian
non-PCI device The function cxl_endpoint_gather_bandwidth() invokes pci_bus_read/write_XXX(), however, not all CXL devices are presently implemented via PCI. It is recognized that the cxl_test has realized a CXL device using a platform device. Calling pci_bus_read/write_XXX() in cxl_test will cause kernel panic: platform cxl_host_bridge.3: host supports CXL (restricted) Oops: general protection fault, probably for non-canonical address 0x3ef17856fcae4fbd: 0000 [#1] PREEMPT SMP PTI Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die_addr+0x38/0x60 ? exc_general_protection+0x1f5/0x4b0 ? asm_exc_general_protection+0x22/0x30 ? pci_bus_read_config_word+0x1c/0x60 pcie_capability_read_word+0x93/0xb0 pcie_link_speed_mbps+0x18/0x50 cxl_pci_get_bandwidth+0x18/0x60 [cxl_core] cxl_endpoint_gather_bandwidth.constprop.0+0xf4/0x230 [cxl_core] ? xas_store+0x54/0x660 ? preempt_count_add+0x69/0xa0 ? _raw_spin_lock+0x13/0x40 ? __kmalloc_cache_noprof+0xe7/0x270 cxl_region_shared_upstream_bandwidth_update+0x9c/0x790 [cxl_core] cxl_region_attach+0x520/0x7e0 [cxl_core] store_targetN+0xf2/0x120 [cxl_core] kernfs_fop_write_iter+0x13a/0x1f0 vfs_write+0x23b/0x410 ksys_write+0x53/0xd0 do_syscall_64+0x62/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e And Ying also reported a KASAN error with similar calltrace. Reported-by: Huang, Ying <ying.huang@intel.com> Closes: http://lore.kernel.org/87y12w9vp5.fsf@yhuang6-desk2.ccr.corp.intel.com Fixes: a5ab0de0ebaa ("cxl: Calculate region bandwidth of targets with shared upstream link") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Huang, Ying <ying.huang@intel.com> Link: https://patch.msgid.link/20241022030054.258942-1-lizhijian@fujitsu.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-02move asm/unaligned.h to linux/unaligned.hAl Viro
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-22cxl: Calculate region bandwidth of targets with shared upstream linkDave Jiang
The current bandwidth calculation aggregates all the targets. This simple method does not take into account where multiple targets sharing under a switch or a root port where the aggregated bandwidth can be greater than the upstream link of the switch. To accurately account for the shared upstream uplink cases, a new update function is introduced by walking from the leaves to the root of the hierarchy and clamp the bandwidth in the process as needed. This process is done when all the targets for a region are present but before the final values are send to the HMAT handling code cached access_coordinate targets. The original perf calculation path was kept to calculate the latency performance data that does not require the shared link consideration. The shared upstream link calculation is done as a second pass when all the endpoints have arrived. Testing is done via qemu with CXL hierarchy. run_qemu[1] is modified to support several CXL hierarchy layouts. The following layouts are tested: HB: Host Bridge RP: Root Port SW: Switch EP: End Point 2 HB 2 RP 2 EP: resulting bandwidth: 624 1 HB 2 RP 2 EP: resulting bandwidth: 624 2 HB 2 RP 2 SW 4 EP: resulting bandwidth: 624 Current testing, perf number from SRAT/HMAT is hacked into the kernel code. However with new QEMU support of Generic Target Port that's incoming, the perf data injection is no longer needed. [1]: https://github.com/pmem/run_qemu Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/linux-cxl/20240501152503.00002e60@Huawei.com/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20240904001316.1688225-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-22cxl: Preserve the CDAT access_coordinate for an endpointDave Jiang
Keep the access_coordinate from the CDAT tables for region perf calculations. The region perf calculation requires all participating endpoints to have arrived in order to determine if there are limitations of bandwidth data due to shared uplink. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20240904001316.1688225-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-18cxl: Fix comment regarding cxl_query_cmd() return dataDave Jiang
The code indicates that the min of n_commands and total commands is returned. The comment incorrectly says it's the max(). Correct comment to min(). Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240913223216.3234173-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-12cxl: Convert cxl_internal_send_cmd() to use 'struct cxl_mailbox' as inputDave Jiang
With the CXL mailbox context split out, cxl_internal_send_cmd() can take 'struct cxl_mailbox' as an input parameter rather than 'struct memdev_dev_state'. Change input parameter for cxl_internal_send_cmd() and fixup all impacted call sites. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-12cxl: Move mailbox related bits to the same contextDave Jiang
Create a new 'struct cxl_mailbox' and move all mailbox related bits to it. This allows isolation of all CXL mailbox data in order to export some of the calls to external kernel callers and avoid exporting of CXL driver specific bits such has device states. The allocation of 'struct cxl_mailbox' is also split out with cxl_mailbox_init() so the mailbox can be created independently. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alejandro Lucero <alucerop@amd.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl: move cxl headers to new include/cxl/ directoryDave Jiang
Group all cxl related kernel headers into include/cxl/ directory. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240905223711.1990186-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/region: Remove lock from memory notifier callbackIra Weiny
In testing Dynamic Capacity Device (DCD) support, a lockdep splat revealed an ABBA issue between the memory notifiers and the DCD extent processing code.[0] Changing the lock ordering within DCD proved difficult because regions must be stable while searching for the proper region and then the device lock must be held to properly notify the DAX region driver of memory changes. Dan points out in the thread that notifiers should be able to trust that it is safe to access static data. Region data is static once the device is realized and until it's destruction. Thus it is better to manage the notifiers within the region driver. Remove the need for a lock by ensuring the notifiers are active only during the region's lifetime. Furthermore, remove cxl_region_nid() because resource can't be NULL while the region is stable. Link: https://lore.kernel.org/all/66b4cf539a79b_a36e829416@iweiny-mobl.notmuch/ [0] Cc: Ying Huang <ying.huang@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240904-fix-notifiers-v3-1-576b4e950266@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: simplify the check of mem_enabled in cxl_hdm_decode_init()Yanfei Xu
Cases can be divided into two categories which are DVSEC range enabled and not enabled when HDM decoders exist but is not enabled. To avoid checking info->mem_enabled, which indicates the enablement of DVSEC range, every time, we can check !info->mem_enabled once in advance. This simplification can make the code clearer. No functional change intended. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-5-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Check Mem_info_valid bit for each applicable DVSECYanfei Xu
In theory a device might set the mem_info_valid bit for a first range after it is ready but before as second range has reached that state. Therefore, the correct approach is to check the Mem_info_valid bit for each applicable DVSEC range against HDM_COUNT, rather than only for the DVSEC range 1. Consequently, let's move the check into the "for loop" that handles each DVSEC range. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-4-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Remove duplicated implementation of waiting for memory_info_validYanfei Xu
commit ce17ad0d5498 ("cxl: Wait Memory_Info_Valid before access memory related info") added another implementation, which is cxl_dvsec_mem_range_valid(), of waiting for memory_info_valid without realizing it duplicated wait_for_valid(). Remove wait_for_valid() and retain cxl_dvsec_mem_range_valid() as the former is hardcoded to check only the Memory_Info_Valid bit of DVSEC range 1, while the latter allows for selection between DVSEC range 1 or 2 via parameter. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-3-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Fix to record only non-zero rangesYanfei Xu
The function cxl_dvsec_rr_decode() retrieves and records DVSEC ranges into info->dvsec_range[], regardless of whether it is non-zero range, and the variable info->ranges indicates the number of non-zero ranges. However, in cxl_hdm_decode_init(), the validation for info->dvsec_range[] occurs in a for loop that iterates based on info->ranges. It may result in zero range to be validated but non-zero range not be validated, in turn, the number of allowed ranges is to be 0. Address it by only record non-zero ranges. This fix is not urgent as it requires a configuration that zeroes out the first dvsec range while populating the second. This has not been observed, but it is theoretically possible. If this gets picked up for -stable, no harm done, but there is no urgency to backport. Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-2-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/pci: Remove duplicate host_bridge->native_aer checkingLi Ming
cxl_dport_init_ras_reporting() already checks host_bridge->native_aer before invoking cxl_disable_rch_root_ints(), so cxl_disable_rch_root_ints() does not need to check it again. Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830061308.2327065-3-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/pci: cxl_dport_map_rch_aer() cleanupLi Ming
cxl_dport_map_ras() is used to map CXL RAS capability, the RCH AER capability should not be mapped in the function but should mapped in cxl_dport_init_ras_reporting(). Moving cxl_dport_map_ras() out of cxl_dport_map_ras() and into cxl_dport_init_ras_reporting(). In cxl_dport_init_ras_reporting(), the AER capability position in RCRB will be located but the position is only used in cxl_dport_map_rch_aer(), getting the position in cxl_dport_map_rch_aer() rather than cxl_dport_init_ras_reporting() is more reasonable and makes the code clearer. Besides, some local variables in cxl_dport_map_rch_aer() are unnecessary, remove them to make the function more concise. Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830061308.2327065-2-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/pci: Rename cxl_setup_parent_dport() and cxl_dport_map_regs()Li Ming
The name of cxl_setup_parent_dport() function is not clear, the function is used to initialize AER and RAS capabilities on a dport, therefore, rename the function to cxl_dport_init_ras_reporting(), it is easier for user to understand what the function does. Besides, adjust the order of the function parameters, the subject of cxl_dport_init_ras_reporting() is a cxl dport, so a struct cxl_dport as the first parameter of the function should be better. cxl_dport_map_regs() is used to map CXL RAS capability on a cxl dport, using cxl_dport_map_ras() as the function name. Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830061308.2327065-1-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/port: Refactor __devm_cxl_add_port() to drop goto patternLi Ming
In __devm_cxl_add_port(), there is a 'goto' to call put_device() for the error cases between device_initialize() and device_add() to dereference the 'struct device' of a new cxl_port. The 'goto' pattern in the case can be removed by refactoring. Introducing a new function called cxl_port_add() which is used to add the 'struct device' of a new cxl_port to device hierarchy, moving the functions needing the help of the 'goto' into cxl_port_add(), and using a scoped-based resource management __free() to drop the open coded put_device() and the 'goto' for the error cases. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwaei.com> Link: https://patch.msgid.link/20240830013138.2256244-3-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/port: Use scoped_guard()/guard() to drop device_lock() for cxl_portLi Ming
A device_lock() and device_unlock() pair can be replaced by a cleanup helper scoped_guard() or guard(), that can enhance code readability. In CXL subsystem, still use device_lock() and device_unlock() pairs for cxl port resource protection, most of them can be replaced by a scoped_guard() or a guard() simply. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830013138.2256244-2-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/port: Use __free() to drop put_device() for cxl_portLi Ming
Using scope-based resource management __free() marco with a new helper called put_cxl_port() to drop open coded the put_device() used to dereference the 'struct device' in cxl_port. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240830013138.2256244-1-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl: Remove duplicate included header file core.hHongbo Li
The header file core.h is included twice. Remove the last one. The compilation test has passed. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20240830080016.3542184-1-lihongbo22@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03cxl/port: Convert to use ERR_CAST()Yuesong Li
Use ERR_CAST() as it is designed for casting an error pointer to another type. This macro utilizes the __force and __must_check modifiers, which instruct the compiler to verify for errors at the locations where it is employed. Signed-off-by: Yuesong Li <liyuesong@vivo.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240829125235.3266865-1-liyuesong@vivo.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-08-09cxl/pci: Get AER capability address from RCRB only for RCH dportLi Ming
cxl_setup_parent_dport() needs to get RCH dport AER capability address from RCRB to disable AER interrupt. The function does not check if dport is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev in VH case because dport_dev points to a pci device(RP or switch DSP) rather than a pci host bridge device. Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") Signed-off-by: Li Ming <ming4.li@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240809082750.3015641-2-ming4.li@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-28Merge tag 'cxl-for-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dave Jiang: "Core: - A CXL maturity map has been added to the documentation to detail the current state of CXL enabling. It provides the status of the current state of various CXL features to inform current and future contributors of where things are and which areas need contribution. - A notifier handler has been added in order for a newly created CXL memory region to trigger the abstract distance metrics calculation. This should bring parity for CXL memory to the same level vs hotplugged DRAM for NUMA abstract distance calculation. The abstract distance reflects relative performance used for memory tiering handling. - An addition for XOR math has been added to address the CXL DPA to SPA translation. CXL address translation did not support address interleave math with XOR prior to this change. Fixes: - Fix to address race condition in the CXL memory hotplug notifier - Add missing MODULE_DESCRIPTION() for CXL modules - Fix incorrect vendor debug UUID define Misc: - A warning has been added to inform users of an unsupported configuration when mixing CXL VH and RCH/RCD hierarchies - The ENXIO error code has been replaced with EBUSY for inject poison limit reached via debugfs and cxl-test support - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid unnecessary PCI config reads - A refactor to a common struct for DRAM and general media CXL events" * tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/pci: Move reading of control register to immediately before usage cxl: Remove defunct code calculating host bridge target positions cxl/region: Verify target positions using the ordered target list cxl: Restore XOR'd position bits during address translation cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa() cxl/test: Replace ENXIO with EBUSY for inject poison limit reached cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy cxl/core: Fix incorrect vendor debug UUID define Documentation: CXL Maturity Map cxl/region: Simplify cxl_region_nid() cxl/region: Support to calculate memory tier abstract distance cxl/region: Fix a race condition in memory hotplug notifier cxl: add missing MODULE_DESCRIPTION() macros cxl/events: Use a common struct for DRAM and General Media events