summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/cacheinfo.c
AgeCommit message (Collapse)Author
2025-05-16x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameterAhmed S. Darwish
The CPUID(0x2) descriptors iterator has been renamed from: for_each_leaf_0x2_entry() to: for_each_cpuid_0x2_desc() since it iterates over CPUID(0x2) cache and TLB "descriptors", not "entries". In the macro's x86/cacheinfo call-site, rename the parameter denoting the parsed descriptor at each iteration from 'entry' to 'desc'. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-7-darwi@linutronix.de
2025-05-16x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()Ahmed S. Darwish
Rename the CPUID(0x2) register accessor function: cpuid_get_leaf_0x2_regs(regs) to: cpuid_leaf_0x2(regs) for consistency with other <cpuid/api.h> accessors that return full CPUID registers outputs like: cpuid_leaf(regs) cpuid_subleaf(regs) In the same vein, rename the CPUID(0x2) iteration macro: for_each_leaf_0x2_entry() to: for_each_cpuid_0x2_desc() to include "cpuid" in the macro name, and since what is iterated upon is CPUID(0x2) cache and TLB "descriptos", not "entries". Prefix an underscore to that iterator macro parameters, so that the newly renamed 'desc' parameter do not get mixed with "union leaf_0x2_regs :: desc[]" in the macro's implementation. Adjust all the affected call-sites accordingly. While at it, use "CPUID(0x2)" instead of "CPUID leaf 0x2" as this is the recommended style. No change in functionality intended. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-6-darwi@linutronix.de
2025-05-15x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID headerAhmed S. Darwish
The main CPUID header <asm/cpuid.h> was originally a storefront for the headers: <asm/cpuid/api.h> <asm/cpuid/leaf_0x2_api.h> Now that the latter CPUID(0x2) header has been merged into the former, there is no practical difference between <asm/cpuid.h> and <asm/cpuid/api.h>. Migrate all users to the <asm/cpuid/api.h> header, in preparation of the removal of <asm/cpuid.h>. Don't remove <asm/cpuid.h> just yet, in case some new code in -next started using it. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-3-darwi@linutronix.de
2025-04-14x86/platform/amd: Move the <asm/amd_nb.h> header to <asm/amd/nb.h>Ingo Molnar
Collect AMD specific platform header files in <asm/amd/*.h>. Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mario Limonciello <superm1@kernel.org> Link: https://lore.kernel.org/r/20250413084144.3746608-4-mingo@kernel.org
2025-04-11x86/cacheinfo: Standardize header files and CPUID referencesAhmed S. Darwish
Reference header files using their canonical form <linux/cacheinfo.h>. Standardize on CPUID(0xN), instead of CPUID(N), for all standard leaves. This removes ambiguity and aligns them with their extended counterparts like CPUID(0x8000001d). References: 0dd09e215a39 ("x86/cacheinfo: Apply maintainer-tip coding style fixes") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Link: https://lore.kernel.org/r/20250411070401.1358760-3-darwi@linutronix.de
2025-04-09x86/cacheinfo: Properly parse CPUID(0x80000006) L2/L3 associativityAhmed S. Darwish
Complete the AMD CPUID(4) emulation logic, which uses CPUID(0x80000006) for L2/L3 cache info and an assocs[] associativity mapping array, by adding entries for 3-way caches and 6-way caches. Properly handle the case where CPUID(0x80000006) returns an L2/L3 associativity of 9. This is not real associativity, but a marker to indicate that the respective L2/L3 cache information should be retrieved from CPUID(0x8000001d) instead. If such a marker is encountered, return early from legacy_amd_cpuid4(), thus effectively emulating an "invalid index" CPUID(4) response with a cache type of zero. When checking if CPUID(0x80000006) L2/L3 cache info output is valid, and given the associtivity marker 9 above, do not just check if the whole ECX/EDX register is zero. Rather, check if the associativity is zero or 9. An associativity of zero implies no L2/L3 cache, which make it the more correct check anyway vs. a zero check of the whole output register. Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250409122233.1058601-3-darwi@linutronix.de
2025-04-09x86/cacheinfo: Properly parse CPUID(0x80000005) L1d/L1i associativityAhmed S. Darwish
For the AMD CPUID(4) emulation cache info logic, the same associativity mapping array, assocs[], is used for both CPUID(0x80000005) and CPUID(0x80000006). This is incorrect since per the AMD manuals, the mappings for CPUID(0x80000005) L1d/L1i associativity is: n = 0x1 -> 0xfe n n = 0xff fully associative while assocs[] maps these values to: n = 0x1, 0x2, 0x4 n n = 0x3, 0x7, 0x9 0 n = 0x6 8 n = 0x8 16 n = 0xa 32 n = 0xb 48 n = 0xc 64 n = 0xd 96 n = 0xe 128 n = 0xf fully associative which is only valid for CPUID(0x80000006). Parse CPUID(0x80000005) L1d/L1i associativity values as shown in the AMD manuals. Since the 0xffff literal is used to denote full associativity at the AMD CPUID(4)-emulation logic, define AMD_CPUID4_FULLY_ASSOCIATIVE for it instead of spreading that literal in more places. Mark the assocs[] mapping array as only valid for CPUID(0x80000006) L2/L3 cache information. Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors") Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250409122233.1058601-2-darwi@linutronix.de
2025-03-25x86/cacheinfo: Apply maintainer-tip coding style fixesAhmed S. Darwish
The x86/cacheinfo code has been heavily refactored and fleshed out at parent commits, where any necessary coding style fixes were also done in place. Apply Documentation/process/maintainer-tip.rst coding style fixes to the rest of the code, and align its assignment expressions for readability. Standardize on CPUID(n) when mentioning leaf queries. Avoid breaking long lines when doing so helps readability. At cacheinfo_amd_init_llc_id(), rename variable 'msb' to 'index_msb' as this is how it's called at the rest of cacheinfo.c code. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-30-darwi@linutronix.de
2025-03-25x86/cacheinfo: Introduce cpuid_amd_hygon_has_l3_cache()Ahmed S. Darwish
Multiple code paths at cacheinfo.c and amd_nb.c check for AMD/Hygon CPUs L3 cache presensce by directly checking leaf 0x80000006 EDX output. Extract that logic into its own function. While at it, rework the AMD/Hygon LLC topology ID caclculation comments for clarity. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-29-darwi@linutronix.de
2025-03-25x86/cacheinfo: Relocate CPUID leaf 0x4 cache_type mappingAhmed S. Darwish
The cache_type_map[] array is used to map Intel leaf 0x4 cache_type values to their corresponding types at <linux/cacheinfo.h>. Move that array's definition after the actual CPUID leaf 0x4 structures, instead of having it in the middle of AMD leaf 0x4 emulation code. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-28-darwi@linutronix.de
2025-03-25x86/cacheinfo: Extract out cache self-snoop checksAhmed S. Darwish
The logic of not doing a cache flush if the CPU declares cache self snooping support is repeated across the x86/cacheinfo code. Extract it into its own function. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-27-darwi@linutronix.de
2025-03-25x86/cacheinfo: Extract out cache level topology ID calculationAhmed S. Darwish
For Intel CPUID leaf 0x4 parsing, refactor the cache level topology ID calculation code into its own method instead of repeating the same logic twice for L2 and L3. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-26-darwi@linutronix.de
2025-03-25x86/cacheinfo: Separate Intel CPUID leaf 0x4 handlingAhmed S. Darwish
init_intel_cacheinfo() was overly complex. It parsed leaf 0x4 data, leaf 0x2 data, and performed post-processing, all within one function. Parent commit moved leaf 0x2 parsing and the post-processing logic into their own functions. Continue the refactoring by extracting leaf 0x4 parsing into its own function. Initialize local L2/L3 topology ID variables to BAD_APICID by default, thus ensuring they can be used unconditionally. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-25-darwi@linutronix.de
2025-03-25x86/cacheinfo: Separate CPUID leaf 0x2 handling and post-processing logicAhmed S. Darwish
The logic of init_intel_cacheinfo() is quite convoluted: it mixes leaf 0x4 parsing, leaf 0x2 parsing, plus some post-processing, in a single place. Begin simplifying its logic by extracting the leaf 0x2 parsing code, and the post-processing logic, into their own functions. While at it, rework the SMT LLC topology ID comment for clarity. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-24-darwi@linutronix.de
2025-03-25x86/cacheinfo: Use consolidated CPUID leaf 0x2 descriptor tableAhmed S. Darwish
CPUID leaf 0x2 output is a stream of one-byte descriptors, each implying certain details about the CPU's cache and TLB entries. At previous commits, the mapping tables for such descriptors were merged into one consolidated table. The mapping was also transformed into a hash lookup instead of a loop-based lookup for each descriptor. Use the new consolidated table and its hash-based lookup through the for_each_leaf_0x2_tlb_entry() accessor. Remove the old cache-specific mapping, cache_table[], as it is no longer used. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-22-darwi@linutronix.de
2025-03-25x86/cacheinfo: Use enums for cache descriptor typesAhmed S. Darwish
The leaf 0x2 one-byte cache descriptor types: CACHE_L1_INST CACHE_L1_DATA CACHE_L2 CACHE_L3 are just discriminators to be used within the cache_table[] mapping. Their specific values are irrelevant. Use enums for such types. Make the enum packed and static assert that its values remain within a single byte so that the cache_table[] array size do not go out of hand. Use a __CHECKER__ guard for the static_assert(sizeof(enum) == 1) line as sparse ignores the __packed annotation on enums. This is similar to: fe3944fb245a ("fs: Move enum rw_hint into a new header file") for the core SCSI code. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z9rsTirs9lLfEPD9@lx-t490 Link: https://lore.kernel.org/r/20250324133324.23458-19-darwi@linutronix.de
2025-03-25x86/cacheinfo: Clarify type markers for CPUID leaf 0x2 cache descriptorsAhmed S. Darwish
CPUID leaf 0x2 output is a stream of one-byte descriptors, each implying certain details about the CPU's cache and TLB entries. Two separate tables exist for interpreting these descriptors: one for TLBs at intel.c and one for caches at cacheinfo.c. These mapping tables will be merged in further commits, among other improvements to their model. In preparation for this, use more descriptive type names for the leaf 0x2 descriptors associated with cpu caches. Namely: LVL_1_INST => CACHE_L1_INST LVL_1_DATA => CACHE_L1_DATA LVL_2 => CACHE_L2 LVL_3 => CACHE_L3 After the TLB and cache descriptors mapping tables are merged, this will make it clear that such descriptors correspond to cpu caches. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-18-darwi@linutronix.de
2025-03-25x86/cacheinfo: Rename 'struct _cpuid4_info_regs' to 'struct _cpuid4_info'Ahmed S. Darwish
Parent commits decoupled amd_northbridge from _cpuid4_info_regs, moved AMD L3 northbridge cache_disable_0/1 sysfs code to its own file, and splitted AMD vs. Intel leaf 0x4 handling into: amd_fill_cpuid4_info() intel_fill_cpuid4_info() fill_cpuid4_info() After doing all that, the "_cpuid4_info_regs" name becomes a mouthful. It is also not totally accurate, as the structure holds cpuid4 derived information like cache node ID and size -- not just regs. Rename struct _cpuid4_info_regs to _cpuid4_info. That new name also better matches the AMD/Intel leaf 0x4 functions mentioned above. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-17-darwi@linutronix.de
2025-03-25x86/cacheinfo: Separate Intel and AMD CPUID leaf 0x4 code pathsAhmed S. Darwish
The CPUID leaf 0x4 parsing code at cpuid4_cache_lookup_regs() is ugly and convoluted. It is tangled with multiple nested conditions to handle: * AMD with TOPEXT, or Hygon CPUs via leaf 0x8000001d * Legacy AMD fallback via leaf 0x4 emulation * Intel CPUs via the actual CPUID leaf 0x4 Moreover, AMD L3 northbridge initialization is also awkwardly placed alongside the CPUID calls of the first two scenarios above. Refactor all of that as follows: * Update AMD's leaf 0x4 emulation comment to represent current state * Clearly label the AMD leaf 0x4 emulation function as a fallback * Split AMD/Hygon and Intel code paths into separate functions * Move AMD L3 northbridge initialization out of CPUID leaf 0x4 code, and into populate_cache_leaves() where it belongs. There, ci_info_init() can directly store the initialized object in the private pointer of the <linux/cacheinfo.h> API. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-16-darwi@linutronix.de
2025-03-25x86/cacheinfo: Move AMD cache_disable_0/1 handling to separate fileAhmed S. Darwish
Parent commit decoupled amd_northbridge out of _cpuid4_info_regs, where it was merely "parked" there until ci_info_init() can store it in the private pointer of the <linux/cacheinfo.h> API. Given that decoupling, move the AMD-specific L3 cache_disable_0/1 sysfs code from the generic (and already extremely convoluted) x86/cacheinfo code into its own file. Compile the file only if CONFIG_AMD_NB and CONFIG_SYSFS are both enabled, which mirrors the existing logic. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-14-darwi@linutronix.de
2025-03-25x86/cacheinfo: Separate amd_northbridge from _cpuid4_info_regsAhmed S. Darwish
'struct _cpuid4_info_regs' is meant to hold the CPUID leaf 0x4 output registers (EAX, EBX, and ECX), as well as derived information such as the cache node ID and size. It also contains a reference to amd_northbridge, which is there only to be "parked" until ci_info_init() can store it in the priv pointer of the <linux/cacheinfo.h> API. That priv pointer is then used by AMD-specific L3 cache_disable_0/1 sysfs attributes. Decouple amd_northbridge from _cpuid4_info_regs and pass it explicitly through the functions at x86/cacheinfo. Doing so clarifies when amd_northbridge is actually needed (AMD-only code) and when it is not (Intel-specific code). It also prepares for moving the AMD-specific L3 cache_disable_0/1 sysfs code into its own file in next commit. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-13-darwi@linutronix.de
2025-03-25x86/cacheinfo: Consolidate AMD/Hygon leaf 0x8000001d callsAhmed S. Darwish
While gathering CPU cache info, CPUID leaf 0x8000001d is invoked in two separate if blocks: one for Hygon CPUs and one for AMDs with topology extensions. After each invocation, amd_init_l3_cache() is called. Merge the two if blocks into a single condition, thus removing the duplicated code. Future commits will expand these if blocks, so combining them now is both cleaner and more maintainable. Note, while at it, remove a useless "better error?" comment that was within the same function since the 2005 commit e2cac78935ff ("[PATCH] x86_64: When running cpuid4 need to run on the correct CPU"). Note, as previously done at commit aec28d852ed2 ("x86/cpuid: Standardize on u32 in <asm/cpuid/api.h>"), standardize on using 'u32' and 'u8' types. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-12-darwi@linutronix.de
2025-03-25x86/cacheinfo: Standardize _cpuid4_info_regs instance namingAhmed S. Darwish
The cacheinfo code frequently uses the output registers from CPUID leaf 0x4. Such registers are cached in 'struct _cpuid4_info_regs', augmented with related information, and are then passed across functions. The naming of these _cpuid4_info_regs instances is confusing at best. Some instances are called "this_leaf", which is vague as "this" lacks context and "leaf" is overly generic given that other CPUID leaves are also processed within cacheinfo. Other _cpuid4_info_regs instances are just called "base", adding further ambiguity. Standardize on id4 for all instances. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-11-darwi@linutronix.de
2025-03-25x86/cacheinfo: Align ci_info_init() assignment expressionsAhmed S. Darwish
The ci_info_init() function initializes 10 members of a 'struct cacheinfo' instance using passed data from CPUID leaf 0x4. Such assignment expressions are difficult to read in their current form. Align them for clarity. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-10-darwi@linutronix.de
2025-03-25x86/cacheinfo: Constify _cpuid4_info_regs instancesAhmed S. Darwish
_cpuid4_info_regs instances are passed through a large number of functions at cacheinfo.c. For clarity, constify the instance parameters where _cpuid4_info_regs is only read from. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-9-darwi@linutronix.de
2025-03-25x86/cacheinfo: Use proper name for cacheinfo instancesThomas Gleixner
The cacheinfo structure defined at <include/linux/cacheinfo.h> is a generic cache info object representation. Calling its instances at x86 cacheinfo.c "leaf" confuses it with a CPUID leaf -- especially that multiple CPUID calls are already sprinkled across that file. Most of such instances also have a redundant "this_" prefix. Rename all of the cacheinfo "this_leaf" instances to just "ci". [ darwi: Move into separate commit and write commit log ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-8-darwi@linutronix.de
2025-03-25x86/cacheinfo: Properly name amd_cpuid4()'s first parameterThomas Gleixner
amd_cpuid4()'s first parameter, "leaf", is not a CPUID leaf as the name implies. Rather, it's an index emulating CPUID(4)'s subleaf semantics; i.e. an ID for the cache object currently enumerated. Rename that parameter to "index". Apply minor coding style fixes to the rest of the function as well. [ darwi: Move into a separate commit and write commit log. Use "index" instead of "subleaf" for amd_cpuid4() first param, as that's the name typically used at the whole of cacheinfo.c. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-7-darwi@linutronix.de
2025-03-25x86/cacheinfo: Refactor CPUID leaf 0x2 cache descriptor lookupThomas Gleixner
Extract the cache descriptor lookup logic out of the leaf 0x2 parsing code and into a dedicated function. This disentangles such lookup from the deeply nested leaf 0x2 parsing loop. Remove the cache table termination entry, as it is no longer needed after the ARRAY_SIZE()-based lookup. [ darwi: Move refactoring logic into this separate commit + commit log. Remove the cache table termination entry. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-6-darwi@linutronix.de
2025-03-25x86/cacheinfo: Use CPUID leaf 0x2 parsing helpersAhmed S. Darwish
Parent commit introduced CPUID leaf 0x2 parsing helpers at <asm/cpuid/leaf_0x2_api.h>. The new API allows sharing leaf 0x2's output validation and iteration logic across both intel.c and cacheinfo.c. Convert cacheinfo.c to that new API. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-5-darwi@linutronix.de
2025-03-25x86/cacheinfo: Remove CPUID leaf 0x2 parsing loopAhmed S. Darwish
Leaf 0x2 output includes a "query count" byte where it was supposed to specify the number of repeated CPUID leaf 0x2 subleaf 0 queries needed to extract all of the CPU's cache and TLB descriptors. Per current Intel manuals, all CPUs supporting this leaf "will always" return an iteration count of 1. Remove the leaf 0x2 query loop and just query the hardware once. Note, as previously done at commit aec28d852ed2 ("x86/cpuid: Standardize on u32 in <asm/cpuid/api.h>"), standardize on using 'u32' and 'u8' types. Suggested-by: Ingo Molnar <mingo@kernel.org> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250324133324.23458-3-darwi@linutronix.de
2025-03-04x86/cacheinfo: Remove unnecessary headers and reorder the restAhmed S. Darwish
Remove the headers at cacheinfo.c that are no longer required. Alphabetically reorder what remains since more headers will be included in further commits. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250304085152.51092-13-darwi@linutronix.de
2025-03-04x86/cacheinfo: Remove the P4 trace leftovers for realThomas Gleixner
Commit 851026a2bf54 ("x86/cacheinfo: Remove unused trace variable") removed the switch case for LVL_TRACE but did not get rid of the surrounding gunk. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250304085152.51092-12-darwi@linutronix.de
2025-03-04x86/cacheinfo: Validate CPUID leaf 0x2 EDX outputAhmed S. Darwish
CPUID leaf 0x2 emits one-byte descriptors in its four output registers EAX, EBX, ECX, and EDX. For these descriptors to be valid, the most significant bit (MSB) of each register must be clear. The historical Git commit: 019361a20f016 ("- pre6: Intel: start to add Pentium IV specific stuff (128-byte cacheline etc)...") introduced leaf 0x2 output parsing. It only validated the MSBs of EAX, EBX, and ECX, but left EDX unchecked. Validate EDX's most-significant bit. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250304085152.51092-2-darwi@linutronix.de
2024-12-06x86/cacheinfo: Delete global num_cache_leavesRicardo Neri
Linux remembers cpu_cachinfo::num_leaves per CPU, but x86 initializes all CPUs from the same global "num_cache_leaves". This is erroneous on systems such as Meteor Lake, where each CPU has a distinct num_leaves value. Delete the global "num_cache_leaves" and initialize num_leaves on each CPU. init_cache_level() no longer needs to set num_leaves. Also, it never had to set num_levels as it is unnecessary in x86. Keep checking for zero cache leaves. Such condition indicates a bug. [ bp: Cleanup. ] Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org # 6.3+ Link: https://lore.kernel.org/r/20241128002247.26726-3-ricardo.neri-calderon@linux.intel.com
2024-03-11Merge tag 'x86_mtrr_for_v6.9_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 MTRR update from Borislav Petkov: - Relax the PAT MSR programming which was unnecessarily using the MTRR programming protocol of disabling the cache around the changes. The reason behind this is the current algorithm triggering a #VE exception for TDX guests and unnecessarily complicating things * tag 'x86_mtrr_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pat: Simplify the PAT programming protocol
2024-02-20x86/pat: Simplify the PAT programming protocolKirill A. Shutemov
The programming protocol for the PAT MSR follows the MTRR programming protocol. However, this protocol is cumbersome and requires disabling caching (CR0.CD=1), which is not possible on some platforms. Specifically, a TDX guest is not allowed to set CR0.CD. It triggers a #VE exception. It turns out that the requirement to follow the MTRR programming protocol for PAT programming is unnecessarily strict. The new Intel Software Developer Manual (http://www.intel.com/sdm) (December 2023) relaxes this requirement, please refer to the section titled "Programming the PAT" for more information. In short, this section provides an alternative PAT update sequence which doesn't need to disable caches around the PAT update but only to flush those caches and TLBs. The AMD documentation does not link PAT programming to MTRR and is there fore, fine too. The kernel only needs to flush the TLB after updating the PAT MSR. The set_memory code already takes care of flushing the TLB and cache when changing the memory type of a page. [ bp: Expand commit message. ] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20240124130650.496056-1-kirill.shutemov@linux.intel.com
2024-02-16x86/cpu/topology: Get rid of cpuinfo::x86_max_coresThomas Gleixner
Now that __num_cores_per_package and __num_threads_per_package are available, cpuinfo::x86_max_cores and the related math all over the place can be replaced with the ready to consume data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240213210253.176147806@linutronix.de
2024-02-15x86/cpu: Provide an AMD/HYGON specific topology parserThomas Gleixner
AMD/HYGON uses various methods for topology evaluation: - Leaf 0x80000008 and 0x8000001e based with an optional leaf 0xb, which is the preferred variant for modern CPUs. Leaf 0xb will be superseded by leaf 0x80000026 soon, which is just another variant of the Intel 0x1f leaf for whatever reasons. - Subleaf 0x80000008 and NODEID_MSR base - Legacy fallback That code is following the principle of random bits and pieces all over the place which results in multiple evaluations and impenetrable code flows in the same way as the Intel parsing did. Provide a sane implementation by clearly separating the three variants and bringing them in the proper preference order in one place. This provides the parsing for both AMD and HYGON because there is no point in having a separate HYGON parser which only differs by 3 lines of code. Any further divergence between AMD and HYGON can be handled in different functions, while still sharing the existing parsers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Wang Wendy <wendy.wang@intel.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20240212153625.020038641@linutronix.de
2024-02-15x86/cpu/amd: Provide a separate accessor for Node IDThomas Gleixner
AMD (ab)uses topology_die_id() to store the Node ID information and topology_max_dies_per_pkg to store the number of nodes per package. This collides with the proper processor die level enumeration which is coming on AMD with CPUID 8000_0026, unless there is a correlation between the two. There is zero documentation about that. So provide new storage and new accessors which for now still access die_id and topology_max_die_per_pkg(). Will be mopped up after AMD and HYGON are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Wang Wendy <wendy.wang@intel.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20240212153624.956116738@linutronix.de
2023-10-10x86/cpu: Move cpu_l[l2]c_id into topology infoThomas Gleixner
The topology IDs which identify the LLC and L2 domains clearly belong to the per CPU topology information. Move them into cpuinfo_x86::cpuinfo_topo and get rid of the extra per CPU data and the related exports. This also paves the way to do proper topology evaluation during early boot because it removes the only per CPU dependency for that. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.803864641@linutronix.de
2023-10-10x86/cpu: Move cpu_die_id into topology infoThomas Gleixner
Move the next member. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.388185134@linutronix.de
2023-10-10x86/cpu: Move phys_proc_id into topology infoThomas Gleixner
Rename it to pkg_id which is the terminology used in the kernel. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.329006989@linutronix.de
2023-10-10x86/cpu: Encapsulate topology information in cpuinfo_x86Thomas Gleixner
The topology related information is randomly scattered across cpuinfo_x86. Create a new structure cpuinfo_topo and move in a first step initial_apicid and apicid into it. Aside of being better readable this is in preparation for replacing the horribly fragile CPU topology evaluation code further down the road. Consolidate APIC ID fields to u32 as that represents the hardware type. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.269787744@linutronix.de
2023-05-15x86/cpu/cacheinfo: Remove cpu_callout_mask dependencyThomas Gleixner
cpu_callout_mask is used for the stop machine based MTRR/PAT init. In preparation of moving the BP/AP synchronization to the core hotplug code, use a private CPU mask for cacheinfo and manage it in the starting/dying hotplug state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Helge Deller <deller@gmx.de> # parisc Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck Link: https://lore.kernel.org/r/20230512205256.035041005@linutronix.de
2023-02-11x86/cacheinfo: Remove unused trace variableBorislav Petkov (AMD)
15cd8812ab2c ("x86: Remove the CPU cache size printk's") removed the last use of the trace local var. Remove it too and the useless trace cache case. No functional changes. Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230210234541.9694-1-bp@alien8.de Link: http://lore.kernel.org/r/20220705073349.1512-1-jiapeng.chong@linux.alibaba.com
2022-11-10x86/cacheinfo: Switch cache_ap_init() to hotplug callbackJuergen Gross
Instead of explicitly calling cache_ap_init() in identify_secondary_cpu() use a CPU hotplug callback instead. By registering the callback only after having started the non-boot CPUs and initializing cache_aps_delayed_init with "true", calling set_cache_aps_delayed_init() at boot time can be dropped. It should be noted that this change results in cache_ap_init() being called a little bit later when hotplugging CPUs. By using a new hotplug slot right at the start of the low level bringup this is not problematic, as no operations requiring a specific caching mode are performed that early in CPU initialization. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-15-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-11-10x86: Decouple PAT and MTRR handlingJuergen Gross
Today, PAT is usable only with MTRR being active, with some nasty tweaks to make PAT usable when running as a Xen PV guest which doesn't support MTRR. The reason for this coupling is that both PAT MSR changes and MTRR changes require a similar sequence and so full PAT support was added using the already available MTRR handling. Xen PV PAT handling can work without MTRR, as it just needs to consume the PAT MSR setting done by the hypervisor without the ability and need to change it. This in turn has resulted in a convoluted initialization sequence and wrong decisions regarding cache mode availability due to misguiding PAT availability flags. Fix all of that by allowing to use PAT without MTRR and by reworking the current PAT initialization sequence to match better with the newly introduced generic cache initialization. This removes the need of the recently added pat_force_disabled flag, so remove the remnants of the patch adding it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-14-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-11-10x86/mtrr: Add a stop_machine() handler calling only cache_cpu_init()Juergen Gross
Instead of having a stop_machine() handler for either a specific MTRR register or all state at once, add a handler just for calling cache_cpu_init() if appropriate. Add functions for calling stop_machine() with this handler as well. Add a generic replacement for mtrr_bp_restore() and a wrapper for mtrr_bp_init(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-13-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-11-10x86/mtrr: Let cache_aps_delayed_init replace mtrr_aps_delayed_initJuergen Gross
In order to prepare decoupling MTRR and PAT replace the MTRR-specific mtrr_aps_delayed_init flag with a more generic cache_aps_delayed_init one. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-12-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-11-10x86/mtrr: Disentangle MTRR init from PAT initJuergen Gross
Add a main cache_cpu_init() init routine which initializes MTRR and/or PAT support depending on what has been detected on the system. Leave the MTRR-specific initialization in a MTRR-specific init function where the smp_changes_mask setting happens now with caches disabled. This global mask update was done with caches enabled before probably because atomic operations while running uncached might have been quite expensive. But since only systems with a broken BIOS should ever require to set any bit in smp_changes_mask, hurting those devices with a penalty of a few microseconds during boot shouldn't be a real issue. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-8-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>