summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/cacheinfo.c
AgeCommit message (Collapse)Author
2022-11-10x86/mtrr: Move cache control code to cacheinfo.cJuergen Gross
Prepare making PAT and MTRR support independent from each other by moving some code needed by both out of the MTRR-specific sources. [ bp: Massage commit message. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-7-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-11-10x86/mtrr: Replace use_intel() with a local flagJuergen Gross
In MTRR code use_intel() is only used in one source file, and the relevant use_intel_if member of struct mtrr_ops is set only in generic_mtrr_ops. Replace use_intel() with a single flag in cacheinfo.c which can be set when assigning generic_mtrr_ops to mtrr_if. This allows to drop use_intel_if from mtrr_ops, while preparing to decouple PAT from MTRR. As another preparation for the PAT/MTRR decoupling use a bit for MTRR control and one for PAT control. For now set both bits together, this can be changed later. As the new flag will be set only if mtrr_enabled is set, the test for mtrr_enabled can be dropped at some places. [ bp: Massage commit message. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221102074713.21493-4-jgross@suse.com Signed-off-by: Borislav Petkov <bp@suse.de>
2022-07-17x86/cacheinfo: move shared cache map definitionsSander Vanheule
Patch series "cpumask: Fix invalid uniprocessor assumptions", v4. On uniprocessor builds, it is currently assumed that any cpumask will contain the single CPU: cpu0. This assumption is used to provide optimised implementations. The current assumption also appears to be wrong, by ignoring the fact that users can provide empty cpumasks. This can result in bugs as explained in [1] - for_each_cpu() will run one iteration of the loop even when passed an empty cpumask. This series introduces some basic tests, and updates the optimisations for uniprocessor builds. The x86 patch was written after the kernel test robot [2] ran into a failed build. I have tried to list the files potentially affected by the changes to cpumask.h, in an attempt to find any other cases that fail on !SMP. I've gone through some of the files manually, and ran a few cross builds, but nothing else popped up. I (build) checked about half of the potientally affected files, but I do not have the resources to do them all. I hope we can fix other issues if/when they pop up later. [1] https://lore.kernel.org/all/20220530082552.46113-1-sander@svanheule.net/ [2] https://lore.kernel.org/all/202206060858.wA0FOzRy-lkp@intel.com/ This patch (of 5): The maps to keep track of shared caches between CPUs on SMP systems are declared in asm/smp.h, among them specifically cpu_llc_shared_map. These maps are externally defined in cpu/smpboot.c. The latter is only compiled on CONFIG_SMP=y, which means the declared extern symbols from asm/smp.h do not have a corresponding definition on uniprocessor builds. The inline cpu_llc_shared_mask() function from asm/smp.h refers to the map declaration mentioned above. This function is referenced in cacheinfo.c inside for_each_cpu() loop macros, to provide cpumask for the loop. On uniprocessor builds, the symbol for the cpu_llc_shared_map does not exist. However, the current implementation of for_each_cpu() also (wrongly) ignores the provided mask. By sheer luck, the compiler thus optimises out this unused reference to cpu_llc_shared_map, and the linker therefore does not require the cpu_llc_shared_mask to actually exist on uniprocessor builds. Only on SMP bulids does smpboot.o exist to provide the required symbols. To no longer rely on compiler optimisations for successful uniprocessor builds, move the definitions of cpu_llc_shared_map and cpu_l2c_shared_map from smpboot.c to cacheinfo.c. Link: https://lkml.kernel.org/r/cover.1656777646.git.sander@svanheule.net Link: https://lkml.kernel.org/r/e8167ddb570f56744a3dc12c2149a660a324d969.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Marco Elver <elver@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2021-10-15sched: Add cluster scheduler level for x86Tim Chen
There are x86 CPU architectures (e.g. Jacobsville) where L2 cahce is shared among a cluster of cores instead of being exclusive to one single core. To prevent oversubscription of L2 cache, load should be balanced between such L2 clusters, especially for tasks with no shared data. On benchmark such as SPECrate mcf test, this change provides a boost to performance especially on medium load system on Jacobsville. on a Jacobsville that has 24 Atom cores, arranged into 6 clusters of 4 cores each, the benchmark number is as follow: Improvement over baseline kernel for mcf_r copies run time base rate 1 -0.1% -0.2% 6 25.1% 25.1% 12 18.8% 19.0% 24 0.3% 0.3% So this looks pretty good. In terms of the system's task distribution, some pretty bad clumping can be seen for the vanilla kernel without the L2 cluster domain for the 6 and 12 copies case. With the extra domain for cluster, the load does get evened out between the clusters. Note this patch isn't an universal win as spreading isn't necessarily a win, particually for those workload who can benefit from packing. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210924085104.44806-4-21cnbao@gmail.com
2021-09-01drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()Thomas Gleixner
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU. The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted. This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this. Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
2021-04-07x86/cacheinfo: Remove unneeded dead-store initializationYang Li
$ make CC=clang clang-analyzer (needs clang-tidy installed on the system too) on x86_64 defconfig triggers: arch/x86/kernel/cpu/cacheinfo.c:880:24: warning: Value stored to 'this_cpu_ci' \ during its initialization is never read [clang-analyzer-deadcode.DeadStores] struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); ^ arch/x86/kernel/cpu/cacheinfo.c:880:24: note: Value stored to 'this_cpu_ci' \ during its initialization is never read So simply remove this unneeded dead-store initialization. As compilers will detect this unneeded assignment and optimize this anyway the resulting object code is identical before and after this change. No functional change. No change to object code. [ bp: Massage commit message. ] Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/1617177624-24670-1-git-send-email-yang.lee@linux.alibaba.com
2020-11-19x86/CPU/AMD: Remove amd_get_nb_id()Yazen Ghannam
The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.cpu_die_id. Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and remove the function. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-3-Yazen.Ghannam@amd.com
2020-11-19x86/CPU/AMD: Save AMD NodeId as cpu_die_idYazen Ghannam
AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA nodes. Logical nodes can be adjusted based on firmware or other settings whereas the physical nodes/dies are fixed based on hardware topology. The NodeId value can be used when a physical ID is needed by software. Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value from CPUID or MSR as appropriate. Default to phys_proc_id otherwise. Do so for both AMD and Hygon systems. Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no longer needed. Update the x86 topology documentation. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-2-Yazen.Ghannam@amd.com
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2019-06-19x86/cacheinfo: Fix a -Wtype-limits warningQian Cai
cpuinfo_x86.x86_model is an unsigned type, so comparing against zero will generate a compilation warning: arch/x86/kernel/cpu/cacheinfo.c: In function 'cacheinfo_amd_init_llc_id': arch/x86/kernel/cpu/cacheinfo.c:662:19: warning: comparison is always true \ due to limited range of data type [-Wtype-limits] Remove the unnecessary lower bound check. [ bp: Massage. ] Fixes: 68091ee7ac3c ("x86/CPU/AMD: Calculate last level cache ID from number of sharing threads") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/1560954773-11967-1-git-send-email-cai@lca.pw
2019-01-26x86/kernel: Mark expected switch-case fall-throughsGustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough by default, mark switch-case statements where fall-through is intentional, explicitly in order to fix a couple of -Wimplicit-fallthrough warnings. Warning level 3 was used: -Wimplicit-fallthrough=3. [ bp: Massasge and trim commit message. ] Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baoquan He <bhe@redhat.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: David Wang <davidwang@zhaoxin.com> Cc: Douglas Anderson <dianders@chromium.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pu Wen <puwen@hygon.cn> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190125183903.GA4712@embeddedor
2018-12-08x86/kernel: Fix more -Wmissing-prototypes warningsBorislav Petkov
... with the goal of eventually enabling -Wmissing-prototypes by default. At least on x86. Make functions static where possible, otherwise add prototypes or make them visible through includes. asm/trace/ changes courtesy of Steven Rostedt <rostedt@goodmis.org>. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> # ACPI + cpufreq bits Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Travis <mike.travis@hpe.com> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yi Wang <wang.yi59@zte.com.cn> Cc: linux-acpi@vger.kernel.org
2018-09-27x86/cpu: Get cache info and setup cache cpumap for Hygon DhyanaPu Wen
The Hygon Dhyana CPU has a topology extensions bit in CPUID. With this bit, the kernel can get the cache information. So add support in cpuid4_cache_lookup_regs() to get the correct cache size. The Hygon Dhyana CPU also discovers num_cache_leaves via CPUID leaf 0x8000001d, so add support to it in find_num_cache_leaves(). Also add cacheinfo_hygon_init_llc_id() and init_hygon_cacheinfo() functions to initialize Dhyana cache info. Setup cache cpumap in the same way as AMD does. Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: bp@alien8.de Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: x86@kernel.org Cc: thomas.lendacky@amd.com Link: https://lkml.kernel.org/r/2a686b2ac0e2f5a1f2f5f101124d9dd44f949731.1537533369.git.puwen@hygon.cn
2018-06-22x86/CPU/AMD: Fix LLC ID bit-shift calculationSuravee Suthikulpanit
The current logic incorrectly calculates the LLC ID from the APIC ID. Unless specified otherwise, the LLC ID should be calculated by removing the Core and Thread ID bits from the least significant end of the APIC ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h PPR document. [ bp: Improve commit message. ] Fixes: 68091ee7ac3c ("Calculate last level cache ID from number of sharing threads") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
2018-05-13x86/CPU: Move cpu_detect_cache_sizes() into init_intel_cacheinfo()David Wang
There is no point in having the conditional cpu_detect_cache_sizes() call at the callsite of init_intel_cacheinfo(). Move it into init_intel_cacheinfo() and make init_intel_cacheinfo() void. [ tglx: Made the init_intel_cacheinfo() void as the return value was pointless. Adjust changelog accordingly ] Signed-off-by: David Wang <davidwang@zhaoxin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: lukelin@viacpu.com Cc: qiyuanwang@zhaoxin.com Cc: gregkh@linuxfoundation.org Cc: brucechang@via-alliance.com Cc: timguo@zhaoxin.com Cc: cooperyan@zhaoxin.com Cc: hpa@zytor.com Cc: benjaminpan@viatech.com Link: https://lkml.kernel.org/r/1525314766-18910-3-git-send-email-davidwang@zhaoxin.com
2018-05-13x86/CPU: Move cpu local function declarations to local headerThomas Gleixner
No point in exposing all these functions globaly as they are strict local to the cpu management code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-05-06x86/CPU/AMD: Calculate last level cache ID from number of sharing threadsSuravee Suthikulpanit
Last Level Cache ID can be calculated from the number of threads sharing the cache, which is available from CPUID Fn0x8000001D (Cache Properties). This is used to left-shift the APIC ID to derive LLC ID. Therefore, default to this method unless the APIC ID enumeration does not follow the scheme. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1524864877-111962-5-git-send-email-suravee.suthikulpanit@amd.com
2018-05-06x86/CPU: Rename intel_cacheinfo.c to cacheinfo.cBorislav Petkov
Since this file contains general cache-related information for x86, rename the file to a more generic name. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1524864877-111962-4-git-send-email-suravee.suthikulpanit@amd.com