summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
2023-10-20x86/bugs: Remove default case for fully switched enumsJosh Poimboeuf
For enum switch statements which handle all possible cases, remove the default case so a compiler warning gets printed if one of the enums gets accidentally omitted from the switch statement. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/fcf6feefab991b72e411c2aed688b18e65e06aed.1693889988.git.jpoimboe@kernel.org
2023-10-20x86/srso: Remove 'pred_cmd' labelJosh Poimboeuf
SBPB is only enabled in two distinct cases: 1) when SRSO has been disabled with srso=off 2) when SRSO has been fixed (in future HW) Simplify the control flow by getting rid of the 'pred_cmd' label and moving the SBPB enablement check to the two corresponding code sites. This makes it more clear when exactly SBPB gets enabled. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/bb20e8569cfa144def5e6f25e610804bc4974de2.1693889988.git.jpoimboe@kernel.org
2023-10-20x86/srso: Fix vulnerability reporting for missing microcodeJosh Poimboeuf
The SRSO default safe-ret mitigation is reported as "mitigated" even if microcode hasn't been updated. That's wrong because userspace may still be vulnerable to SRSO attacks due to IBPB not flushing branch type predictions. Report the safe-ret + !microcode case as vulnerable. Also report the microcode-only case as vulnerable as it leaves the kernel open to attacks. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/a8a14f97d1b0e03ec255c81637afdf4cf0ae9c99.1693889988.git.jpoimboe@kernel.org
2023-10-20x86/srso: Print mitigation for retbleed IBPB caseJosh Poimboeuf
When overriding the requested mitigation with IBPB due to retbleed=ibpb, print the mitigation in the usual format instead of a custom error message. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/ec3af919e267773d896c240faf30bfc6a1fd6304.1693889988.git.jpoimboe@kernel.org
2023-10-20x86/srso: Print actual mitigation if requested mitigation isn't possibleJosh Poimboeuf
If the kernel wasn't compiled to support the requested option, print the actual option that ends up getting used. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/7e7a12ea9d85a9f76ca16a3efb71f262dee46ab1.1693889988.git.jpoimboe@kernel.org
2023-10-20x86/srso: Fix SBPB enablement for (possible) future fixed HWJosh Poimboeuf
Make the SBPB check more robust against the (possible) case where future HW has SRSO fixed but doesn't have the SRSO_NO bit set. Fixes: 1b5277c0ea0b ("x86/srso: Add SRSO_NO support") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/cee5050db750b391c9f35f5334f8ff40e66c01b9.1693889988.git.jpoimboe@kernel.org
2023-10-19x86/microcode/intel: Cleanup code furtherThomas Gleixner
Sanitize the microcode scan loop, fixup printks and move the loading function for builtin microcode next to the place where it is used and mark it __init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115902.389400871@linutronix.de
2023-10-19x86/microcode/intel: Simplify and rename generic_load_microcode()Thomas Gleixner
so it becomes less obfuscated and rename it because there is nothing generic about it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115902.330295409@linutronix.de
2023-10-19x86/microcode/intel: Simplify scan_microcode()Thomas Gleixner
Make it readable and comprehensible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115902.271940980@linutronix.de
2023-10-19x86/microcode/intel: Rip out mixed stepping support for Intel CPUsAshok Raj
Mixed steppings aren't supported on Intel CPUs. Only one microcode patch is required for the entire system. The caching of microcode blobs which match the family and model is therefore pointless and in fact is dysfunctional as CPU hotplug updates use only a single microcode blob, i.e. the one where *intel_ucode_patch points to. Remove the microcode cache and make it an AMD local feature. [ tglx: - save only at the end. Otherwise random microcode ends up in the pointer for early loading - free the ucode patch pointer in save_microcode_patch() only after kmemdup() has succeeded, as reported by Andrew Cooper ] Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211722.404362809@linutronix.de
2023-10-18x86/microcode/32: Move early loading after paging enableThomas Gleixner
32-bit loads microcode before paging is enabled. The commit which introduced that has zero justification in the changelog. The cover letter has slightly more content, but it does not give any technical justification either: "The problem in current microcode loading method is that we load a microcode way, way too late; ideally we should load it before turning paging on. This may only be practical on 32 bits since we can't get to 64-bit mode without paging on, but we should still do it as early as at all possible." Handwaving word salad with zero technical content. Someone claimed in an offlist conversation that this is required for curing the ATOM erratum AAE44/AAF40/AAG38/AAH41. That erratum requires an microcode update in order to make the usage of PSE safe. But during early boot, PSE is completely irrelevant and it is evaluated way later. Neither is it relevant for the AP on single core HT enabled CPUs as the microcode loading on the AP is not doing anything. On dual core CPUs there is a theoretical problem if a split of an executable large page between enabling paging including PSE and loading the microcode happens. But that's only theoretical, it's practically irrelevant because the affected dual core CPUs are 64bit enabled and therefore have paging and PSE enabled before loading the microcode on the second core. So why would it work on 64-bit but not on 32-bit? The erratum: "AAG38 Code Fetch May Occur to Incorrect Address After a Large Page is Split Into 4-Kbyte Pages Problem: If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." The practical relevance for this is exactly zero because there is no splitting of large text pages during early boot-time, i.e. between paging enable and microcode loading, and neither during CPU hotplug. IOW, this load microcode before paging enable is yet another voodoo programming solution in search of a problem. What's worse is that it causes at least two serious problems: 1) When stackprotector is enabled, the microcode loader code has the stackprotector mechanics enabled. The read from the per CPU variable __stack_chk_guard is always accessing the virtual address either directly on UP or via %fs on SMP. In physical address mode this results in an access to memory above 3GB. So this works by chance as the hardware returns the same value when there is no RAM at this physical address. When there is RAM populated above 3G then the read is by chance the same as nothing changes that memory during the very early boot stage. That's not necessarily true during runtime CPU hotplug. 2) When function tracing is enabled, the relevant microcode loader functions and the functions invoked from there will call into the tracing code and evaluate global and per CPU variables in physical address mode. What could potentially go wrong? Cure this and move the microcode loading after the early paging enable, use the new temporary initrd mapping and remove the gunk in the microcode loader which is required to handle physical address mode. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211722.348298216@linutronix.de
2023-10-17x86/microcode/amd: Fix snprintf() format string warning in W=1 buildPaolo Bonzini
Building with GCC 11.x results in the following warning: arch/x86/kernel/cpu/microcode/amd.c: In function ‘find_blobs_in_containers’: arch/x86/kernel/cpu/microcode/amd.c:504:58: error: ‘h.bin’ directive output may be truncated writing 5 bytes into a region of size between 1 and 7 [-Werror=format-truncation=] arch/x86/kernel/cpu/microcode/amd.c:503:17: note: ‘snprintf’ output between 35 and 41 bytes into a destination of size 36 The issue is that GCC does not know that the family can only be a byte (it ultimately comes from CPUID). Suggest the right size to the compiler by marking the argument as char-size ("hh"). While at it, instead of using the slightly more obscure precision specifier use the width with zero padding (over 23000 occurrences in kernel sources, vs 500 for the idiom using the precision). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Closes: https://lore.kernel.org/oe-kbuild-all/202308252255.2HPJ6x5Q-lkp@intel.com/ Link: https://lore.kernel.org/r/20231016224858.2829248-1-pbonzini@redhat.com
2023-10-17x86/resctrl: Display RMID of resource groupBabu Moger
In x86, hardware uses RMID to identify a monitoring group. When a user creates a monitor group these details are not visible. These details can help resctrl debugging. Add RMID(mon_hw_id) to the monitor groups display in the resctrl interface. Users can see these details when resctrl is mounted with "-o debug" option. Add RFTYPE_MON_BASE that complements existing RFTYPE_CTRL_BASE and represents files belonging to monitoring groups. Other architectures do not use "RMID". Use the name mon_hw_id to refer to "RMID" in an effort to keep the naming generic. For example: $cat /sys/fs/resctrl/mon_groups/mon_grp1/mon_hw_id 3 Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-10-babu.moger@amd.com
2023-10-17x86/resctrl: Add support for the files of MON groups onlyBabu Moger
Files unique to monitoring groups have the RFTYPE_MON flag. When a new monitoring group is created the resctrl files with flags RFTYPE_BASE (files common to all resource groups) and RFTYPE_MON (files unique to monitoring groups) are created to support interacting with the new monitoring group. A resource group can support both monitoring and control, also termed a CTRL_MON resource group. CTRL_MON groups should get both monitoring and control resctrl files but that is not the case. Only the RFTYPE_BASE and RFTYPE_CTRL files are created for CTRL_MON groups. Ensure that files with the RFTYPE_MON flag are created for CTRL_MON groups. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-9-babu.moger@amd.com
2023-10-17x86/resctrl: Display CLOSID for resource groupBabu Moger
In x86, hardware uses CLOSID to identify a control group. When a user creates a control group this information is not visible to the user. It can help resctrl debugging. Add CLOSID(ctrl_hw_id) to the control groups display in the resctrl interface. Users can see this detail when resctrl is mounted with the "-o debug" option. Other architectures do not use "CLOSID". Use the names ctrl_hw_id to refer to "CLOSID" in an effort to keep the naming generic. For example: $cat /sys/fs/resctrl/ctrl_grp1/ctrl_hw_id 1 Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-8-babu.moger@amd.com
2023-10-17x86/resctrl: Introduce "-o debug" mount optionBabu Moger
Add "-o debug" option to mount resctrl filesystem in debug mode. When in debug mode resctrl displays files that have the new RFTYPE_DEBUG flag to help resctrl debugging. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-7-babu.moger@amd.com
2023-10-17x86/resctrl: Move default group file creation to mountBabu Moger
The default resource group and its files are created during kernel init time. Upcoming changes will make some resctrl files optional based on a mount parameter. If optional files are to be added to the default group based on the mount option, then each new file needs to be created separately and call kernfs_activate() again. Create all files of the default resource group during resctrl mount, destroyed during unmount, to avoid scattering resctrl file addition across two separate code flows. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-6-babu.moger@amd.com
2023-10-17x86/resctrl: Unwind properly from rdt_enable_ctx()Babu Moger
rdt_enable_ctx() enables the features provided during resctrl mount. Additions to rdt_enable_ctx() are required to also modify error paths of rdt_enable_ctx() callers to ensure correct unwinding if errors are encountered after calling rdt_enable_ctx(). This is error prone. Introduce rdt_disable_ctx() to refactor the error unwinding of rdt_enable_ctx() to simplify future additions. This also simplifies cleanup in rdt_kill_sb(). Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-5-babu.moger@amd.com
2023-10-17x86/resctrl: Rename rftype flags for consistencyBabu Moger
resctrl associates rftype flags with its files so that files can be chosen based on the resource, whether it is info or base, and if it is control or monitor type file. These flags use the RF_ as well as RFTYPE_ prefixes. Change the prefix to RFTYPE_ for all these flags to be consistent. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-4-babu.moger@amd.com
2023-10-17x86/resctrl: Simplify rftype flag definitionsBabu Moger
The rftype flags are bitmaps used for adding files under the resctrl filesystem. Some of these bitmap defines have one extra level of indirection which is not necessary. Drop the RF_* defines and simplify the macros. [ bp: Massage commit message. ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-3-babu.moger@amd.com
2023-10-17x86/resctrl: Add multiple tasks to the resctrl group at onceBabu Moger
The resctrl task assignment for monitor or control group needs to be done one at a time. For example: $mount -t resctrl resctrl /sys/fs/resctrl/ $mkdir /sys/fs/resctrl/ctrl_grp1 $echo 123 > /sys/fs/resctrl/ctrl_grp1/tasks $echo 456 > /sys/fs/resctrl/ctrl_grp1/tasks $echo 789 > /sys/fs/resctrl/ctrl_grp1/tasks This is not user-friendly when dealing with hundreds of tasks. Support multiple task assignment in one command with tasks ids separated by commas. For example: $echo 123,456,789 > /sys/fs/resctrl/ctrl_grp1/tasks Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Tan Shaopeng <tan.shaopeng@jp.fujitsu.com> Link: https://lore.kernel.org/r/20231017002308.134480-2-babu.moger@amd.com
2023-10-16x86/mce: Cleanup mce_usable_address()Yazen Ghannam
Move Intel-specific checks into a helper function. Explicitly use "bool" for return type. No functional change intended. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230613141142.36801-4-yazen.ghannam@amd.com
2023-10-16x86/mce: Define amd_mce_usable_address()Yazen Ghannam
Currently, all valid MCA_ADDR values are assumed to be usable on AMD systems. However, this is not correct in most cases. Notifiers expecting usable addresses may then operate on inappropriate values. Define a helper function to do AMD-specific checks for a usable memory address. List out all known cases. [ bp: Tone down the capitalized words. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230613141142.36801-3-yazen.ghannam@amd.com
2023-10-16x86/MCE/AMD: Split amd_mce_is_memory_error()Yazen Ghannam
Define helper functions for legacy and SMCA systems in order to reuse individual checks in later changes. Describe what each function is checking for, and correct the XEC bitmask for SMCA. No functional change intended. [ bp: Use "else in amd_mce_is_memory_error() to make the conditional balanced, for readability. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20230613141142.36801-2-yazen.ghannam@amd.com
2023-10-11x86/resctrl: Add sparse_masks file in infoFenghua Yu
Add the interface in resctrl FS to show if sparse cache allocation bit masks are supported on the platform. Reading the file returns either a "1" if non-contiguous 1s are supported and "0" otherwise. The file path is /sys/fs/resctrl/info/{resource}/sparse_masks, where {resource} can be either "L2" or "L3". Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Peter Newman <peternewman@google.com> Link: https://lore.kernel.org/r/7300535160beba41fd8aa073749ec1ee29b4621f.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11x86/resctrl: Enable non-contiguous CBMs in Intel CATMaciej Wieczor-Retman
The setting for non-contiguous 1s support in Intel CAT is hardcoded to false. On these systems, writing non-contiguous 1s into the schemata file will fail before resctrl passes the value to the hardware. In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped being reserved and now carry information about non-contiguous 1s value support for L3 and L2 cache respectively. The CAT capacity bitmask (CBM) supports a non-contiguous 1s value if the bit is set. The exception are Haswell systems where non-contiguous 1s value support needs to stay disabled since they can't make use of CPUID for Cache allocation. Originally-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Peter Newman <peternewman@google.com> Link: https://lore.kernel.org/r/1849b487256fe4de40b30f88450cba3d9abc9171.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11x86/resctrl: Rename arch_has_sparse_bitmapsMaciej Wieczor-Retman
Rename arch_has_sparse_bitmaps to arch_has_sparse_bitmasks to ensure consistent terminology throughout resctrl. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Peter Newman <peternewman@google.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Peter Newman <peternewman@google.com> Link: https://lore.kernel.org/r/e330fcdae873ef1a831e707025a4b70fa346666e.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUsBorislav Petkov (AMD)
Fix erratum #1485 on Zen4 parts where running with STIBP disabled can cause an #UD exception. The performance impact of the fix is negligible. Reported-by: René Rebe <rene@exactcode.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: René Rebe <rene@exactcode.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
2023-10-11x86/resctrl: Fix remaining kernel-doc warningsMaciej Wieczor-Retman
The kernel test robot reported kernel-doc warnings here: arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'of' not described in 'rdt_bit_usage_show' arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'seq' not described in 'rdt_bit_usage_show' arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'v' not described in 'rdt_bit_usage_show' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1144: warning: Function parameter or member 'type' not described in '__rdtgroup_cbm_overlaps' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1224: warning: Function parameter or member 'rdtgrp' not described in 'rdtgroup_mode_test_exclusive' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'of' not described in 'rdtgroup_mode_write' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'buf' not described in 'rdtgroup_mode_write' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'nbytes' not described in 'rdtgroup_mode_write' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'off' not described in 'rdtgroup_mode_write' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'of' not described in 'rdtgroup_size_show' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 's' not described in 'rdtgroup_size_show' arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'v' not described in 'rdtgroup_size_show' The first two functions are missing an argument description while the other three are file callbacks and don't require a kernel-doc comment. Closes: https://lore.kernel.org/oe-kbuild-all/202310070434.mD8eRNAz-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Newman <peternewman@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20231011064843.246592-1-maciej.wieczor-retman@intel.com
2023-10-10arch/x86: Remove now superfluous sentinel elem from ctl_table arraysJoel Granados
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove sentinel element from sld_sysctl and itmt_kern_table. This removal is safe because register_sysctl_init and register_sysctl implicitly use the array size in addition to checking for the sentinel. Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86 Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-10-10x86/cpu: Provide debug interfaceThomas Gleixner
Provide debug files which dump the topology related information of cpuinfo_x86. This is useful to validate the upcoming conversion of the topology evaluation for correctness or bug compatibility. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085113.353191313@linutronix.de
2023-10-10x86/apic: Use u32 for cpu_present_to_apicid()Thomas Gleixner
APIC IDs are used with random data types u16, u32, int, unsigned int, unsigned long. Make it all consistently use u32 because that reflects the hardware register width and fixup a few related usage sites for consistency sake. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085113.054064391@linutronix.de
2023-10-10x86/cpu: Move cpu_l[l2]c_id into topology infoThomas Gleixner
The topology IDs which identify the LLC and L2 domains clearly belong to the per CPU topology information. Move them into cpuinfo_x86::cpuinfo_topo and get rid of the extra per CPU data and the related exports. This also paves the way to do proper topology evaluation during early boot because it removes the only per CPU dependency for that. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.803864641@linutronix.de
2023-10-10x86/cpu: Move logical package and die IDs into topology infoThomas Gleixner
Yet another topology related data pair. Rename logical_proc_id to logical_pkg_id so it fits the common naming conventions. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.745139505@linutronix.de
2023-10-10x86/cpu: Remove pointless evaluation of x86_coreid_bitsThomas Gleixner
cpuinfo_x86::x86_coreid_bits is only used by the AMD numa topology code. No point in evaluating it on non AMD systems. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.687588373@linutronix.de
2023-10-10x86/cpu: Move cu_id into topology infoThomas Gleixner
No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.628405546@linutronix.de
2023-10-10x86/cpu: Move cpu_core_id into topology infoThomas Gleixner
Rename it to core_id and stick it to the other ID fields. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.566519388@linutronix.de
2023-10-10x86/cpu: Move cpu_die_id into topology infoThomas Gleixner
Move the next member. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.388185134@linutronix.de
2023-10-10x86/cpu: Move phys_proc_id into topology infoThomas Gleixner
Rename it to pkg_id which is the terminology used in the kernel. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.329006989@linutronix.de
2023-10-10x86/cpu: Encapsulate topology information in cpuinfo_x86Thomas Gleixner
The topology related information is randomly scattered across cpuinfo_x86. Create a new structure cpuinfo_topo and move in a first step initial_apicid and apicid into it. Aside of being better readable this is in preparation for replacing the horribly fragile CPU topology evaluation code further down the road. Consolidate APIC ID fields to u32 as that represents the hardware type. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230814085112.269787744@linutronix.de
2023-10-10x86/cpu/hygon: Fix the CPU topology evaluation for realPu Wen
Hygon processors with a model ID > 3 have CPUID leaf 0xB correctly populated and don't need the fixed package ID shift workaround. The fixup is also incorrect when running in a guest. Fixes: e0ceeae708ce ("x86/CPU/hygon: Fix phys_proc_id calculation logic for multi-die processors") Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/tencent_594804A808BD93A4EBF50A994F228E3A7F07@qq.com Link: https://lore.kernel.org/r/20230814085112.089607918@linutronix.de
2023-10-08x86/resctrl: Fix kernel-doc warningsRandy Dunlap
The kernel test robot reported kernel-doc warnings here: monitor.c:34: warning: Cannot understand * @rmid_free_lru A least recently used list of free RMIDs on line 34 - I thought it was a doc line monitor.c:41: warning: Cannot understand * @rmid_limbo_count count of currently unused but (potentially) on line 41 - I thought it was a doc line monitor.c:50: warning: Cannot understand * @rmid_entry - The entry in the limbo and free lists. on line 50 - I thought it was a doc line We don't have a syntax for documenting individual data items via kernel-doc, so remove the "/**" kernel-doc markers and add a hyphen for consistency. Fixes: 6a445edce657 ("x86/intel_rdt/cqm: Add RDT monitoring initialization") Fixes: 24247aeeabe9 ("x86/intel_rdt/cqm: Improve limbo list processing") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231006235132.16227-1-rdunlap@infradead.org
2023-10-05Merge tag 'v6.6-rc4' into x86/entry, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-03x86/boot: Move x86_cache_alignment initialization to correct spotDave Hansen
c->x86_cache_alignment is initialized from c->x86_clflush_size. However, commit fbf6449f84bf moved c->x86_clflush_size initialization to later in boot without moving the c->x86_cache_alignment assignment: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") This presumably left c->x86_cache_alignment set to zero for longer than it should be. The result was an oops on 32-bit kernels while accessing a pointer at 0x20. The 0x20 came from accessing a structure member at offset 0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10. kmalloc() can evidently return ZERO_SIZE_PTR when it's given 0 as its alignment requirement. Move the c->x86_cache_alignment initialization to be after c->x86_clflush_size has an actual value. Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20231002220045.1014760-1-dave.hansen@linux.intel.com
2023-09-29x86/cpu/amd: Remove redundant 'break' statementBaolin Liu
This break is after the return statement, so it is redundant & confusing, and should be deleted. Signed-off-by: Baolin Liu <liubaolin@kylinos.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/396ba14d.2726.189d957b74b.Coremail.liubaolin12138@163.com
2023-09-28x86/sgx: Resolves SECS reclaim vs. page fault for EAUG raceHaitao Huang
The SGX EPC reclaimer (ksgxd) may reclaim the SECS EPC page for an enclave and set secs.epc_page to NULL. The SECS page is used for EAUG and ELDU in the SGX page fault handler. However, the NULL check for secs.epc_page is only done for ELDU, not EAUG before being used. Fix this by doing the same NULL check and reloading of the SECS page as needed for both EAUG and ELDU. The SECS page holds global enclave metadata. It can only be reclaimed when there are no other enclave pages remaining. At that point, virtually nothing can be done with the enclave until the SECS page is paged back in. An enclave can not run nor generate page faults without a resident SECS page. But it is still possible for a #PF for a non-SECS page to race with paging out the SECS page: when the last resident non-SECS page A triggers a #PF in a non-resident page B, and then page A and the SECS both are paged out before the #PF on B is handled. Hitting this bug requires that race triggered with a #PF for EAUG. Following is a trace when it happens. BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:sgx_encl_eaug_page+0xc7/0x210 Call Trace: ? __kmem_cache_alloc_node+0x16a/0x440 ? xa_load+0x6e/0xa0 sgx_vma_fault+0x119/0x230 __do_fault+0x36/0x140 do_fault+0x12f/0x400 __handle_mm_fault+0x728/0x1110 handle_mm_fault+0x105/0x310 do_user_addr_fault+0x1ee/0x750 ? __this_cpu_preempt_check+0x13/0x20 exc_page_fault+0x76/0x180 asm_exc_page_fault+0x27/0x30 Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Kai Huang <kai.huang@intel.com> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20230728051024.33063-1-haitao.huang%40linux.intel.com
2023-09-28x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of ↵Adam Dunlap
a two-phase approach Instead of setting x86_virt_bits to a possibly-correct value and then correcting it later, do all the necessary checks before setting it. At this point, the #VC handler references boot_cpu_data.x86_virt_bits, and in the previous version, it would be triggered by the CPUIDs between the point at which it is set to 48 and when it is set to the correct value. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Adam Dunlap <acdunlap@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jacob Xu <jacobhxu@google.com> Link: https://lore.kernel.org/r/20230912002703.3924521-3-acdunlap@google.com
2023-09-28x86/srso: Add SRSO mitigation for Hygon processorsPu Wen
Add mitigation for the speculative return stack overflow vulnerability which exists on Hygon processors too. Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/tencent_4A14812842F104E93AA722EC939483CEFF05@qq.com
2023-09-22x86/cpu: Clear SVM feature if disabled by BIOSPaolo Bonzini
When SVM is disabled by BIOS, one cannot use KVM but the SVM feature is still shown in the output of /proc/cpuinfo. On Intel machines, VMX is cleared by init_ia32_feat_ctl(), so do the same on AMD and Hygon processors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230921114940.957141-1-pbonzini@redhat.com
2023-09-19x86/srso: Fix SBPB enablement for spec_rstack_overflow=offJosh Poimboeuf
If the user has requested no SRSO mitigation, other mitigations can use the lighter-weight SBPB instead of IBPB. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/b20820c3cfd1003171135ec8d762a0b957348497.1693889988.git.jpoimboe@kernel.org