path: root/drivers/platform/x86/intel/ifs/runtest.c
Age  Commit message  Author
2025-05-02  x86/msr: Add explicit includes of <asm/msr.h>  Xin Li (Intel)
For historic reasons there are some TSC-related functions in the <asm/msr.h> header, even though there's an <asm/tsc.h> header. To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h> to <asm/tsc.h> and to eventually eliminate the inclusion of <asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency to the source files that reference definitions from <asm/msr.h>. [ mingo: Clarified the changelog. ] Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-04-10  x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'  Ingo Molnar
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10  x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'  Ingo Molnar
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-12  Merge tag 'platform-drivers-x86-v6.11-3' into review-hans  Hans de Goede
Merge 'platform-drivers-x86-v6.11-3' into review-hans to avoid conflicts when merging further ideapad-laptop patches.

platform-drivers-x86 for v6.11-3

Fixes:
- ideapad-laptop / lenovo-ymc: Protect VPC calls with a mutex
- amd/pmf: Query HPD data also when ALS is disabled

The following is an automated shortlog grouped by driver:

amd/pmf:
- Fix to Update HPD Data When ALS is Disabled

ideapad-laptop:
- add a mutex to synchronize VPC commands
- introduce a generic notification chain
- move ymc_trigger_ec from lenovo-ymc
2024-08-12  trace: platform/x86/intel/ifs: Add SBAF trace support  Jithu Joseph
Add tracing support for the SBAF IFS tests, which may be useful for debugging systems that fail these tests. Log details like test content batch number, SBAF bundle ID, program index and the exact errors or warnings encountered by each HT thread during the test. Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240801051814.1935149-5-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-08-12  platform/x86/intel/ifs: Add SBAF test support  Jithu Joseph
In a core, the SBAF test engine is shared between sibling CPUs. An SBAF test image contains multiple bundles. Each bundle is further composed of subunits called programs. When a SBAF test (for a particular core) is triggered by the user, each SBAF bundle from the loaded test image is executed sequentially on all the threads on the core using the stop_core_cpuslocked mechanism. Each bundle execution is initiated by writing to MSR_ACTIVATE_SBAF. SBAF test bundle execution may be aborted when an interrupt occurs or if the CPU does not have enough power budget for the test. In these cases the kernel restarts the test from the aborted bundle. SBAF execution is not retried if the test fails or if the test makes no forward progress after 5 retries. Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240801051814.1935149-4-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
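The retry policy described above can be modeled in a few lines of userspace Python. This is only a sketch and not the driver's code: `run_sbaf`, the `activate` callback, the outcome strings and the returned status names are all hypothetical, and the way attempts are counted against the limit of 5 is an assumption.

```python
MAX_RETRIES = 5  # "no forward progress after 5 retries" (from the commit message)


def run_sbaf(num_bundles, activate):
    """Run bundles 0..num_bundles-1 in order; return 'pass', 'fail' or 'untested'.

    activate(bundle) stands in for the MSR_ACTIVATE_SBAF write. It returns
    (next_bundle, outcome), where outcome is 'ok', 'aborted' or 'failed'
    (hypothetical names) and next_bundle is where execution should resume.
    """
    bundle = 0
    retries_left = MAX_RETRIES
    while bundle < num_bundles:
        next_bundle, outcome = activate(bundle)
        if outcome == "failed":
            return "fail"  # a real test failure is never retried
        if next_bundle > bundle:
            retries_left = MAX_RETRIES  # forward progress resets the budget
        elif retries_left == 0:
            return "untested"  # repeated aborts without progress: give up
        else:
            retries_left -= 1  # retry from the aborted bundle
        bundle = next_bundle
    return "pass"
```

Passing the activation step in as a callback keeps the policy testable without hardware: a scripted callback can abort a given bundle a fixed number of times and the loop's decisions can be checked directly.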
2024-07-31  platform/x86/intel/ifs: Initialize union ifs_status to zero  Kuppuswamy Sathyanarayanan
If the IFS scan test exits prematurely due to a timeout before completing a single run, the union ifs_status remains uninitialized, leading to incorrect test status reporting. To prevent this, always initialize the union ifs_status to zero. Fixes: 2b40e654b73a ("platform/x86/intel/ifs: Add scan test support") Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240730155930.1754744-1-sathyanarayanan.kuppuswamy@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-04-29  platform/x86/intel/ifs: Classify error scenarios correctly  Jithu Joseph
"Scan controller error" means that the scan hardware encountered an error prior to doing an actual test on the target CPU. It does not mean that there is an actual cpu/core failure. "Scan signature failure" indicates that the test result on the target core did not match the expected value and should be treated as a cpu failure.

The current driver classifies both of these scenarios as failures. Modify the driver to classify a scan controller error with a more appropriate "untested" status instead of a "fail" status.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lore.kernel.org/r/20240412172349.544064-2-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-01-31  platform/x86/intel/ifs: Add an entry rendezvous for SAF  Ashok Raj
The activation for Scan at Field (SAF) includes a parameter to make microcode wait for both threads to join. It's preferable to perform an entry rendezvous before the activation to ensure that they start the `wrmsr` close enough to each other. In some cases it has been observed that one of the threads might be just a bit late to arrive. An entry rendezvous reduces the likelihood of these cases occurring. Add an entry rendezvous to ensure the activation on both threads happen close enough to each other. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20240125082254.424859-6-ashok.raj@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-01-31  platform/x86/intel/ifs: Replace the exit rendezvous with an entry rendezvous for ARRAY_BIST  Ashok Raj
ARRAY_BIST requires the test to be invoked only from one of the HT siblings of a core. If the other sibling was in mwait(), that didn't permit the test to complete and resulted in several retries before the test could finish.

The exit rendezvous was introduced to keep the HT sibling busy until the primary CPU completed the test to avoid those retries. What is actually needed is to ensure that both the threads rendezvous *before* the wrmsr to trigger the test to give good chance to complete the test.

The `stop_machine()` function returns only after all the CPUs complete running the function, and provides an exit rendezvous implicitly. In kernel/stop_machine.c::multi_cpu_stop(), every CPU in the mask needs to complete reaching MULTI_STOP_RUN. When all CPUs complete, the state machine moves to next state, i.e MULTI_STOP_EXIT. Thus the underlying API stop_core_cpuslocked() already provides an exit rendezvous.

Add the rendezvous earlier in order to ensure the wrmsr is triggered after all CPUs reach the do_array_test(). Remove the exit rendezvous since stop_core_cpuslocked() already guarantees that.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20240125082254.424859-5-ashok.raj@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
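The difference between an entry and an exit rendezvous can be illustrated with ordinary threads. The following is a userspace analogy only, assuming a `threading.Barrier` in place of the kernel's rendezvous and a hypothetical `trigger` callback in place of the wrmsr; `run_on_siblings` is not a kernel function.

```python
import threading


def run_on_siblings(cpus, primary, trigger):
    """Run one worker per sibling; only `primary` calls trigger(), after all arrive.

    Returns the ordered list of (event, cpu) tuples that were recorded.
    """
    barrier = threading.Barrier(len(cpus))  # the entry rendezvous
    events = []
    events_lock = threading.Lock()

    def worker(cpu):
        with events_lock:
            events.append(("arrived", cpu))
        barrier.wait()  # nobody proceeds until every sibling has arrived
        if cpu == primary:
            with events_lock:
                events.append(("trigger", cpu))
            trigger()  # stands in for the wrmsr that starts the test

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returning only after all workers finish plays the role of the exit rendezvous
    return events
```

In this model the joins at the end correspond to the implicit exit rendezvous the commit message attributes to stop_core_cpuslocked(), which is why only the barrier before the trigger has to be added explicitly.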
2024-01-31  platform/x86/intel/ifs: Add current batch number to trace output  Ashok Raj
Add the current batch number in the trace output. When there are failures, it's important to know which test content resulted in failure.

#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
     migration/0-18      [000] d..1. 527287.084668: ifs_status: batch: 02, start: 0000, stop: 007f, status: 0000000000007f80
   migration/128-785     [128] d..1. 527287.084669: ifs_status: batch: 02, start: 0000, stop: 007f, status: 0000000000007f80

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20240125082254.424859-4-ashok.raj@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
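When collecting these events from many cores, it can help to turn each `ifs_status` line into structured data. The sketch below is based only on the two sample lines in this commit message; `parse_ifs_status` is a hypothetical helper, the start, stop and status fields are assumed to be hexadecimal, and the batch field is kept as a string because its base is not stated here.

```python
import re

# Pattern derived from the sample trace lines; it is not an official format description.
IFS_STATUS_RE = re.compile(
    r"\[(?P<cpu>\d+)\].*ifs_status: batch: (?P<batch>\w+), "
    r"start: (?P<start>[0-9a-f]+), stop: (?P<stop>[0-9a-f]+), "
    r"status: (?P<status>[0-9a-f]+)"
)


def parse_ifs_status(line):
    """Return a dict for one ifs_status trace line, or None if it does not match."""
    match = IFS_STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "batch": match["batch"],  # left as text: the number base is an open question
        "start": int(match["start"], 16),
        "stop": int(match["stop"], 16),
        "status": int(match["status"], 16),
    }
```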
2024-01-31  platform/x86/intel/ifs: Trace on all HT threads when executing a test  Ashok Raj
Enable the trace function on all HT threads. Currently, the trace is called from some arbitrary CPU where the test was invoked. This change gives visibility to the exact errors as seen by each participating HT thread, and not just what was seen from the primary thread.

Sample output below.

#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
     migration/0-18      [000] d..1. 527287.084668: start: 0000, stop: 007f, status: 0000000000007f80
   migration/128-785     [128] d..1. 527287.084669: start: 0000, stop: 007f, status: 0000000000007f80

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20240125082254.424859-3-ashok.raj@intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-10-31  Merge tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86  Linus Torvalds
Pull x86 platform driver updates from Ilpo Järvinen:
- asus-wmi: Support for screenpad and solve brightness key press duplication
- int3472: Eliminate the last use of deprecated GPIO functions
- mlxbf-pmc: New HW support
- msi-ec: Support new EC configurations
- thinkpad_acpi: Support reading aux MAC address during passthrough
- wmi: Fixes & improvements
- x86-android-tablets: Detection fix and avoid use of GPIO private APIs
- Debug & metrics interface improvements
- Miscellaneous cleanups / fixes / improvements

* tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits)
  platform/x86: inspur-platform-profile: Add platform profile support
  platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e
  platform/x86: wmi: Decouple WMI device removal from wmi_block_list
  platform/x86: wmi: Fix opening of char device
  platform/x86: wmi: Fix probe failure when failing to register WMI devices
  platform/x86: wmi: Fix refcounting of WMI devices in legacy functions
  platform/x86: wmi: Decouple probe deferring from wmi_block_list
  platform/x86/amd/hsmp: Fix iomem handling
  platform/x86: asus-wmi: Do not report brightness up/down keys when also reported by acpi_video
  platform/x86: thinkpad_acpi: replace deprecated strncpy with memcpy
  tools/power/x86/intel-speed-select: v1.18 release
  tools/power/x86/intel-speed-select: Use cgroup isolate for CPU 0
  tools/power/x86/intel-speed-select: Increase max CPUs in one request
  tools/power/x86/intel-speed-select: Display error for core-power support
  tools/power/x86/intel-speed-select: No TRL for non compute domains
  tools/power/x86/intel-speed-select: turbo-mode enable disable swapped
  tools/power/x86/intel-speed-select: Update help for TRL
  tools/power/x86/intel-speed-select: Sanitize integer arguments
  platform/x86: acer-wmi: Remove void function return
  platform/x86/amd/pmc: Add dump_custom_stb module parameter
  ...
2023-10-06  platform/x86/intel/ifs: ARRAY BIST for Sierra Forest  Jithu Joseph
Array BIST MSR addresses, bit definition and semantics are different for Sierra Forest. Branch into a separate Array BIST flow on Sierra Forest when user invokes Array Test. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/r/20231005195137.3117166-10-jithu.joseph@intel.com [ij: ARRAY_GEN_* -> ARRAY_GEN* for consistency] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-10-06  platform/x86/intel/ifs: Add new error code  Jithu Joseph
Make driver aware of a newly added error code so that it can provide a more appropriate error message. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/r/20231005195137.3117166-9-jithu.joseph@intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-10-06  platform/x86/intel/ifs: Gen2 Scan test support  Jithu Joseph
The width of the chunk-related bitfields in the ACTIVATE_SCAN and SCAN_STATUS MSRs is different in newer IFS generations compared to gen0. Make changes to the scan test flow such that the MSRs are populated appropriately based on the generation supported by hardware. Account for the 8/16 bit MSR bitfield width differences between gen0 and newer generations for the scan test trace event too.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Link: https://lore.kernel.org/r/20231005195137.3117166-5-jithu.joseph@intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
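To make the 8-bit versus 16-bit difference concrete, here is a small userspace sketch of generation-dependent packing of the start and stop chunk numbers. The function names are hypothetical, and the layout (start in the lowest field, stop in the field directly above it) is an assumption for illustration; the commit message only states that the field widths differ.

```python
def chunk_field_width(generation):
    """gen0 uses 8-bit chunk fields; newer generations use 16-bit fields."""
    return 8 if generation == 0 else 16


def pack_chunks(start, stop, generation):
    """Pack start/stop chunk numbers into the low bits of a register value."""
    width = chunk_field_width(generation)
    mask = (1 << width) - 1
    if not (0 <= start <= mask and 0 <= stop <= mask):
        raise ValueError(f"chunk number does not fit in {width} bits")
    # Assumed layout: start in the low field, stop in the field right above it.
    return start | (stop << width)


def unpack_chunks(value, generation):
    """Inverse of pack_chunks: return (start, stop) for the given generation."""
    width = chunk_field_width(generation)
    mask = (1 << width) - 1
    return value & mask, (value >> width) & mask
```

Under these assumptions, the same pair (start 0, stop 0x7f) packs to 0x7f00 on gen0 and to 0x7f0000 on a newer generation, which is why both the MSR writes and the trace event need to know the generation.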
2023-10-04  platform/x86/intel/ifs: release cpus_read_lock()  Jithu Joseph
A couple of error paths in do_core_test() were returning directly without doing a necessary cpus_read_unlock().

The following lockdep warning was observed when exercising these scenarios with PROVE_RAW_LOCK_NESTING enabled:

[ 139.304775] ================================================
[ 139.311185] WARNING: lock held when returning to user space!
[ 139.317593] 6.6.0-rc2ifs01+ #11 Tainted: G S W I
[ 139.324499] ------------------------------------------------
[ 139.330908] bash/11476 is leaving the kernel with locks still held!
[ 139.338000] 1 lock held by bash/11476:
[ 139.342262] #0: ffffffffaa26c930 (cpu_hotplug_lock){++++}-{0:0}, at: do_core_test+0x35/0x1c0 [intel_ifs]

Fix the flow so that all scenarios release the lock prior to returning from the function.

Fixes: 5210fb4e1880 ("platform/x86/intel/ifs: Sysfs interface for Array BIST")
Cc: stable@vger.kernel.org
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20230927184824.2566086-1-jithu.joseph@intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-03-27  platform/x86/intel/ifs: Implement Array BIST test  Jithu Joseph
Array BIST test (for a particular core) is triggered by writing to MSR_ARRAY_BIST from one sibling of the core. This will initiate a test for all supported arrays on that CPU. Array BIST test may be aborted before completing all the arrays in the event of an interrupt or other reasons. In this case, kernel will restart the test from that point onwards. Array test will also be aborted when the test fails, in which case the test is stopped immediately without further retry. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20230322003359.213046-8-jithu.joseph@intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-03-27  platform/x86/intel/ifs: Sysfs interface for Array BIST  Jithu Joseph
The interface to trigger Array BIST test and obtain its result is similar to the existing scan test. The only notable difference is that, Array BIST doesn't require any test content to be loaded. So binary load related options are not needed for this test. Add sysfs interface for array BIST test, the testing support will be added by subsequent patch. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20230322003359.213046-7-jithu.joseph@intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-11-19  platform/x86/intel/ifs: Add current_batch sysfs entry  Jithu Joseph
The initial implementation assumed a single IFS test image file with a fixed name ff-mm-ss.scan (where ff, mm, ss refer to the family, model and stepping of the core).

Subsequently, it became evident that supporting more than one test image file is needed to provide more comprehensive test coverage. (Test coverage in this scenario refers to testing more transistors in the core to identify faults.)

The other alternative of increasing the size of a single scan test image file would not work, as the upper bound is limited by the size of the memory area reserved by BIOS for loading the IFS test image.

Introduce a "current_batch" file which accepts a number. Writing a number to the current_batch file would load the test image file by name ff-mm-ss-<xy>.scan, where <xy> is the number written to the "current_batch" file in hex. The input is range-checked to verify that it is not greater than 0xff.

For example, if the scan test image comprises 6 files, they would be named:

  06-8f-06-01.scan
  06-8f-06-02.scan
  06-8f-06-03.scan
  06-8f-06-04.scan
  06-8f-06-05.scan
  06-8f-06-06.scan

And writing 3 to current_batch would result in loading 06-8f-06-03.scan above. The file can also be read to know the currently loaded file.

And testing a system looks like:

  for each scan file
  do
          load the IFS test image file (write to the batch file)
          for each core
          do
                  test the core with this set of tests
          done
  done

Qualify a few error messages with the test image file suffix to provide better context.

[ bp: Massage commit message. Add link to the discussion. ]

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20221107225323.2733518-13-jithu.joseph@intel.com
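The naming rule and the nested test loop above can be sketched in userspace Python. `scan_image_name` and `plan_runs` are hypothetical helpers written for illustration, not part of the driver or its sysfs interface; the only facts taken from the commit message are the ff-mm-ss-<xy>.scan pattern, the hex batch suffix and the 0xff upper bound. Whether batch 0 is meaningful is not stated, so the lower bound used here is an assumption.

```python
def scan_image_name(family, model, stepping, batch):
    """Build the test image file name ff-mm-ss-xy.scan for a batch number."""
    if not 0 <= batch <= 0xFF:
        raise ValueError(f"batch {batch:#x} is out of range (max 0xff)")
    return f"{family:02x}-{model:02x}-{stepping:02x}-{batch:02x}.scan"


def plan_runs(family, model, stepping, batches, cores):
    """Yield (image, core) pairs in the order the commit message describes."""
    for batch in batches:
        # Outer loop: one image load per batch (a write to current_batch).
        image = scan_image_name(family, model, stepping, batch)
        for core in cores:
            # Inner loop: every core is tested against the loaded image.
            yield image, core
```

For instance, `scan_image_name(0x06, 0x8f, 0x06, 3)` gives the 06-8f-06-03.scan name used in the example above. Loading each image once and then sweeping all cores keeps the number of image loads equal to the number of batch files.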
2022-05-12  trace: platform/x86/intel/ifs: Add trace point to track Intel IFS operations  Tony Luck
Add tracing support which may be useful for debugging systems that fail to complete In Field Scan tests. Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220506225410.1652287-11-tony.luck@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12  platform/x86/intel/ifs: Add scan test support  Jithu Joseph
In a core, the scan engine is shared between sibling CPUs.

When a scan test (for a particular core) is triggered by the user, the scan chunks are executed on all the threads on the core using stop_core_cpuslocked.

The scan test will be aborted in certain circumstances, such as when an interrupt occurs or the CPU does not have enough power budget for the scan. In this case, the kernel restarts the scan from the chunk where it stopped. The scan will also be aborted when the test fails. In this case, the test is stopped immediately without retry.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-9-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>