From 604dc9170f2435d27da5039a3efd757dceadc684 Mon Sep 17 00:00:00 2001 From: Daniel Drake Date: Thu, 9 May 2019 13:54:15 +0800 Subject: x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency native_calibrate_tsc() had a data mapping Intel CPU families and crystal clock speed, but hardcoded tables are not ideal, and this approach was already problematic at least in the Skylake X case, as seen in commit: b51120309348 ("x86/tsc: Fix erroneous TSC rate on Skylake Xeon") By examining CPUID data from http://instlatx64.atw.hu/ and units in the lab, we have found that 3 different scenarios need to be dealt with, and we can eliminate most of the hardcoded data using an approach a little more advanced than before: 1. ApolloLake, GeminiLake, CannonLake (and presumably all new chipsets from this point) report the crystal frequency directly via CPUID.0x15. That's definitive data that we can rely upon. 2. Skylake, Kabylake and all variants of those two chipsets report a crystal frequency of zero, however we can calculate the crystal clock speed by condidering data from CPUID.0x16. This method correctly distinguishes between the two crystal clock frequencies present on different Skylake X variants that caused headaches before. As the calculations do not quite match the previously-hardcoded values in some cases (e.g. 23913043Hz instead of 24MHz), TSC refinement is enabled on all platforms where we had to calculate the crystal frequency in this way. 3. Denverton (GOLDMONT_X) reports a crystal frequency of zero and does not support CPUID.0x16, so we leave this entry hardcoded. Suggested-by: Thomas Gleixner Signed-off-by: Daniel Drake Reviewed-by: Thomas Gleixner Cc: Andy Lutomirski Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: len.brown@intel.com Cc: linux@endlessm.com Cc: rafael.j.wysocki@intel.com Link: http://lkml.kernel.org/r/20190509055417.13152-1-drake@endlessm.com Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/tsc.c | 47 +++++++++++++++++++++++++++-------------------- 1 file changed, 27 insertions(+), 20 deletions(-) (limited to 'arch/x86/kernel/tsc.c') diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 15b5e98a86f9..6e6d933fb99c 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -631,31 +631,38 @@ unsigned long native_calibrate_tsc(void) crystal_khz = ecx_hz / 1000; - if (crystal_khz == 0) { - switch (boot_cpu_data.x86_model) { - case INTEL_FAM6_SKYLAKE_MOBILE: - case INTEL_FAM6_SKYLAKE_DESKTOP: - case INTEL_FAM6_KABYLAKE_MOBILE: - case INTEL_FAM6_KABYLAKE_DESKTOP: - crystal_khz = 24000; /* 24.0 MHz */ - break; - case INTEL_FAM6_ATOM_GOLDMONT_X: - crystal_khz = 25000; /* 25.0 MHz */ - break; - case INTEL_FAM6_ATOM_GOLDMONT: - crystal_khz = 19200; /* 19.2 MHz */ - break; - } - } + /* + * Denverton SoCs don't report crystal clock, and also don't support + * CPUID.0x16 for the calculation below, so hardcode the 25MHz crystal + * clock. + */ + if (crystal_khz == 0 && + boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT_X) + crystal_khz = 25000; - if (crystal_khz == 0) - return 0; /* - * TSC frequency determined by CPUID is a "hardware reported" + * TSC frequency reported directly by CPUID is a "hardware reported" * frequency and is the most accurate one so far we have. This * is considered a known frequency. */ - setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ); + if (crystal_khz != 0) + setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ); + + /* + * Some Intel SoCs like Skylake and Kabylake don't report the crystal + * clock, but we can easily calculate it to a high degree of accuracy + * by considering the crystal ratio and the CPU speed. + */ + if (crystal_khz == 0 && boot_cpu_data.cpuid_level >= 0x16) { + unsigned int eax_base_mhz, ebx, ecx, edx; + + cpuid(0x16, &eax_base_mhz, &ebx, &ecx, &edx); + crystal_khz = eax_base_mhz * 1000 * + eax_denominator / ebx_numerator; + } + + if (crystal_khz == 0) + return 0; /* * For Atom SoCs TSC is the only reliable clocksource. -- cgit From 2420a0b1798d7a78d1f9b395f09f3c80d92cc588 Mon Sep 17 00:00:00 2001 From: Daniel Drake Date: Thu, 9 May 2019 13:54:17 +0800 Subject: x86/tsc: Set LAPIC timer period to crystal clock frequency The APIC timer calibration (calibrate_APIC_timer()) can be skipped in cases where we know the APIC timer frequency. On Intel SoCs, we believe that the APIC is fed by the crystal clock; this would make sense, and the crystal clock frequency has been verified against the APIC timer calibration result on ApolloLake, GeminiLake, Kabylake, CoffeeLake, WhiskeyLake and AmberLake. Set lapic_timer_period based on the crystal clock frequency accordingly. APIC timer calibration would normally be skipped on modern CPUs by nature of the TSC deadline timer being used instead, however this change is still potentially useful, e.g. if the TSC deadline timer has been disabled with a kernel parameter. calibrate_APIC_timer() uses the legacy timer, but we are seeing new platforms that omit such legacy functionality, so avoiding such codepaths is becoming more important. Suggested-by: Thomas Gleixner Signed-off-by: Daniel Drake Reviewed-by: Thomas Gleixner Cc: Andy Lutomirski Cc: Borislav Petkov Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: len.brown@intel.com Cc: linux@endlessm.com Cc: rafael.j.wysocki@intel.com Link: http://lkml.kernel.org/r/20190509055417.13152-3-drake@endlessm.com Link: https://lkml.kernel.org/r/20190419083533.32388-1-drake@endlessm.com Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1904031206440.1967@nanos.tec.linutronix.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/tsc.c | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'arch/x86/kernel/tsc.c') diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 6e6d933fb99c..8f47c4862c56 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -671,6 +671,16 @@ unsigned long native_calibrate_tsc(void) if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT) setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE); +#ifdef CONFIG_X86_LOCAL_APIC + /* + * The local APIC appears to be fed by the core crystal clock + * (which sounds entirely sensible). We can set the global + * lapic_timer_period here to avoid having to calibrate the APIC + * timer later. + */ + lapic_timer_period = crystal_khz * 1000 / HZ; +#endif + return crystal_khz * ebx_numerator / eax_denominator; } -- cgit