summaryrefslogtreecommitdiff
path: root/drivers/thermal/intel
AgeCommit message (Collapse)Author
2022-08-03thermal: intel: Add TCC cooling support for Alder Lake-N and Raptor Lake-PSumeet Pawnikar
Add Alder Lake-N and Raptor Lake-P to the list of processor models supported by the Intel TCC cooling driver. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-22intel: thermal: PCH: Drop ACPI_FADT_LOW_POWER_S0 checkRafael J. Wysocki
If ACPI_FADT_LOW_POWER_S0 is not set, this doesn't mean that low-power S0 idle is not usable. It merely means that using S3 on the given system is more beneficial from the energy saving perspective than using low-power S0 idle, as long as S3 is supported. Suspend-to-idle is still a valid suspend mode if ACPI_FADT_LOW_POWER_S0 is not set and the pm_suspend_via_firmware() check in pch_wpt_suspend() is sufficient to distinguish suspend-to-idle from S3, so drop the confusing ACPI_FADT_LOW_POWER_S0 check. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com>
2022-07-12thermal: intel: x86_pkg_temp_thermal: Drop duplicate 'is' from commentJiang Jian
There is an unexpected word 'is' in a comments that need to be dropped file: ./drivers/thermal/intel/x86_pkg_temp_thermal.c line: 108 * tj-max is is interesting because threshold is set relative to this changed to: * tj-max is interesting because threshold is set relative to this Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-01Merge tag 'thermal-5.19-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Add a new CPU ID to the list of supported processors in the intel_tcc_cooling driver (Sumeet Pawnikar)" * tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
2022-06-30thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLakeSumeet Pawnikar
Add RaptorLake to the list of processor models supported by the Intel TCC cooling driver. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> [ rjw: Subject edits, new changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-30Merge tag 'thermal-5.19-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull additional thermal control update from Rafael Wysocki: "Add Meteor Lake PCI device ID to the int340x thermal control driver (Sumeet Pawnikar)" * tag 'thermal-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: int340x: Add Meteor Lake PCI device ID
2022-05-25thermal: int340x: Add Meteor Lake PCI device IDSumeet Pawnikar
Add Meteor Lake PCI ID for processor thermal device. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-25ACPI: DPTF: Support Meteor LakeSumeet Pawnikar
Add Meteor Lake ACPI IDs for DPTF devices. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-23Merge branches 'thermal-int340x', 'thermal-pch' and 'thermal-misc'Rafael J. Wysocki
Merge int340x thermal driver updates, PCH thermal driver updates and miscellaneous thermal control updates for 5.19-rc1: - Clean up _OSC handling in int340x (Davidlohr Bueso). - Improve overheat condition handling during suspend-to-idle in the Intel PCH thermal driver (Zhang Rui). - Use local ops instead of global ops in devfreq_cooling (Kant Fan). - Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr() (Hesham Almatary) * thermal-int340x: thermal: int340x: Clean up _OSC context init thermal: int340x: Consolidate freeing of acpi_buffer pointer thermal: int340x: Clean up unnecessary acpi_buffer pointer freeing * thermal-pch: thermal: intel: pch: improve the cooling delay log thermal: intel: pch: enhance overheat handling thermal: intel: pch: move cooling delay to suspend_noirq phase PM: wakeup: expose pm_wakeup_pending to modules * thermal-misc: thermal: devfreq_cooling: use local ops instead of global ops thermal: hisi_termal: Switch from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
2022-05-23Merge back earlier thermal control updates for 5.19-rc1.Rafael J. Wysocki
2022-05-19thermal: intel: pch: improve the cooling delay logZhang Rui
Previously, during suspend, intel_pch_thermal driver logs for every cooling iteration, about the current PCH temperature and number of cooling iterations that have been tried, like below [ 100.955526] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 1 times for 100 ms duration [ 101.064156] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 2 times for 100 ms duration After changing the default delay_cnt to 600, in practice, it is common to see tens of the above messages if the system is suspended when PCH overheats. Thus, change this log message from dev_warn to dev_dbg because it is only useful when we want to check the temperature trend. At the same time, there is always a one-line message given by the driver with the patch applied, with below four possibilities. 1. PCH is cool, no cooling delay needed [ 1791.902853] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C] 2. PCH overheats and becomes cool after the cooling delays [ 1475.511617] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay 3. PCH still overheats after the overall cooling timeout [ 2250.157487] intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail 4. PCH aborts cooling because of wakeup event detected during the delay [ 1933.639509] intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19thermal: intel: pch: enhance overheat handlingZhang Rui
Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold") introduces delay loop mechanism that allows PCH temperature to go down below threshold during suspend so it won't block S0ix. And the default overall delay timeout is 1 second. However, in practice, we found that the time it takes to cool the PCH down below threshold highly depends on the initial PCH temperature when the delay starts, as well as the ambient temperature. And in some cases, the 1 second delay is not sufficient. As a result, the system stays in a shallower power state like PCx instead of S0ix, and drains the battery power, without user' notice. To make sure S0ix is not blocked by the PCH overheating, we 1. expand the default overall timeout to 60 seconds. 2. make sure the temperature is below threshold rather than equal to it. At the same time, as the cooling delay can be much longer and many wakeup events (ACPI Power Button press, USB mouse move, etc) becomes valid in the suspend_noirq phase, add detection of wakeup event so that the driver does not delay blindly when the system suspend is likely to abort soon. This patch may introduce longer suspend time, but only in the cases when the system overheats and Linux used to enter a shallower S2idle state, say, PCx instead of S0ix. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19thermal: intel: pch: move cooling delay to suspend_noirq phaseZhang Rui
Move the PCH Thermal driver suspend callback to suspend_noirq to do cooling while the system is more quiescent. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-18thermal: intel: hfi: remove NULL check after container_of() callHaowen Bai
container_of() will never return NULL, so remove useless code. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-18Merge back earlier int340x driver changes for 5.19.Rafael J. Wysocki
2022-05-11thermal: int340x: Mode setting with new OS handshakeSrinivas Pandruvada
With the new OS handshake introduced by commit: "c7ff29763989 ("thermal: int340x: Update OS policy capability handshake")", the "enabled" thermal zone mode doesn't work in the same way as previously. The "enabled" mode fails with -EINVAL when the new handshake is used. To address this issue, when the new OS UUID mask is set: - When the mode is "enabled", return 0 as the firmware already has the latest policy mask. - When the mode is "disabled", update the firmware with the UUID mask of zero. This way, the firmware can take over the thermal control. Also reset the OS UUID mask, which allows user space to update with new set of policies. Fixes: c7ff29763989 ("thermal: int340x: Update OS policy capability handshake") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog edits, removed unneeded parens ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-05Merge back earlier int340x thermal driver changes for 5.19.Rafael J. Wysocki
2022-04-21thermal: int340x: Fix attr.show callback prototypeKees Cook
Control Flow Integrity (CFI) instrumentation of the kernel noticed that the caller, dev_attr_show(), and the callback, odvp_show(), did not have matching function prototypes, which would cause a CFI exception to be raised. Correct the prototype by using struct device_attribute instead of struct kobj_attribute. Reported-and-tested-by: Joao Moreira <joao@overdrivepizza.com> Link: https://lore.kernel.org/lkml/067ce8bd4c3968054509831fa2347f4f@overdrivepizza.com/ Fixes: 006f006f1e5c ("thermal/int340x_thermal: Export OEM vendor variables") Cc: 5.8+ <stable@vger.kernel.org> # 5.8+ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05thermal: int340x: Clean up _OSC context initDavidlohr Bueso
Now that the UUID is already sanitized by the caller, lets trivially clean up some of the context arming. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Zhang Rui <rui.zhang@intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05thermal: int340x: Consolidate freeing of acpi_buffer pointerDavidlohr Bueso
Introduce a single point of freeing/exit after ensuring no error in int3400_setup_gddv(). Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-05thermal: int340x: Clean up unnecessary acpi_buffer pointer freeingDavidlohr Bueso
It is the caller's responsibility to free only upon ACPI_SUCCESS. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Zhang Rui <rui.zhang@intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-18Merge branch 'thermal-hfi'Rafael J. Wysocki
Merge Intel Hardware Feedback Interface (HFI) thermal driver for 5.18-rc1 and update the intel-speed-select utility to support that driver. * thermal-hfi: tools/power/x86/intel-speed-select: v1.12 release tools/power/x86/intel-speed-select: HFI support tools/power/x86/intel-speed-select: OOB daemon mode thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub thermal: intel: hfi: Notify user space for HFI events thermal: netlink: Add a new event to notify CPU capabilities change thermal: intel: hfi: Enable notification interrupt thermal: intel: hfi: Handle CPU hotplug events thermal: intel: hfi: Minimally initialize the Hardware Feedback Interface x86/cpu: Add definitions for the Intel Hardware Feedback Interface x86/Documentation: Describe the Intel Hardware Feedback Interface
2022-03-18Merge branches 'thermal-powerclamp', 'thermal-int340x' and 'thermal-docs'Rafael J. Wysocki
Merge powerclamp thermal driver changes, int340x thermal driver changes and thermal documentation changes for 5.18-rc1: - Don't use bitmap_weight() in end_power_clamp() in the powerclamp driver (Yury Norov). - Update the OS policy capabilities handshake in the int340x thermal driver (Srinivas Pandruvada). - Increase the policies bitmap size in int340x (Srinivas Pandruvada). - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the int340x thermal driver (Rafael Wysocki). - Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang). - Add Intel Dynamic Power and Thermal Framework (DPTF) kernel interface documentation (Srinivas Pandruvada). - Fix bullet list warning in the thermal documentation (Randy Dunlap). * thermal-powerclamp: thermal: intel_powerclamp: don't use bitmap_weight() in end_power_clamp() * thermal-int340x: thermal: int340x: Update OS policy capability handshake thermal: int340x: Increase bitmap size thermal: Replace acpi_bus_get_device() thermal: int340x: Check for NULL after calling kmemdup() * thermal-docs: Documentation: thermal: DPTF Documentation thermal: fix Documentation bullet list warning
2022-03-16thermal: int340x: Update OS policy capability handshakeSrinivas Pandruvada
Update the firmware with OS supported policies mask, so that firmware can relinquish its internal controls. Without this update several Tiger Lake laptops gets performance limited with in few seconds of executing in turbo region. The existing way of enumerating firmware policies via IDSP method and selecting policy by directly writing those policy UUIDS via _OSC method is not supported in newer generation of hardware. There is a new UUID "B23BA85D-C8B7-3542-88DE-8DE2FFCFD698" is defined for updating policy capabilities. As part of ACPI _OSC method: Arg0 - UUID: B23BA85D-C8B7-3542-88DE-8DE2FFCFD698 Arg1 - Rev ID: 1 Arg2 - Count: 2 Arg3 - Capability buffers: Array of Arg2 DWORDS DWORD1: As defined in the ACPI 5.0 Specification - Bit 0: Query Flag - Bits 1-3: Always 0 - Bits 4-31: Reserved DWORD2 and beyond: - Bit0: set to 1 to indicate Intel(R) Dynamic Tuning is active, 0 to indicate it is disabled and legacy thermal mechanism should be enabled. - Bit1: set to 1 to indicate Intel(R) Dynamic Tuning is controlling active cooling, 0 to indicate bios shall enable legacy thermal zone with active trip point. - Bit2: set to 1 to indicate Intel(R) Dynamic Tuning is controlling passive cooling, 0 to indicate bios shall enable legacy thermal zone with passive trip point. - Bit3: set to 1 to indicate Intel(R) Dynamic Tuning is handling critical trip point, 0 to indicate bios shall enable legacy thermal zone with critical trip point. - Bits 4:31: Reserved From sysfs interface, there is an existing interface to update policy UUID using attribute "current_uuid". User space can write the same UUID for ACTIVE, PASSIVE and CRITICAL policy. Driver converts these UUIDs to DWORD2 Bit 1 to Bit 3. When any of the policy is activated by user space it is assumed that dynamic tuning is active. For example $cd /sys/bus/platform/devices/INTC1040:00/uuids To support active policy $echo "3A95C389-E4B8-4629-A526-C52C88626BAE" > current_uuid To support passive policy $echo "42A441D6-AE6A-462b-A84B-4A8CE79027D3" > current_uuid To support critical policy $echo "97C68AE7-15FA-499c-B8C9-5DA81D606E0A" > current_uuid To check all the supported policies $cat current_uuid 3A95C389-E4B8-4629-A526-C52C88626BAE 42A441D6-AE6A-462b-A84B-4A8CE79027D3 97C68AE7-15FA-499c-B8C9-5DA81D606E0A To match the bit format for DWORD2, rearranged enum int3400_thermal_uuid and int3400_thermal_uuids[] by swapping current INT3400_THERMAL_ACTIVE and INT3400_THERMAL_PASSIVE_1. If the policies are enumerated via IDSP method then legacy method is used, if not the new method is used to update policy support. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-16thermal: int340x: Increase bitmap sizeSrinivas Pandruvada
The number of policies are 10, so can't be supported by the bitmap size of u8. Even though there are no platfoms with these many policies, but for correctness increase to u32. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 16fc8eca1975 ("thermal/int340x_thermal: Add additional UUIDs") Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-28Merge back int340x thermal driver changes for v5.18.Rafael J. Wysocki
2022-02-24thermal: int340x: fix memory leak in int3400_notify()Chuansheng Liu
It is easy to hit the below memory leaks in my TigerLake platform: unreferenced object 0xffff927c8b91dbc0 (size 32): comm "kworker/0:2", pid 112, jiffies 4294893323 (age 83.604s) hex dump (first 32 bytes): 4e 41 4d 45 3d 49 4e 54 33 34 30 30 20 54 68 65 NAME=INT3400 The 72 6d 61 6c 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 rmal.kkkkkkkkkk. backtrace: [<ffffffff9c502c3e>] __kmalloc_track_caller+0x2fe/0x4a0 [<ffffffff9c7b7c15>] kvasprintf+0x65/0xd0 [<ffffffff9c7b7d6e>] kasprintf+0x4e/0x70 [<ffffffffc04cb662>] int3400_notify+0x82/0x120 [int3400_thermal] [<ffffffff9c8b7358>] acpi_ev_notify_dispatch+0x54/0x71 [<ffffffff9c88f1a7>] acpi_os_execute_deferred+0x17/0x30 [<ffffffff9c2c2c0a>] process_one_work+0x21a/0x3f0 [<ffffffff9c2c2e2a>] worker_thread+0x4a/0x3b0 [<ffffffff9c2cb4dd>] kthread+0xfd/0x130 [<ffffffff9c201c1f>] ret_from_fork+0x1f/0x30 Fix it by calling kfree() accordingly. Fixes: 38e44da59130 ("thermal: int3400_thermal: process "thermal table changed" event") Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-10thermal: intel: hfi: INTEL_HFI_THERMAL depends on NETRandy Dunlap
THERMAL_NETLINK depends on NET and since 'select' does not follow any dependency chain, INTEL_HFI_THERMAL also should depend on NET. Fix one Kconfig warning and 48 subsequent build errors: WARNING: unmet direct dependencies detected for THERMAL_NETLINK Depends on [n]: THERMAL [=y] && NET [=n] Selected by [y]: - INTEL_HFI_THERMAL [=y] && THERMAL [=y] && (X86 [=y] || X86_INTEL_QUARK [=n] || COMPILE_TEST [=y]) && CPU_SUP_INTEL [=y] && X86_THERMAL_VECTOR [=y] Fixes: bd30cdfd9bd7 ("thermal: intel: hfi: Notify user space for HFI events") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-04thermal: Replace acpi_bus_get_device()Rafael J. Wysocki
Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-04thermal: intel_powerclamp: don't use bitmap_weight() in end_power_clamp()Yury Norov
Don't call bitmap_weight() if the following code can get by without it. Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-04thermal: int340x: Check for NULL after calling kmemdup()Jiasheng Jiang
As the potential failure of the allocation, kmemdup() may return NULL. Then, 'bin_attr_data_vault.private' will be NULL, but 'bin_attr_data_vault.size' is not 0, which is not consistent. Therefore, it is better to check the return value of kmemdup() to avoid the confusion. Fixes: 0ba13c763aac ("thermal/int340x_thermal: Export GDDV") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-03thermal: intel: hfi: Notify user space for HFI eventsSrinivas Pandruvada
When the hardware issues an HFI event, relay a notification to user space. This allows user space to respond by reading performance and efficiency of each CPU and take appropriate action. For example, when the performance and efficiency of a CPU is 0, user space can either offline the CPU or inject idle. Also, if user space notices a downward trend in performance, it may proactively adjust power limits to avoid future situations in which performance drops to 0. To avoid excessive notifications, the rate is limited by one HZ per event. To limit the netlink message size, send parameters for up to 16 CPUs in a single message. If there are more than 16 CPUs, issue as many messages as needed to notify the status of all CPUs. In the HFI specification, both performance and efficiency capabilities are defined in the [0, 255] range. The existing implementations of HFI hardware do not scale the maximum values to 255. Since userspace cares about capability values that are either 0 or show a downward/upward trend, this fact does not matter much. Relative changes in capabilities are enough. To comply with the thermal netlink ABI, scale both performance and efficiency capabilities to the [0, 1023] interval. Reviewed-by: Len Brown <len.brown@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-03thermal: intel: hfi: Enable notification interruptRicardo Neri
When hardware wants to inform the operating system about updates in the HFI table, it issues a package-level thermal event interrupt. For this, hardware has new interrupt and status bits in the IA32_PACKAGE_THERM_ INTERRUPT and IA32_PACKAGE_THERM_STATUS registers. The existing thermal throttle driver already handles thermal event interrupts: it initializes the thermal vector of the local APIC as well as per-CPU and package-level interrupt reporting. It also provides routines to service such interrupts. Extend its functionality to also handle HFI interrupts. The frequency of the thermal HFI interrupt is specific to each processor model. On some processors, a single interrupt happens as soon as the HFI is enabled and hardware will never update HFI capabilities afterwards. On other processors, thermal and power constraints may cause thermal HFI interrupts every tens of milliseconds. To not overwhelm consumers of the HFI data, use delayed work to throttle the rate at which HFI updates are processed. Use a dedicated workqueue to not overload system_wq if hardware issues many HFI updates. Reviewed-by: Len Brown <len.brown@intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-03thermal: intel: hfi: Handle CPU hotplug eventsRicardo Neri
All CPUs in a package are represented in an HFI table. There exists an HFI table per package. Thus, CPUs in a package need to coordinate to initialize and access the table. Do such coordination during CPU hotplug. Use the first CPU to come online in a package to initialize the HFI instance and the data structure representing it. Other CPUs in the same package need only to register or unregister themselves in that data structure. The HFI depends on both the package-level thermal management and the local APIC thermal local vector. Thus, to ensure that a CPU coming online has an associated HFI instance when the hardware issues an HFI event, enable the HFI only after having enabled the local APIC thermal vector. The thermal throttle driver takes care of the needed package-level initialization. Reviewed-by: Len Brown <len.brown@intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-03thermal: intel: hfi: Minimally initialize the Hardware Feedback InterfaceRicardo Neri
The Intel Hardware Feedback Interface provides guidance to the operating system about the performance and energy efficiency capabilities of each CPU in the system. Capabilities are numbers between 0 and 255 where a higher number represents a higher capability. For each CPU, energy efficiency and performance are reported as separate capabilities. Hardware computes these capabilities based on the operating conditions of the system such as power and thermal limits. These capabilities are shared with the operating system in a table resident in memory. Each package in the system has its own HFI instance. Every logical CPU in the package is represented in the table. More than one logical CPUs may be represented in a single table entry. When the hardware updates the table, it generates a package-level thermal interrupt. The size and format of the HFI table depend on the supported features and can only be determined at runtime. To minimally initialize the HFI, parse its features and allocate one instance per package of a data structure with the necessary parameters to read and navigate a local copy (i.e., owned by the driver) of individual HFI tables. A subsequent changeset will provide per-CPU initialization and interrupt handling. Reviewed-by: Len Brown <len.brown@intel.com> Co-developed by: Aubrey Li <aubrey.li@linux.intel.com> Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-17thermal: int340x: Add Raptor Lake PCI device idSrinivas Pandruvada
Add Raptor Lake PCI ID for processor thermal device. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-17thermal: int340x: Support Raptor LakeSrinivas Pandruvada
Add Raptor Lake ACPI IDs for DPTF devices. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-10Merge branch 'thermal-int340x'Rafael J. Wysocki
Merge int340x thermal driver update fixing RFIM mailbox write commands handling for 5.17-rc1. * thermal-int340x: thermal/drivers/int340x: Fix RFIM mailbox write commands
2021-12-30thermal/drivers/int340x: Fix RFIM mailbox write commandsSumeet Pawnikar
The existing mail mechanism only supports writing of workload types. However, mailbox command for RFIM (cmd = 0x08) also requires write operation which is ignored. This results in failing to store RFI restriction. Fixint this requires enhancing mailbox writes for non workload commands too, so remove the check for MBOX_CMD_WORKLOAD_TYPE_WRITE in mailbox write to allow this other write commands to be supoorted. At the same time, however, we have to make sure that there is no impact on read commands, by avoiding to write anything into the mailbox data register. To properly implement that, add two separate functions for mbox read and write commands for the processor thermal workload command type. This helps to distinguish the read and write workload command types from each other while sending mbox commands. Fixes: 5d6fbc96bd36 ("thermal/drivers/int340x: processor_thermal: Export additional attributes") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: 5.14+ <stable@vger.kernel.org> # 5.14+ Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-27Merge tag 'thermal-v5.17-rc1' of ↵Rafael J. Wysocki
https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal control material for 5.17-rc1 from Daniel Lezcano: - Fix PM issue on the iMX driver when suspend/resume is happening by implementing PM runtime support (Oleksij Rempel) - Add 'const' annotation to the thermal_cooling_ops in the Intel powerclamp driver (Rikard Falkeborn) - Add TSU driver and bindings for the RZ/G2L platform (Biju Das) - Fix missing ADC bit set on iMX8MP to enable the sensor (Paul Gerber) - Fix missing check when calling reset_control_deassert() (Biju Das) * tag 'thermal-v5.17-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: thermal/drivers/rz2gl: Add error check for reset_control_deassert() thermal/drivers/imx8mm: Enable ADC when enabling monitor thermal/drivers: Add TSU driver for RZ/G2L dt-bindings: thermal: Document Renesas RZ/G2L TSU thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_ops thermal/drivers/imx: Implement runtime PM support
2021-12-14Merge back int340x driver material for 5.17.Rafael J. Wysocki
2021-12-08thermal: int340x: Fix VCoRefLow MMIO bit offset for TGLSumeet Pawnikar
The VCoRefLow CPU FIVR register definition for Tiger Lake is incorrect. Current implementation reads it from MMIO offset 0x5A18 and bit offset [12:14], but the actual correct register definition is from bit offset [11:13]. Update to fix the bit offset. Fixes: 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: 5.14+ <stable@vger.kernel.org> # 5.14+ [ rjw: New subject, changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-30thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_opsRikard Falkeborn
The only usage of powerclamp_cooling_ops is to pass its address to thermal_cooling_device_register(), which takes a pointer to const struct thermal_cooling_device_ops. Make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20211128214641.30953-1-rikard.falkeborn@gmail.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-11-24thermal: int340x: Use struct_group() for memcpy() regionKees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), avoid intentionally writing across neighboring fields. Use struct_group() in struct art around members weight, and ac[0-9]_max, so they can be referenced together. This will allow memcpy() and sizeof() to more easily reason about sizes, improve readability, and avoid future warnings about writing beyond the end of weight. "pahole" shows no size nor member offset changes to struct art. "objdump -d" shows no meaningful object code changes (i.e. only source line number induced differences). Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-16thermal: int340x: Limit Kconfig to 64-bitArnd Bergmann
32-bit processors cannot generally access 64-bit MMIO registers atomically, and it is unknown in which order the two halves of this registers would need to be read: drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c: In function 'send_mbox_cmd': drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c:79:37: error: implicit declaration of function 'readq'; did you mean 'readl'? [-Werror=implicit-function-declaration] 79 | *cmd_resp = readq((void __iomem *) (proc_priv->mmio_base + MBOX_OFFSET_DATA)); | ^~~~~ | readl The driver already does not build for anything other than x86, so limit it further to x86-64. Fixes: aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-12thermal: int340x: fix build on 32-bit targetsLinus Torvalds
Commit aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses") started using 'readq()' to read 64-bit status responses from the int340x hardware. That's all fine and good, but on 32-bit targets a 64-bit 'readq()' is ambiguous, since it's no longer an atomic access. Some hardware might require 64-bit accesses, and other hardware might want low word first or high word first. It's quite likely that the driver isn't relevant in a 32-bit environment any more, and there's a patch floating around to just make it depend on X86_64, but let's make it buildable on x86-32 anyway. The driver previously just read the low 32 bits, so the hardware certainly is ok with 32-bit reads, and in a little-endian environment the low word first model is the natural one. So just add the include for the 'io-64-nonatomic-lo-hi.h' version. Fixes: aeb58c860dc5 ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses") Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-04thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responsesSrinivas Pandruvada
Some of the RFIM mail box command returns 64 bit values. So enhance mailbox interface to return 64 bit values and use them for RFIM commands. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 5d6fbc96bd36 ("thermal/drivers/int340x: processor_thermal: Export additional attributes") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-26Merge branches 'thermal-int340x', 'thermal-powerclamp' and 'thermal-docs'Rafael J. Wysocki
Merge Intel thermal driver updates and a thermal documentation update for v5.16. * thermal-int340x: thermal: int340x: delete bogus length check * thermal-powerclamp: thermal: intel_powerclamp: Use bitmap_zalloc/bitmap_free when applicable * thermal-docs: thermal: Move ABI documentation to Documentation/ABI
2021-10-21thermal/drivers/int340x: Improve the tcc offset saving for suspend/resumeAntoine Tenart
When the driver resumes, the tcc offset is set back to its previous value. But this only works if the value was user defined as otherwise the offset isn't saved. This asymmetric logic is harder to maintain and introduced some issues. Improve the logic by saving the tcc offset in a suspend op, so the right value is always restored after a resume. Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com> Link: https://lore.kernel.org/r/20210909085613.5577-3-atenart@kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-05thermal: int340x: delete bogus length checkDan Carpenter
This check has a signedness bug and does not work. If "length" is larger than "PAGE_SIZE" then "PAGE_SIZE - length" is not negative but instead it is a large unsigned value. Fortunately, Takashi Iwai changed this code to use scnprint() instead of snprintf() so now "length" is never larger than "PAGE_SIZE - 1" and the check can be removed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>