summaryrefslogtreecommitdiff
path: root/drivers/thermal/thermal_helpers.c
AgeCommit message (Collapse)Author
2024-02-23thermal: core: Store zone ops in struct thermal_zone_deviceRafael J. Wysocki
The current code requires thermal zone creators to pass pointers to writable ops structures to thermal_zone_device_register_with_trips() which needs to modify the target struct thermal_zone_device_ops object if the "critical" operation in it is NULL. Moreover, the callers of thermal_zone_device_register_with_trips() are required to hold on to the struct thermal_zone_device_ops object passed to it until the given thermal zone is unregistered. Both of these requirements are quite inconvenient, so modify struct thermal_zone_device to contain struct thermal_zone_device_ops as field and make thermal_zone_device_register_with_trips() copy the contents of the struct thermal_zone_device_ops passed to it via a pointer (which can be const now) to that field. Also adjust the code using thermal zone ops accordingly and modify thermal_of_zone_register() to use a local ops variable during thermal zone registration so ops do not need to be freed in thermal_of_zone_unregister() any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal: helpers: Rearrange thermal_cdev_set_cur_state()Rafael J. Wysocki
Change the code layout in thermal_cdev_set_cur_state() so it returns early on errors which is more consistent with what happens elsewhere. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal: netlink: Rework notify API for cooling devicesRafael J. Wysocki
In analogy with some previous thermal netlink API changes, redefine thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and thermal_notify_cdev_delete() to take a const cdev pointer as their first argument and let them extract the requisite information from there by themselves. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12thermal/debugfs: Add thermal cooling device debugfs informationDaniel Lezcano
The thermal framework does not have any debug information except a sysfs stat which is a bit controversial. This one allocates big chunks of memory for every cooling devices with a high number of states and could represent on some systems in production several megabytes of memory for just a portion of it. As the sysfs is limited to a page size, the output is not exploitable with large data array and gets truncated. The patch provides the same information than sysfs except the transitions are dynamically allocated, thus they won't show more events than the ones which actually occurred. There is no longer a size limitation and it opens the field for more debugging information where the debugfs is designed for, not sysfs. The thermal debugfs directory structure tries to stay consistent with the sysfs one but in a very simplified way: thermal/ -- cooling_devices |-- 0 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 1 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 2 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table |-- 3 | |-- clear | |-- time_in_state_ms | |-- total_trans | `-- trans_table `-- 4 |-- clear |-- time_in_state_ms |-- total_trans `-- trans_table The content of the files in the cooling devices directory is the same as the sysfs one except for the trans_table which has the following format: Transition Hits 1->0 246 0->1 246 2->1 632 1->2 632 3->2 98 2->3 98 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: White space fixups, rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-13thermal: helpers: Use for_each_trip() in __thermal_zone_get_temp()Rafael J. Wysocki
Make __thermal_zone_get_temp() use for_each_trip() instead of an open- coded loop over trip indices. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-12-12thermal: Drop redundant and confusing device_is_registered() checksRafael J. Wysocki
Multiple places in the thermal subsystem (most importantly, sysfs attribute callback functions) check if the given thermal zone device is still registered in order to return early in case the device_del() in thermal_zone_device_unregister() has run already. However, after thermal_zone_device_unregister() has been made wait for all of the zone-related activity to complete before returning, it is not necessary to do that any more, because all of the code holding a reference to the thermal zone device object will be waited for even if it does not do anything special to enforce this. Accordingly, drop all of the device_is_registered() checks that are now redundant and get rid of the zone locking that is not necessary any more after dropping them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-20thermal: core: Pass trip pointer to governor throttle callbackRafael J. Wysocki
Modify the governor .throttle() callback definition so that it takes a trip pointer instead of a trip index as its second argument, adjust the governors accordingly and update the core code invoking .throttle(). This causes the governors to become independent of the representation of the list of trips in the thermal zone structure. This change is not expected to alter the general functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-28thermal: core: Store trip pointer in struct thermal_instanceRafael J. Wysocki
Replace the integer trip number stored in struct thermal_instance with a pointer to the relevant trip and adjust the code using the structure in question accordingly. The main reason for making this change is to allow the trip point to cooling device binding code more straightforward, as illustrated by subsequent modifications of the ACPI thermal driver, but it also helps to clarify the overall design and allows the governor code overhead to be reduced (through subsequent modifications). The only case in which it adds complexity is trip_point_show() that needs to walk the trips[] table to find the index of the given trip point, but this is not a critical path and the interface that trip_point_show() belongs to is problematic anyway (for instance, it doesn't cover the case when the same cooling devices is associated with multiple trip points). This is a preliminary change and the affected code will be refined by a series of subsequent modifications of thermal governors, the core and the ACPI thermal driver. The general functionality is not expected to be affected by this change. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-08-29thermal: core: Rework .get_trend() thermal zone callbackRafael J. Wysocki
Passing a struct thermal_trip pointer instead of a trip index to the .get_trend() thermal zone callback allows one of its 2 implementations, the thermal_get_trend() function in the ACPI thermal driver, to be simplified quite a bit, and the other implementation of it in the ti-soc-thermal driver does not even use the relevant callback argument. For this reason, change the .get_trend() thermal zone callback definition and adjust the related code accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-01thermal/core: Relocate the traces definition in thermal directoryDaniel Lezcano
The traces are exported but only local to the thermal core code. On the other side, the traces take the thermal zone device structure as argument, thus they have to rely on the exported thermal.h header file. As we want to move the structure to the private thermal core header, first we have to relocate those traces to the same place as many drivers do. Cc: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20230307133735.90772-2-daniel.lezcano@linaro.org
2023-03-03thermal/core: Show a debug message when get_temp() failsDaniel Lezcano
The different thermal drivers are showing an error in case the get_temp() fails. Actually no traces should be displayed in the backend ops but in the call site of this ops. Furthermore, the message is often a dev_dbg message where the tz->device is used, thus using the internal of the structure from the driver. Show a debug message if the thermal_zone_get_temp() fails to read the sensor temperature, so code showing the message is factored out and the tz->device accesss is in the scope of the thermal core framework. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-25thermal/core: Move the thermal trip code to a dedicated fileDaniel Lezcano
The thermal_core.c files contains a lot of functions handling different thermal components like the governors, the trip points, the cooling device, the OF cooling device, etc ... This organization does not help to migrate to a more sane code where there is a better self-encapsulation as all the components' internals can be directly accessed from a single file. For the sake of clarity, let's move the thermal trip points code in a dedicated thermal_trip.c file and add a function to browse all the trip points like we do with the thermal zones, the govenors and the cooling devices. The same can be done for the cooling devices and the governor code but that will come later as the current work in the thermal framework is to fix the trip point handling and use a generic trip point structure. No functional changes intended. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-06thermal/core: Add a generic thermal_zone_get_trip() functionDaniel Lezcano
The thermal_zone_device_ops structure defines a set of ops family, get_trip_temp(), get_trip_hyst(), get_trip_type(). Each of them is returning a property of a trip point. The result is the code is calling the ops everywhere to get a trip point which is supposed to be defined in the backend driver. It is a non-sense as a thermal trip can be generic and used by the backend driver to declare its trip points. Part of the thermal framework has been changed and all the OF thermal drivers are using the same definition for the trip point and use a thermal zone registration variant to pass those trip points which are part of the thermal zone device structure. Consequently, we can use a generic function to get the trip points when they are stored in the thermal zone device structure. This approach can be generalized to all the drivers and we can get rid of the ops->get_trip_*. That will result to a much more simpler code and make possible to rework how the thermal trip are handled in the thermal core framework as discussed previously. This change adds a function thermal_zone_get_trip() where we get the thermal trip point structure which contains all the properties (type, temp, hyst) instead of doing multiple calls to ops->get_trip_*. That opens the door for trip point extension with more attributes. For instance, replacing the trip points disabled bitmask with a 'disabled' field in the structure. Here we replace all the calls to ops->get_trip_* in the thermal core code with a call to the thermal_zone_get_trip() function. The thermal zone ops defines a callback to retrieve the critical temperature. As the trip handling is being reworked, all the trip points will be the same whatever the driver and consequently finding the critical trip temperature will be just a loop to search for a critical trip point type. Provide such a generic function, so we encapsulate the ops get_crit_temp() which can be removed when all the backend drivers are using the generic trip points handling. While at it, add the thermal_zone_get_num_trips() to encapsulate the code more and reduce the grip with the thermal framework internals. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20221003092602.1323944-2-daniel.lezcano@linaro.org
2022-11-14thermal/core: Remove thermal_zone_set_trips()Guenter Roeck
Since no callers of thermal_zone_set_trips() are left, remove the function. Document __thermal_zone_set_trips() instead. Explicitly state that the thermal zone lock must be held when calling the function, and that the pointer to the thermal zone must be valid. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14thermal/core: Move parameter validation from __thermal_zone_get_temp to ↵Guenter Roeck
thermal_zone_get_temp All callers of __thermal_zone_get_temp() already validated the thermal zone parameters. Move validation to thermal_zone_get_temp() where it is actually needed. Also add kernel documentation for __thermal_zone_get_temp(), listing the requirement that the function must be called with validated parameters and with thermal device mutex held. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14thermal/core: Ensure that thermal device is registered in thermal_zone_get_tempGuenter Roeck
Calls to thermal_zone_get_temp() are not protected against thermal zone device removal. As result, it is possible that the thermal zone operations callbacks are no longer valid when thermal_zone_get_temp() is called. This may result in crashes such as BUG: unable to handle page fault for address: ffffffffc04ef420 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 5d60e067 P4D 5d60e067 PUD 5d610067 PMD 110197067 PTE 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 3209 Comm: cat Tainted: G W 5.10.136-19389-g615abc6eb807 #1 02df41ac0b12f3a64f4b34245188d8875bb3bce1 Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.92.0 11/27/2018 RIP: 0010:thermal_zone_get_temp+0x26/0x73 Code: 89 c3 eb d3 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 85 ff 74 50 48 89 fb 48 81 ff 00 f0 ff ff 77 44 48 8b 83 98 03 00 00 <48> 83 78 10 00 74 36 49 89 f6 4c 8d bb d8 03 00 00 4c 89 ff e8 9f RSP: 0018:ffffb3758138fd38 EFLAGS: 00010287 RAX: ffffffffc04ef410 RBX: ffff98f14d7fb000 RCX: 0000000000000000 RDX: ffff98f17cf90000 RSI: ffffb3758138fd64 RDI: ffff98f14d7fb000 RBP: ffffb3758138fd50 R08: 0000000000001000 R09: ffff98f17cf90000 R10: 0000000000000000 R11: ffffffff8dacad28 R12: 0000000000001000 R13: ffff98f1793a7d80 R14: ffff98f143231708 R15: ffff98f14d7fb018 FS: 00007ec166097800(0000) GS:ffff98f1bbd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffc04ef420 CR3: 000000010ee9a000 CR4: 00000000003506e0 Call Trace: temp_show+0x31/0x68 dev_attr_show+0x1d/0x4f sysfs_kf_seq_show+0x92/0x107 seq_read_iter+0xf5/0x3f2 vfs_read+0x205/0x379 __x64_sys_read+0x7c/0xe2 do_syscall_64+0x43/0x55 entry_SYSCALL_64_after_hwframe+0x61/0xc6 if a thermal device is removed while accesses to its device attributes are ongoing. The problem is exposed by code in iwl_op_mode_mvm_start(), which registers a thermal zone device only to unregister it shortly afterwards if an unrelated failure is encountered while accessing the hardware. Check if the thermal zone device is registered after acquiring the thermal zone device mutex to ensure this does not happen. The code was tested by triggering the failure in iwl_op_mode_mvm_start() on purpose. Without this patch, the kernel crashes reliably. The crash is no longer observed after applying this and the preceding patches. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-17thermal/core: Move the mutex inside the thermal_zone_device_update() functionDaniel Lezcano
All the different calls inside the thermal_zone_device_update() function take the mutex. The previous changes move the mutex out of the different functions, like the throttling ops. Now that the mutexes are all at the same level in the call stack for the thermal_zone_device_update() function, they can be moved inside this one. That has the benefit of: 1. Simplify the code by not having a plethora of places where the lock is taken 2. Probably closes more race windows because releasing the lock from one line to another can give the opportunity to the thermal zone to change its state in the meantime. For example, the thermal zone can be enabled right after checking it is disabled. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20220805153834.2510142-5-daniel.lezcano@linaro.org
2022-07-28thermal/core: Rename 'trips' to 'num_trips'Daniel Lezcano
In order to use thermal trips defined in the thermal structure, rename the 'trips' field to 'num_trips' to have the 'trips' field containing the thermal trip points. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Link: https://lore.kernel.org/r/20220722200007.1839356-8-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28thermal/core: Move thermal_set_delay_jiffies to staticDaniel Lezcano
The function 'thermal_set_delay_jiffies' is only used in thermal_core.c but it is defined and implemented in a separate file. Move the function to thermal_core.c and make it static. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20220722200007.1839356-7-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28thermal/core: Remove unneeded EXPORT_SYMBOLSDaniel Lezcano
Different functions are exporting the symbols but are actually only used by the thermal framework internals. Remove these EXPORT_SYMBOLS. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Link: https://lore.kernel.org/r/20220722200007.1839356-6-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-04-22thermal/core: Create a helper __thermal_cdev_update() without a lockLukasz Luba
There is a need to have a helper function which updates cooling device state from the governors code. With this change governor can use lock and unlock while calling helper function. This avoid unnecessary second time lock/unlock which was in previous solution present in governor implementation. This new helper function must be called with mutex 'cdev->lock' hold. The changed been discussed and part of code presented in thread: https://lore.kernel.org/linux-pm/20210419084536.25000-1-lukasz.luba@arm.com/ Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210422114308.29684-2-lukasz.luba@arm.com
2021-01-19thermal/core: Precompute the delays from msecs to jiffiesDaniel Lezcano
The delays are stored in ms units and when the polling function is called this delay is converted into jiffies at each call. Instead of doing the conversion again and again, compute the jiffies at init time and use the value directly when setting the polling. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
2020-07-07thermal: core: Add notifications call in the frameworkDaniel Lezcano
The generic netlink protocol is implemented but the different notification functions are not yet connected to the core code. These changes add the notification calls in the different corresponding places. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20200706105538.2159-4-daniel.lezcano@linaro.org
2020-05-22thermal/drivers/thermal_helpers: Include export.hAmit Kucheria
It is preferable to include export.h when you are using EXPORT_SYMBOL family of macros. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/fd3443f00dbba6ca90f35726c7451ae52145d2d4.1589199124.git.amit.kucheria@linaro.org
2020-05-22thermal/drivers/thermal_helpers: Sort headers alphabeticallyAmit Kucheria
Sort headers to make it easier to read and find duplicate headers. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/133db154796f354e6c51e6310095f679e1f45441.1589199124.git.amit.kucheria@linaro.org
2020-04-14thermal: core: Make thermal_zone_set_trips privateDaniel Lezcano
The function thermal_zone_set_trips() is used by the thermal core code in order to update the next trip points, there are no other users. Move the function definition in the thermal_core.h, remove the EXPORT_SYMBOL_GPL and document the function. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20200331165449.30355-1-daniel.lezcano@linaro.org
2018-05-30drivers: thermal: Update license to SPDX formatLina Iyer
Update licences format for core thermal files. Signed-off-by: Lina Iyer <ilina@codeaurora.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-04-02thermal: Add cooling device's statistics in sysfsViresh Kumar
This extends the sysfs interface for thermal cooling devices and exposes some pretty useful statistics. These statistics have proven to be quite useful specially while doing benchmarks related to the task scheduler, where we want to make sure that nothing has disrupted the test, specially the cooling device which may have put constraints on the CPUs. The information exposed here tells us to what extent the CPUs were constrained by the thermal framework. The write-only "reset" file is used to reset the statistics. The read-only "time_in_state_ms" file shows the time (in msec) spent by the device in the respective cooling states, and it prints one line per cooling state. The read-only "total_trans" file shows single positive integer value showing the total number of cooling state transitions the device has gone through since the time the cooling device is registered or the time when statistics were reset last. The read-only "trans_table" file shows a two dimensional matrix, where an entry <i,j> (row i, column j) represents the number of transitions from State_i to State_j. This is how the directory structure looks like for a single cooling device: $ ls -R /sys/class/thermal/cooling_device0/ /sys/class/thermal/cooling_device0/: cur_state max_state power stats subsystem type uevent /sys/class/thermal/cooling_device0/power: autosuspend_delay_ms runtime_active_time runtime_suspended_time control runtime_status /sys/class/thermal/cooling_device0/stats: reset time_in_state_ms total_trans trans_table This is tested on ARM 64-bit Hisilicon hikey620 board running Ubuntu and ARM 64-bit Hisilicon hikey960 board running Android. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: move slop and offset helpers to thermal_helpers.cEduardo Valentin
Reorganize code to reflect better placement. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-23thermal: core: introduce thermal_helpers.cEduardo Valentin
Here we have a simple code organization. This patch moves functions that do not need to handle thermal core internal data structure to thermal_helpers.c file. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>