summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)Author
2024-08-23thermal/of: Use the .should_bind() thermal zone callbackRafael J. Wysocki
Make the thermal_of driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. This replaces the .bind() and .unbind() thermal zone callbacks which assumed the same trip points ordering in the driver and in the thermal core (that may not be true any more in the future). The .bind() callback would walk the given thermal zone's cooling maps to find all of the valid trip point combinations with the given cooling device and it would call thermal_zone_bind_cooling_device() for all of them using trip point indices reflecting the ordering of the trips in the DT. The .should_bind() callback still walks the thermal zone's cooling maps, but it can use the trip object passed to it by the thermal core to find the trip in question in the first place and then it uses the corresponding 'cooling-device' entries to look up the given cooling device. To be able to match the trip object provided by the thermal core to a specific device node, the driver sets the 'priv' field of each trip to the corresponding device node pointer during initialization. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> # rk3399-rock960 Link: https://patch.msgid.link/2236794.NgBsaNRSFp@rjwysocki.net [ rjw: Removed excessive of_node_put() ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22thermal: of: Fix OF node leak in of_thermal_zone_find() error pathsKrzysztof Kozlowski
Terminating for_each_available_child_of_node() loop requires dropping OF node reference, so bailing out on errors misses this. Solve the OF node reference leak with scoped for_each_available_child_of_node_scoped(). Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-3-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22thermal: of: Fix OF node leak in thermal_of_zone_register()Krzysztof Kozlowski
thermal_of_zone_register() calls of_thermal_zone_find() which will iterate over OF nodes with for_each_available_child_of_node() to find matching thermal zone node. When it finds such, it exits the loop and returns the node. Prematurely ending for_each_available_child_of_node() loops requires dropping OF node reference, thus success of of_thermal_zone_find() means that caller must drop the reference. Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-2-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22thermal: of: Fix OF node leak in thermal_of_trips_init() error pathKrzysztof Kozlowski
Terminating for_each_child_of_node() loop requires dropping OF node reference, so bailing out after thermal_of_populate_trip() error misses this. Solve the OF node reference leak with scoped for_each_child_of_node_scoped(). Fixes: d0c75fa2c17f ("thermal/of: Initialize trip points separately") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240814195823.437597-1-krzysztof.kozlowski@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-22thermal: imx: Use the .should_bind() thermal zone callbackRafael J. Wysocki
Make the imx_thermal driver use the .should_bind() thermal zone callback to provide the thermal core with the information on whether or not to bind the given cooling device to the given trip point in the given thermal zone. If it returns 'true', the thermal core will bind the cooling device to the trip and the corresponding unbinding will be taken care of automatically by the core on the removal of the involved thermal zone or cooling device. In the imx_thermal case, it only needs to return 'true' for the passive trip point and it will match any cooling device passed to it, in analogy with the old-style imx_bind() callback function. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/2485070.jE0xQCEvom@rjwysocki.net
2024-08-22thermal: core: Unexport thermal_bind_cdev_to_trip() and ↵Rafael J. Wysocki
thermal_unbind_cdev_from_trip() Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip() are only called locally in the thermal core now, they can be static, so change their definitions accordingly and drop their headers from the global thermal header file. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3512161.QJadu78ljV@rjwysocki.net
2024-08-22thermal: core: Introduce .should_bind() thermal zone callbackRafael J. Wysocki
The current design of the code binding cooling devices to trip points in thermal zones is convoluted and hard to follow. Namely, a driver that registers a thermal zone can provide .bind() and .unbind() operations for it, which are required to call either thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip(), respectively, or thermal_zone_bind_cooling_device() and thermal_zone_unbind_cooling_device(), respectively, for every relevant trip point and the given cooling device. Moreover, if .bind() is provided and .unbind() is not, the cleanup necessary during the removal of a thermal zone or a cooling device may not be carried out. In other words, the core relies on the thermal zone owners to do the right thing, which is error prone and far from obvious, even though all of that is not really necessary. Specifically, if the core could ask the thermal zone owner, through a special thermal zone callback, whether or not a given cooling device should be bound to a given trip point in the given thermal zone, it might as well carry out all of the binding and unbinding by itself. In particular, the unbinding can be done automatically without involving the thermal zone owner at all because all of the thermal instances associated with a thermal zone or cooling device going away must be deleted regardless. Accordingly, introduce a new thermal zone operation, .should_bind(), that can be invoked by the thermal core for a given thermal zone, trip point and cooling device combination in order to check whether or not the cooling device should be bound to the trip point at hand. It takes an additional cooling_spec argument allowing the thermal zone owner to specify the highest and lowest cooling states of the cooling device and its weight for the given trip point binding. Make the thermal core use this operation, if present, in the absence of .bind() and .unbind(). Note that .should_bind() will be called under the thermal zone lock. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/9334403.CDJkKcVGEf@rjwysocki.net
2024-08-22thermal: core: Move thermal zone locking out of bind/unbind functionsRafael J. Wysocki
Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip() acquire the thermal zone lock, the locking rules for their callers get complicated. In particular, the thermal zone lock cannot be acquired in any code path leading to one of these functions even though it might be useful to do so. To address this, remove the thermal zone locking from both these functions, add lockdep assertions for the thermal zone lock to both of them and make their callers acquire the thermal zone lock instead. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3837835.kQq0lBPeGt@rjwysocki.net
2024-08-22thermal: sysfs: Use the dev argument in instance-related show/storeRafael J. Wysocki
Two sysfs show/store functions for attributes representing thermal instances, trip_point_show() and weight_store(), retrieve the thermal zone pointer from the instance object at hand, but they may also get it from their dev argument, which is more consistent with what the other thermal sysfs functions do, so make them do so. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/1987669.PYKUYFuaPT@rjwysocki.net
2024-08-22thermal: core: Drop redundant thermal instance checksRafael J. Wysocki
Because the trip and cdev pointers are sufficient to identify a thermal instance holding them unambiguously, drop the additional thermal zone checks from two loops walking the list of thermal instances in a thermal zone. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/10527734.nUPlyArG6x@rjwysocki.net
2024-08-22thermal: core: Rearrange checks in thermal_bind_cdev_to_trip()Rafael J. Wysocki
It is not necessary to look up the thermal zone and the cooling device in the respective global lists to check whether or not they are registered. It is sufficient to check whether or not their respective list nodes are empty for this purpose. Use the above observation to simplify thermal_bind_cdev_to_trip(). In addition, eliminate an unnecessary ternary operator from it. Moreover, add lockdep_assert_held() for thermal_list_lock to it because that lock must be held by its callers when it is running. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3324214.44csPzL39Z@rjwysocki.net
2024-08-22thermal: core: Fold two functions into their respective callersRafael J. Wysocki
Fold bind_cdev() into __thermal_cooling_device_register() and bind_tz() into thermal_zone_device_register_with_trips() to reduce code bloat and make it somewhat easier to follow the code flow. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/2962184.e9J7NaK4W3@rjwysocki.net
2024-08-22thermal: Introduce a debugfs-based testing facilityRafael J. Wysocki
Introduce a facility allowing the thermal core functionality to be exercised in a controlled way in order to verify its behavior, without affecting its regular users noticeably. It is based on the idea of preparing thermal zone templates along with their trip points by writing to files in debugfs. When ready, those templates can be used for registering test thermal zones with the thermal core. The temperature of a test thermal zone created this way can be adjusted via debugfs, which also triggers a __thermal_zone_device_update() call for it. By manipulating the temperature of a test thermal zone, one can check if the thermal core reacts to the changes of it as expected. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/6065927.lOV4Wx5bFT@rjwysocki.net [ rjw: Fixed ordering of kcalloc() arguments ] [ rjw: Fixed debugfs_create_dir() return value checks ] [ rjw: Fixed two kerneldoc comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-21thermal/debugfs: Fix the NULL vs IS_ERR() confusion in debugfs_create_dir()Yang Ruibin
The debugfs_create_dir() return value is never NULL, it is either a valid pointer or an error one. Use IS_ERR() to check it. Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") Signed-off-by: Yang Ruibin <11162571@vivo.com> Link: https://patch.msgid.link/20240821075934.12145-1-11162571@vivo.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-19thermal/core: Compute low and high boundaries in thermal_zone_device_update()Daniel Lezcano
In order to set the scene for the thresholds support which have to manipulate the low and high temperature boundaries for the interrupt support, we must pass the low and high values to the incoming thresholds routine. The variables are set from the thermal_zone_set_trips() where the function loops the thermal trips to figure out the next and the previous temperatures to set the interrupt to be triggered when they are crossed. These variables will be needed by the function in charge of handling the thresholds in the incoming changes but they are local to the aforementioned function thermal_zone_set_trips(). Move the low and high boundaries computation out of the function in thermal_zone_device_update() so they are accessible from there. The positive side effect is they are computed in the same loop as handle_thermal_trip(), so we remove one loop. Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20240816081241.1925221-2-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-19Merge back thermal core material for 6.12.Rafael J. Wysocki
2024-08-16thermal: gov_bang_bang: Use governor_data to reduce overheadRafael J. Wysocki
After running once, the for_each_trip_desc() loop in bang_bang_manage() is pure needless overhead because it is not going to make any changes unless a new cooling device has been bound to one of the trips in the thermal zone or the system is resuming from sleep. For this reason, make bang_bang_manage() set governor_data for the thermal zone and check it upfront to decide whether or not it needs to do anything. However, governor_data needs to be reset in some cases to let bang_bang_manage() know that it should walk the trips again, so add an .update_tz() callback to the governor and make the core additionally invoke it during system resume. To avoid affecting the other users of that callback unnecessarily, add a special notification reason for system resume, THERMAL_TZ_RESUME, and also pass it to __thermal_zone_device_update() called during system resume for consistency. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/2285575.iZASKD2KPV@rjwysocki.net
2024-08-16thermal: gov_bang_bang: Add .manage() callbackRafael J. Wysocki
After recent changes, the Bang-bang governor may not adjust the initial configuration of cooling devices to the actual situation. Namely, if a cooling device bound to a certain trip point starts in the "on" state and the thermal zone temperature is below the threshold of that trip point, the trip point may never be crossed on the way up in which case the state of the cooling device will never be adjusted because the thermal core will never invoke the governor's .trip_crossed() callback. [Note that there is no issue if the zone temperature is at the trip threshold or above it to start with because .trip_crossed() will be invoked then to indicate the start of thermal mitigation for the given trip.] To address this, add a .manage() callback to the Bang-bang governor and use it to ensure that all of the thermal instances managed by the governor have been initialized properly and the states of all of the cooling devices involved have been adjusted to the current zone temperature as appropriate. Fixes: 530c932bdf75 ("thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()") Link: https://lore.kernel.org/linux-pm/1bfbbae5-42b0-4c7d-9544-e98855715294@piie.net/ Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/8419356.T7Z3S40VBb@rjwysocki.net
2024-08-16thermal: gov_bang_bang: Split bang_bang_control()Rafael J. Wysocki
Move the setting of the thermal instance target state from bang_bang_control() into a separate function that will be also called in a different place going forward. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/3313587.aeNJFYEL58@rjwysocki.net
2024-08-16thermal: gov_bang_bang: Call __thermal_cdev_update() directlyRafael J. Wysocki
Instead of clearing the "updated" flag for each cooling device affected by the trip point crossing in bang_bang_control() and walking all thermal instances to run thermal_cdev_update() for all of the affected cooling devices, call __thermal_cdev_update() directly for each of them. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/13583081.uLZWGnKmhe@rjwysocki.net
2024-08-13thermal: sysfs: Refine the handling of trip hysteresis changesRafael J. Wysocki
Change trip_point_hyst_store() and replace thermal_zone_trip_updated() with thermal_zone_set_trip_hyst() to follow the trip_point_temp_store() code pattern for more consistency. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/5508466.Sb9uPGUboI@rjwysocki.net
2024-08-13thermal: sysfs: Get to trips via attribute pointersRafael J. Wysocki
The _store() and _show() functions for sysfs attributes corresponding to trip point parameters (type, temperature and hysteresis) read the trip ID from the attribute name and then use the trip ID as the index in the given thermal zone's trips table to get to the trip object they want. Instead of doing this, make them use the attribute pointer they get as the second argument to get to the trip object embedded in the same struct thermal_trip_desc as the struct device_attribute pointed to by it, which is much more straightforward and less overhead. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/114841552.nniJfEyVGO@rjwysocki.net
2024-08-13thermal: core: Store trip sysfs attributes in thermal_trip_descRafael J. Wysocki
Instead of allocating memory for trip point sysfs attributes separately, store them in struct thermal_trip_desc for each trip individually which allows a few extra memory allocations to be avoided. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/46571375.fMDQidcC6G@rjwysocki.net
2024-08-02thermal: trip: Drop thermal_zone_get_trip()Rafael J. Wysocki
There are no more callers of thermal_zone_get_trip() in the tree, so drop it. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2220301.Mh6RI2rZIc@rjwysocki.net
2024-08-02thermal: trip: Get rid of thermal_zone_get_num_trips()Rafael J. Wysocki
The only existing caller of thermal_zone_get_num_trips(), which is rcar_gen3_thermal_probe(), uses this function to put the number of trip points into a kernel log message, but this information is also available from the thermal sysfs interface. For this reason, remove the thermal_zone_get_num_trips() call from rcar_gen3_thermal_probe() and drop the former altogether. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2636988.Lt9SDvczpP@rjwysocki.net
2024-08-02thermal: helpers: Drop get_thermal_instance()Rafael J. Wysocki
There are no more users of get_thermal_instance(), so drop it. While at it, replace get_instance() returning a pointer to struct thermal_instance with thermal_instance_present() returning a bool which is more straightforward. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2014591.usQuhbGJ8B@rjwysocki.net [ rjw: Dropped get_thermal_instance() documentation ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-02thermal: tegra: Use thermal_zone_for_each_trip() for walking trip pointsRafael J. Wysocki
It is generally inefficient to iterate over trip indices and call thermal_zone_get_trip() every time to get the struct thermal_trip corresponding to the given trip index, so modify the Tegra thermal drivers to use thermal_zone_for_each_trip() for walking trips. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/1819430.VLH7GnMWUR@rjwysocki.net [ rjw: Dropped an unused local variable ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-02thermal: tegra: Introduce struct trip_temps for critical and hot tripsRafael J. Wysocki
Introduce a helper structure, struct trip_temps, for storing the temperatures of the critical and hot trip points. This helps to make the code in tegra_tsensor_get_hw_channel_trips() somewhat cleaner and will be useful subsequently in eliminating iteration over trip indices from the driver. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/9333090.CDJkKcVGEf@rjwysocki.net
2024-08-02thermal: qcom: Use thermal_zone_get_crit_temp() in qpnp_tm_init()Rafael J. Wysocki
Modify qpnp_tm_init() to use thermal_zone_get_crit_temp() to get the critical trip temperature instead of iterating over trip indices and using thermal_zone_get_trip() to get a struct thermal_trip pointer from a trip index until it finds the critical one. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/7712228.EvYhyI6sBW@rjwysocki.net
2024-08-02thermal: hisi: Use thermal_zone_for_each_trip() in ↵Rafael J. Wysocki
hisi_thermal_register_sensor() Modify hisi_thermal_register_sensor() to use thermal_zone_for_each_trip() for walking trip points instead of iterating over trip indices and using thermal_zone_get_trip() to get a struct thermal_trip pointer from a trip index. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/1994088.PYKUYFuaPT@rjwysocki.net [ rjw: Dropped two unused local variables ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-02thermal: broadcom: Use thermal_zone_get_crit_temp() in bcm2835_thermal_probe()Rafael J. Wysocki
Modify the bcm2835 thermal driver to use thermal_zone_get_crit_temp() in bcm2835_thermal_probe() instead of relying on the assumption that the critical trip index will always be 0. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/3322893.aeNJFYEL58@rjwysocki.net
2024-07-31Merge branch 'thermal-intel'Rafael J. Wysocki
Merge fixes for the int340x thermal driver handling of MSI IRQs: - Fix MSI error path cleanup in int340x, allow it to work with a subset of thermal MSI IRQs if some of them are not working and make it free all MSI IRQs on module exit (Srinivas Pandruvada). * thermal-intel: thermal: intel: int340x: Free MSI IRQ vectors on module exit thermal: intel: int340x: Allow limited thermal MSI support thermal: intel: int340x: Fix kernel warning during MSI cleanup
2024-07-31thermal: trip: Avoid skipping trips in thermal_zone_set_trips()Rafael J. Wysocki
Say there are 3 trip points A, B, C sorted in ascending temperature order with no hysteresis. If the zone temerature is exactly equal to B, thermal_zone_set_trips() will set the boundaries to A and C and the hardware will not catch any crossing of B (either way) until either A or C is crossed and the boundaries are changed. To avoid that, use non-strict inequalities when comparing the trip threshold to the zone temperature in thermal_zone_set_trips(). In the example above, it will cause both boundaries to be set to B, which is desirable because an interrupt will trigger when the zone temperature becomes different from B regardless of which way it goes. That will allow a new interval to be set depending on the direction of the zone temperature change. Fixes: 893bae92237d ("thermal: trip: Make thermal_zone_set_trips() use trip thresholds") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/12509184.O9o76ZdvQC@rjwysocki.net
2024-07-30thermal: intel: int340x: Free MSI IRQ vectors on module exitSrinivas Pandruvada
On module exit call proc_thermal_free_msi() to free vectors allocated by pci_alloc_irq_vectors(). Fixes: 7a9a8c5faf41 ("thermal: intel: int340x: Support MSI interrupt for Lunar Lake") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/20240723140228.865919-4-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-30thermal: intel: int340x: Allow limited thermal MSI supportSrinivas Pandruvada
On some Lunar Lake pre-production systems, not all the MSI thermal vectors are valid. In that case instead of failing module load, continue with partial thermal interrupt support. pci_alloc_irq_vectors() can return less than expected maximum vectors. In that case call devm_request_threaded_irq() only for current maximum vectors. Fixes: 7a9a8c5faf41 ("thermal: intel: int340x: Support MSI interrupt for Lunar Lake") Reported-by: Yijun Shen <Yijun.Shen@dell.com> Tested-by: Yijun Shen <Yijun.Shen@dell.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/20240723140228.865919-3-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-30thermal: intel: int340x: Fix kernel warning during MSI cleanupSrinivas Pandruvada
On some pre-production Lunar Lake systems, there is a kernel warning: remove_proc_entry: removing non-empty directory 'irq/172' WARNING: CPU: 0 PID: 501 at fs/proc/generic.c:717 remove_proc_entry+0x1b4/0x1e0 ... ... remove_proc_entry+0x1b4/0x1e0 report_bug+0x182/0x1b0 handle_bug+0x51/0xa0 exc_invalid_op+0x18/0x80 asm_exc_invalid_op+0x1b/0x20 remove_proc_entry+0x1b4/0x1e0 remove_proc_entry+0x1b4/0x1e0 unregister_irq_proc+0xf2/0x120 free_desc+0x41/0xe0 irq_domain_free_irqs+0x138/0x1c0 irq_free_descs+0x52/0x80 irq_domain_free_irqs+0x151/0x1c0 msi_domain_free_locked.part.0+0x17e/0x1c0 msi_domain_free_irqs_all_locked+0x74/0xc0 pci_msi_teardown_msi_irqs+0x50/0x60 pci_free_msi_irqs+0x12/0x40 pci_free_irq_vectors+0x58/0x70 On these systems, not all the MSI thermal vectors are valid. This causes devm_request_threaded_irq() to fail for some vectors. As part of the clean up on this error, pci_free_irq_vectors() is called without calling devm_free_irq(). This causes the above warning. Add a function proc_thermal_free_msi() to call devm_free_irq() for all successfully registered IRQ handlers, then call pci_free_irq_vectors(). Call this function for MSI cleanup. Fixes: 7a9a8c5faf41 ("thermal: intel: int340x: Support MSI interrupt for Lunar Lake") Reported-by: Yijun Shen <Yijun.shen@dell.com> Tested-by: Yijun Shen <Yijun.shen@dell.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Link: https://patch.msgid.link/20240723140228.865919-2-srinivas.pandruvada@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-24thermal: core: Back off when polling thermal zones on errorsRafael J. Wysocki
Commit a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") introduced a polling mechanism by which the thermal core attampts to get a valid temperature value for thermal zones where the .get_temp() callback returns errors to start with (for example, due to initialization ordering woes). However, this polling is carried out periodically ad infinitum and every iteration of it causes a message to be printed to the kernel log which means a lot of log noise on systems where there are thermal zones that never get ready for some reason. It is also not really useful to continuously poll thermal zones that never respond. To address this, modify the thermal core to increase the delay between consecutive thermal zone temperature checks after every check that fails until it reaches a certain maximum value. At that point, the thermal zone in question will be disabled, but user space will be able to reenable it if it believes that the failure is transient. Also change the code to print messages regarding failed temperature checks to the kernel log only twice, once when the thermal zone's .get_temp() callback returns an error for the first time and once when disabling the given thermal zone. In addition, a dev_crit() message will be printed at that point if the given thermal zone contains a critical trip point to notify the system operator about the situation. Fixes: a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") Link: https://lore.kernel.org/linux-acpi/CAGnHSE=RyPK++UG0-wAtVKgeJxe0uzFYgLxm+RUOKKoQquW=Ow@mail.gmail.com/ Reported-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2962033.e9J7NaK4W3@rjwysocki.net
2024-07-23thermal: trip: Split thermal_zone_device_set_mode()Rafael J. Wysocki
Pull a wrapper around thermal zone .change_mode() callback out of thermal_zone_device_set_mode() because it will be used elsewhere subsequently. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2206793.irdbgypaU6@rjwysocki.net
2024-07-18thermal: core: Allow thermal zones to tell the core to ignore themRafael J. Wysocki
The iwlwifi wireless driver registers a thermal zone that is only needed when the network interface handled by it is up and it wants that thermal zone to be effectively ignored by the core otherwise. Before commit a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") that could be achieved by returning an error code from the thermal zone's .get_temp() callback because the core did not really handle errors returned by it almost at all. However, commit a8a261774466 made the core attempt to recover from the situation in which the temperature of a thermal zone cannot be determined due to errors returned by its .get_temp() and is always invalid from the core's perspective. That was done because there are thermal zones in which .get_temp() returns errors to start with due to some difficulties related to the initialization ordering, but then it will start to produce valid temperature values at one point. Unfortunately, the simple approach taken by commit a8a261774466, which is to poll the thermal zone periodically until its .get_temp() callback starts to return valid temperature values, is at odds with the special thermal zone in iwlwifi in which .get_temp() may always return an error because its network interface may always be down. If that happens, every attempt to invoke the thermal zone's .get_temp() callback resulting in an error causes the thermal core to print a dev_warn() message to the kernel log which is super-noisy. To address this problem, make the core handle the case in which .get_temp() returns 0, but the temperature value returned by it is not actually valid, in a special way. Namely, make the core completely ignore the invalid temperature value coming from .get_temp() in that case, which requires folding in update_temperature() into its caller and a few related changes. On the iwlwifi side, modify iwl_mvm_tzone_get_temp() to return 0 and put THERMAL_TEMP_INVALID into the temperature return memory location instead of returning an error when the firmware is not running or it is not of the right type. Also, to clearly separate the handling of invalid temperature values from the thermal zone initialization, introduce a special THERMAL_TEMP_INIT value specifically for the latter purpose. Fixes: a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") Closes: https://lore.kernel.org/linux-pm/20240715044527.GA1544@sol.localdomain/ Reported-by: Eric Biggers <ebiggers@kernel.org> Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201761 Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/4950004.31r3eYUQgx@rjwysocki.net [ rjw: Rebased on top of the current mainline ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-15Merge branch 'thermal-intel'Rafael J. Wysocki
Merge updates of Intel thermal drivers for 6.11-rc1: - Switch Intel thermal drivers to new Intel CPU model defines (Tony Luck). - Clean up the int3400 and int3403 drivers (Erick Archer and David Alan Gilbert). - Improve intel_pch_thermal kernel log messages printed during suspend to idle (Zhang Rui). - Make the intel_tcc_cooling driver use a model-specific bitmask for TCC offset (Ricardo Neri). - Add DLVR and MSI interrupt support for the Lunar Lake platform to the int340x thermal driver (Srinivas Pandruvada). - Enable workload type hints (WLT) support and power floor interrupt support for the Lunar Lake platform in int340x ((Srinivas Pandruvada). - Make the HFI thermal driver use package scope for HFI instances as per the Intel SDM (Zhang Rui). * thermal-intel: thermal: intel: hfi: Give HFI instances package scope thermal: intel: int340x: Enable WLT and power floor support for Lunar Lake thermal: intel: int340x: Support MSI interrupt for Lunar Lake thermal: intel: int340x: Remove unnecessary calls to free irq thermal: intel: int340x: Add DLVR support for Lunar Lake thermal: intel: int340x: Capability to map user space to firmware values thermal: intel: int340x: Cleanup of DLVR sysfs on driver remove thermal: intel: intel_tcc_cooling: Use a model-specific bitmask for TCC offset thermal: intel: intel_tcc: Add model checks for temperature registers thermal: intel: intel_pch: Improve cooling log thermal: int3403: remove unused struct 'int3403_performance_state' thermal: int3400: Use sizeof(*pointer) instead of sizeof(type) thermal: intel: intel_soc_dts_thermal: Switch to new Intel CPU model defines thermal: intel: intel_tcc_cooling: Switch to new Intel CPU model defines
2024-07-15Merge branch 'thermal-core'Rafael J. Wysocki
Merge updates related to the thermal core for 6.11-rc1: - Redesign the .set_trip_temp() thermal zone callback to take a trip pointer instead of a trip ID and update its users (Rafael Wysocki). - Avoid using invalid combinations of polling_delay and passive_delay thermal zone parameters (Rafael Wysocki). - Update a cooling device registration function to take a const argument (Krzysztof Kozlowski). - Make the uniphier thermal driver use thermal_zone_for_each_trip() for walking trip points (Rafael Wysocki). * thermal-core: thermal: core: Add sanity checks for polling_delay and passive_delay thermal: trip: Fold __thermal_zone_get_trip() into its caller thermal: trip: Pass trip pointer to .set_trip_temp() thermal zone callback thermal: imx: Drop critical trip check from imx_set_trip_temp() thermal: trip: Add conversion macros for thermal trip priv field thermal: helpers: Introduce thermal_trip_is_bound_to_cdev() thermal: core: Change passive_delay and polling_delay data type thermal: core: constify 'type' in devm_thermal_of_cooling_device_register() thermal: uniphier: Use thermal_zone_for_each_trip() for walking trip points
2024-07-15thermal/drivers/sti: Cleanup code related to stih416Raphael Gallais-Pou
"st,stih416-mpe-thermal" compatible seems to appear nowhere in the device-tree nor in the documentation. Remove compatible and related code. Signed-off-by: Raphael Gallais-Pou <rgallaispou@gmail.com> Link: https://lore.kernel.org/r/20240708161840.102004-1-rgallaispou@gmail.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/generic-adc: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-12-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/generic-adc: Simplify probe() with local dev variableKrzysztof Kozlowski
Simplify the probe() function by using local 'dev' instead of &pdev->dev. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-11-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/qcom-tsens: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-10-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/qcom-spmi-adc-tm5: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-9-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/imx: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-8-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/imx: Simplify probe() with local dev variableKrzysztof Kozlowski
Simplify the probe() function by using local 'dev' instead of &pdev->dev. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-7-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/hisi: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-6-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15thermal/drivers/exynos: Simplify with dev_err_probe()Krzysztof Kozlowski
Error handling in probe() can be a bit simpler with dev_err_probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-5-241644e2b6e0@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>