Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Linux] sensors_temperatures wrong when using thermal_zones #2439

Open
siena-sam opened this issue Sep 4, 2024 · 2 comments
Open

[Linux] sensors_temperatures wrong when using thermal_zones #2439

siena-sam opened this issue Sep 4, 2024 · 2 comments

Comments

@siena-sam
Copy link

Summary

  • OS: Ubuntu Core 20
  • Architecture: ARM
  • Psutil version: 5.9.8 confirmed, but from code also 6.0.0 and likely earlier versions as well
  • Python version: 3.8.16
  • Type: core

Description

Quick Description

On systems without /sys/class/hwmon/hwmon*/temp*_* to determine the temperature psutil will fallback to /sys/class/thermal/thermal_zone*. In the code that gets the temperature from the thermal_zone files, however, it fails to reset the high and critical values it is tracking for each trip point such that the value can be divided by 1000 multiple times resulting in an incorrect value. Somehow, the final value returned and the possible value generated in each iteration of the loop need to be tracked separately so that future iterations do not alter the same value multiple times.

Example Scenario

Take the following scenario (which we are actively facing). A device has 2 trip points configured: /sys/class/thermal/thermal_zone0/trip_point_0_* and /sys/class/thermal/thermal_zone0/trip_point_1_*. trip_point_1_type is critical and trip_point_0_type is passive.

In the following snippet of code (taken from https://github.com/giampaolo/psutil/blob/release-5.9.8/psutil/_pslinux.py#L1457) a list of all trip points are gathered and iterated through. The variables high and critical are set to None only before the iteration through all trip points, not on each iteration. If it iterates through trip point 1 and then trip point 0 it will first correctly get the critical temperature from /sys/class/thermal/thermal_zone0/trip_point_1_temp and divide it by 1000 since critical is not None. On the second iteration, however, it will fail to update either high or critical because the type is passive. high will still be None and critical will still be the previous, correct value from the first iteration. On line 1481 critical will have a value (instead of None) so it will convert it to a float and divide by 1000.0 again. Thus the final value returned will be incorrect as it was divided by 1000 an extra time. If there are more trip points configured this value would be divided even more times resulting in possibly very small values. We have a different instance with 5 trip points where the configured critical threshold was 95 but the value returned for critical and high was 9.5e-11 (95000/1000 5 times).

image

Related Issues

The issue #1873 reported by https://github.com/lior-ll also demonstrates this bug in that when they force it to use thermal_zones the returned value is 110_000/1000 6 times (because there are 6 temperature trips configured) thus returning 1.1e-13 instead of the expected 110. I did not add to that ticket as it focused on wanting to use thermal_zones even if hwmon is present on the system while this ticket focuses on an issue with the thermal_zone temperature calculation.

@siena-sam siena-sam added the bug label Sep 4, 2024
@github-actions github-actions bot added the linux label Sep 4, 2024
@siena-sam
Copy link
Author

Here is a proposed fix ready for review: #2440

1 similar comment
@siena-sam
Copy link
Author

Here is a proposed fix ready for review: #2440

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant