summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-17ACPI: sysfs: Sort headers alphabeticallyAndy Shevchenko
For the sake of better maintenance, sort included headers alphabetically. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: sysfs: Refactor param_get_trace_state() to drop dead codeAndy Shevchenko
The param_get_trace_state() has a few dead code issues: - 'return 0;' is never reachable - a few 'else' keywords are redundant Refactor param_get_trace_state() to drop dead code. Note, leave one 'else' in order to have the best readability. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: sysfs: Unify pattern of memory allocationsAndy Shevchenko
Use the form of foo = kmalloc(sizeof(*foo)) everywhere in order to unify pattern of memory allocations. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: sysfs: Allow bitmap list to be supplied to acpi_mask_gpeAndy Shevchenko
Currently we need to use as many acpi_mask_gpe options as we want to have GPEs to be masked. Even with two it already becomes inconveniently large the kernel command line. Instead, allow acpi_mask_gpe to represent bitmap list. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: sysfs: Make sparse happy about address space in useAndy Shevchenko
Sparse is not happy about address space in use in acpi_data_show(): drivers/acpi/sysfs.c:428:14: warning: incorrect type in assignment (different address spaces) drivers/acpi/sysfs.c:428:14: expected void [noderef] __iomem *base drivers/acpi/sysfs.c:428:14: got void * drivers/acpi/sysfs.c:431:59: warning: incorrect type in argument 4 (different address spaces) drivers/acpi/sysfs.c:431:59: expected void const *from drivers/acpi/sysfs.c:431:59: got void [noderef] __iomem *base drivers/acpi/sysfs.c:433:30: warning: incorrect type in argument 1 (different address spaces) drivers/acpi/sysfs.c:433:30: expected void *logical_address drivers/acpi/sysfs.c:433:30: got void [noderef] __iomem *base Indeed, acpi_os_map_memory() returns a void pointer with dropped specific address space. Hence, we don't need to carry out __iomem in acpi_data_show(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: scan: Fix race related to dropping dependenciesRafael J. Wysocki
If acpi_add_single_object() runs concurrently with respect to acpi_scan_clear_dep() which deletes a dependencies list entry where the device being added is the consumer, the device's dep_unmet counter may not be updated to reflect that change. Namely, if the dependencies list entry is deleted right after calling acpi_scan_dep_init() and before calling acpi_device_add(), acpi_scan_clear_dep() will not find the device object corresponding to the consumer device ACPI handle and it will not update its dep_unmet counter to reflect the deletion of the list entry. Consequently, the dep_unmet counter of the device will never become zero going forward which may prevent it from being completely enumerated. To address this problem, modify acpi_add_single_object() to run acpi_tie_acpi_dev(), to attach the ACPI device object created by it to the corresponding ACPI namespace node, under acpi_dep_list_lock along with acpi_scan_dep_init() whenever the latter is called. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17ACPI: scan: Reorganize acpi_device_add()Rafael J. Wysocki
Move the invocation of acpi_attach_data() in acpi_device_add() into a separate function. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17ACPI: scan: Fix device object rescan in acpi_scan_clear_dep()Rafael J. Wysocki
In general, acpi_bus_attach() can only be run safely under acpi_scan_lock, but that lock cannot be acquired under acpi_dep_list_lock, so make acpi_scan_clear_dep() schedule deferred execution of acpi_bus_attach() under acpi_scan_lock instead of calling it directly. This also fixes a possible race between acpi_scan_clear_dep() and device removal that might cause a device object that went away to be accessed, because acpi_scan_clear_dep() is changed to acquire a reference on the consumer device object. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17ACPI: scan: Make acpi_walk_dep_device_list()Rafael J. Wysocki
Because acpi_walk_dep_device_list() is only called by the code in the file in which it is defined, make it static, drop the export of it and drop its header from acpi.h. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17ACPI: scan: Rearrange acpi_dev_get_first_consumer_dev_cb()Rafael J. Wysocki
Make acpi_dev_get_first_consumer_dev_cb() a bit more straightforward and rewrite the comment in it. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17ACPI: scan: Define acpi_bus_put_acpi_device() as static inlineRafael J. Wysocki
Since acpi_bus_put_acpi_device() is a synonym for acpi_dev_put(), define it as static inline in analogy with the latter. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2021-06-17usb: core: hub: Disable autosuspend for Cypress CY7C65632Andrew Lunn
The Cypress CY7C65632 appears to have an issue with auto suspend and detecting devices, not too dissimilar to the SMSC 5534B hub. It is easiest to reproduce by connecting multiple mass storage devices to the hub at the same time. On a Lenovo Yoga, around 1 in 3 attempts result in the devices not being detected. It is however possible to make them appear using lsusb -v. Disabling autosuspend for this hub resolves the issue. Fixes: 1208f9e1d758 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub") Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210614155524.2228800-1-andrew@lunn.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-17Merge tag 'irqchip-fixes-5.13-2' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix GICv3 NMI handling where an IRQ could be mistakenly handled as a NMI, with disatrous effects Link: https://lore.kernel.org/r/20210610171127.2404752-1-maz@kernel.org
2021-06-17spi: xilinx: convert to yamlNobuhiro Iwamatsu
Convert SPI for Xilinx bindings documentation to YAML schemas. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210605002931.858031-1-iwamatsu@nigauri.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-17spi: convert Cadence SPI bindings to YAMLNobuhiro Iwamatsu
Convert spi for Cadence SPI bindings documentation to YAML. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210605003811.858676-1-iwamatsu@nigauri.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-17ACPI: PRM: make symbol 'prm_module_list' staticWei Yongjun
The sparse tool complains as follows: drivers/acpi/prmt.c:53:1: warning: symbol 'prm_module_list' was not declared. Should it be static? This symbol is not used outside of prmt.c, so marks it static. Fixes: cefc7ca46235 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: NVS: fix doc warnings in nvs.cBaokun Li
Fixes the following W=1 kernel build warning(s): drivers/acpi/nvs.c:94: warning: Function parameter or member 'start' not described in 'suspend_nvs_register' drivers/acpi/nvs.c:94: warning: Function parameter or member 'size' not described in 'suspend_nvs_register' Signed-off-by: Baokun Li <libaokun1@huawei.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17ACPI: sysfs: fix doc warnings in device_sysfs.cBaokun Li
Fixes the following W=1 kernel build warning(s): drivers/acpi/device_sysfs.c:278: warning: Function parameter or member 'dev' not described in 'acpi_device_uevent_modalias' drivers/acpi/device_sysfs.c:278: warning: Function parameter or member 'env' not described in 'acpi_device_uevent_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'dev' not described in 'acpi_device_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'buf' not described in 'acpi_device_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'size' not described in 'acpi_device_modalias' Signed-off-by: Baokun Li <libaokun1@huawei.com> [ rjw: Fix spelling: acpi -> ACPI ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17cpuidle: teo: remove unneeded semicolon in teo_select()Wan Jiabing
Fix following coccicheck warning: drivers/cpuidle/governors/teo.c:315:10-11: Unneeded semicolon Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17perf/x86: Reset the dirty counter to prevent the leak for an RDPMC taskKan Liang
The counter value of a perf task may leak to another RDPMC task. For example, a perf stat task as below is running on CPU 0. perf stat -e 'branches,cycles' -- taskset -c 0 ./workload In the meantime, an RDPMC task, which is also running on CPU 0, may read the GP counters periodically. (The RDPMC task creates a fixed event, but read four GP counters.) $./rdpmc_read_all_counters index 0x0 value 0x8001e5970f99 index 0x1 value 0x8005d750edb6 index 0x2 value 0x0 index 0x3 value 0x0 index 0x0 value 0x8002358e48a5 index 0x1 value 0x8006bd1e3bc9 index 0x2 value 0x0 index 0x3 value 0x0 It is a potential security issue. Once the attacker knows what the other thread is counting. The PerfMon counter can be used as a side-channel to attack cryptosystems. The counter value of the perf stat task leaks to the RDPMC task because perf never clears the counter when it's stopped. Three methods were considered to address the issue. - Unconditionally reset the counter in x86_pmu_del(). It can bring extra overhead even when there is no RDPMC task running. - Only reset the un-assigned dirty counters when the RDPMC task is scheduled in via sched_task(). It fails for the below case. Thread A Thread B clone(CLONE_THREAD) ---> set_affine(0) set_affine(1) while (!event-enabled) ; event = perf_event_open() mmap(event) ioctl(event, IOC_ENABLE); ---> RDPMC Counters are still leaked to the thread B. - Only reset the un-assigned dirty counters before updating the CR4.PCE bit. The method is implemented here. The dirty counter is a counter, on which the assigned event has been deleted, but the counter is not reset. To track the dirty counters, add a 'dirty' variable in the struct cpu_hw_events. The security issue can only be found with an RDPMC task. To enable the RDMPC, the CR4.PCE bit has to be updated. Add a perf_clear_dirty_counters() right before updating the CR4.PCE bit to clear the existing dirty counters. Only the current un-assigned dirty counters are reset, because the RDPMC assigned dirty counters will be updated soon. After applying the patch, $ ./rdpmc_read_all_counters index 0x0 value 0x0 index 0x1 value 0x0 index 0x2 value 0x0 index 0x3 value 0x0 index 0x0 value 0x0 index 0x1 value 0x0 index 0x2 value 0x0 index 0x3 value 0x0 Performance The performance of a context switch only be impacted when there are two or more perf users and one of the users must be an RDPMC user. In other cases, there is no performance impact. The worst-case occurs when there are two users: the RDPMC user only uses one counter; while the other user uses all available counters. When the RDPMC task is scheduled in, all the counters, other than the RDPMC assigned one, have to be reset. Test results for the worst-case, using a modified lat_ctx as measured on an Ice Lake platform, which has 8 GP and 3 FP counters (ignoring SLOTS). lat_ctx -s 128K -N 1000 processes 2 Without the patch: The context switch time is 4.97 us With the patch: The context switch time is 5.16 us There is ~4% performance drop for the context switching time in the worst-case. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1623693582-187370-1-git-send-email-kan.liang@linux.intel.com
2021-06-17sched/fair: Age the average idle timePeter Zijlstra
This is a partial forward-port of Peter Ziljstra's work first posted at: https://lore.kernel.org/lkml/20180530142236.667774973@infradead.org/ Currently select_idle_cpu()'s proportional scheme uses the average idle time *for when we are idle*, that is temporally challenged. When a CPU is not at all idle, we'll happily continue using whatever value we did see when the CPU goes idle. To fix this, introduce a separate average idle and age it (the existing value still makes sense for things like new-idle balancing, which happens when we do go idle). The overall goal is to not spend more time scanning for idle CPUs than we're idle for. Otherwise we're inhibiting work. This means that we need to consider the cost over all the wake-ups between consecutive idle periods. To track this, the scan cost is subtracted from the estimated average idle time. The impact of this patch is related to workloads that have domains that are fully busy or overloaded. Without the patch, the scan depth may be too high because a CPU is not reaching idle. Due to the nature of the patch, this is a regression magnet. It potentially wins when domains are almost fully busy or overloaded -- at that point searches are likely to fail but idle is not being aged as CPUs are active so search depth is too large and useless. It will potentially show regressions when there are idle CPUs and a deep search is beneficial. This tbench result on a 2-socket broadwell machine partially illustates the problem 5.13.0-rc2 5.13.0-rc2 vanilla sched-avgidle-v1r5 Hmean 1 445.02 ( 0.00%) 451.36 * 1.42%* Hmean 2 830.69 ( 0.00%) 846.03 * 1.85%* Hmean 4 1350.80 ( 0.00%) 1505.56 * 11.46%* Hmean 8 2888.88 ( 0.00%) 2586.40 * -10.47%* Hmean 16 5248.18 ( 0.00%) 5305.26 * 1.09%* Hmean 32 8914.03 ( 0.00%) 9191.35 * 3.11%* Hmean 64 10663.10 ( 0.00%) 10192.65 * -4.41%* Hmean 128 18043.89 ( 0.00%) 18478.92 * 2.41%* Hmean 256 16530.89 ( 0.00%) 17637.16 * 6.69%* Hmean 320 16451.13 ( 0.00%) 17270.97 * 4.98%* Note that 8 was a regression point where a deeper search would have helped but it gains for high thread counts when searches are useless. Hackbench is a more extreme example although not perfect as the tasks idle rapidly hackbench-process-pipes 5.13.0-rc2 5.13.0-rc2 vanilla sched-avgidle-v1r5 Amean 1 0.3950 ( 0.00%) 0.3887 ( 1.60%) Amean 4 0.9450 ( 0.00%) 0.9677 ( -2.40%) Amean 7 1.4737 ( 0.00%) 1.4890 ( -1.04%) Amean 12 2.3507 ( 0.00%) 2.3360 * 0.62%* Amean 21 4.0807 ( 0.00%) 4.0993 * -0.46%* Amean 30 5.6820 ( 0.00%) 5.7510 * -1.21%* Amean 48 8.7913 ( 0.00%) 8.7383 ( 0.60%) Amean 79 14.3880 ( 0.00%) 13.9343 * 3.15%* Amean 110 21.2233 ( 0.00%) 19.4263 * 8.47%* Amean 141 28.2930 ( 0.00%) 25.1003 * 11.28%* Amean 172 34.7570 ( 0.00%) 30.7527 * 11.52%* Amean 203 41.0083 ( 0.00%) 36.4267 * 11.17%* Amean 234 47.7133 ( 0.00%) 42.0623 * 11.84%* Amean 265 53.0353 ( 0.00%) 47.7720 * 9.92%* Amean 296 60.0170 ( 0.00%) 53.4273 * 10.98%* Stddev 1 0.0052 ( 0.00%) 0.0025 ( 51.57%) Stddev 4 0.0357 ( 0.00%) 0.0370 ( -3.75%) Stddev 7 0.0190 ( 0.00%) 0.0298 ( -56.64%) Stddev 12 0.0064 ( 0.00%) 0.0095 ( -48.38%) Stddev 21 0.0065 ( 0.00%) 0.0097 ( -49.28%) Stddev 30 0.0185 ( 0.00%) 0.0295 ( -59.54%) Stddev 48 0.0559 ( 0.00%) 0.0168 ( 69.92%) Stddev 79 0.1559 ( 0.00%) 0.0278 ( 82.17%) Stddev 110 1.1728 ( 0.00%) 0.0532 ( 95.47%) Stddev 141 0.7867 ( 0.00%) 0.0968 ( 87.69%) Stddev 172 1.0255 ( 0.00%) 0.0420 ( 95.91%) Stddev 203 0.8106 ( 0.00%) 0.1384 ( 82.92%) Stddev 234 1.1949 ( 0.00%) 0.1328 ( 88.89%) Stddev 265 0.9231 ( 0.00%) 0.0820 ( 91.11%) Stddev 296 1.0456 ( 0.00%) 0.1327 ( 87.31%) Again, higher thread counts benefit and the standard deviation shows that results are also a lot more stable when the idle time is aged. The patch potentially matters when a socket was multiple LLCs as the maximum search depth is lower. However, some of the test results were suspiciously good (e.g. specjbb2005 gaining 50% on a Zen1 machine) and other results were not dramatically different to other mcahines. Given the nature of the patch, Peter's full series is not being forward ported as each part should stand on its own. Preferably they would be merged at different times to reduce the risk of false bisections. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210615111611.GH30378@techsingularity.net
2021-06-17sched/cpufreq: Consider reduced CPU capacity in energy calculationLukasz Luba
Energy Aware Scheduling (EAS) needs to predict the decisions made by SchedUtil. The map_util_freq() exists to do that. There are corner cases where the max allowed frequency might be reduced (due to thermal). SchedUtil as a CPUFreq governor, is aware of that but EAS is not. This patch aims to address it. SchedUtil stores the maximum allowed frequency in 'sugov_policy::next_freq' field. EAS has to predict that value, which is the real used frequency. That value is made after a call to cpufreq_driver_resolve_freq() which clamps to the CPUFreq policy limits. In the existing code EAS is not able to predict that real frequency. This leads to energy estimation errors. To avoid wrong energy estimation in EAS (due to frequency miss prediction) make sure that the step which calculates Performance Domain frequency, is also aware of the allowed CPU capacity. Furthermore, modify map_util_freq() to not extend the frequency value. Instead, use map_util_perf() to extend the util value in both places: SchedUtil and EAS, but for EAS clamp it to max allowed CPU capacity. In the end, we achieve the same desirable behavior for both subsystems and alignment in regards to the real CPU frequency. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (For the schedutil part) Link: https://lore.kernel.org/r/20210614191238.23224-1-lukasz.luba@arm.com
2021-06-17sched/fair: Take thermal pressure into account while estimating energyLukasz Luba
Energy Aware Scheduling (EAS) needs to be able to predict the frequency requests made by the SchedUtil governor to properly estimate energy used in the future. It has to take into account CPUs utilization and forecast Performance Domain (PD) frequency. There is a corner case when the max allowed frequency might be reduced due to thermal. SchedUtil is aware of that reduced frequency, so it should be taken into account also in EAS estimations. SchedUtil, as a CPUFreq governor, knows the maximum allowed frequency of a CPU, thanks to cpufreq_driver_resolve_freq() and internal clamping to 'policy::max'. SchedUtil is responsible to respect that upper limit while setting the frequency through CPUFreq drivers. This effective frequency is stored internally in 'sugov_policy::next_freq' and EAS has to predict that value. In the existing code the raw value of arch_scale_cpu_capacity() is used for clamping the returned CPU utilization from effective_cpu_util(). This patch fixes issue with too big single CPU utilization, by introducing clamping to the allowed CPU capacity. The allowed CPU capacity is a CPU capacity reduced by thermal pressure raw value. Thanks to knowledge about allowed CPU capacity, we don't get too big value for a single CPU utilization, which is then added to the util sum. The util sum is used as a source of information for estimating whole PD energy. To avoid wrong energy estimation in EAS (due to capped frequency), make sure that the calculation of util sum is aware of allowed CPU capacity. This thermal pressure might be visible in scenarios where the CPUs are not heavily loaded, but some other component (like GPU) drastically reduced available power budget and increased the SoC temperature. Thus, we still use EAS for task placement and CPUs are not over-utilized. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210614191128.22735-1-lukasz.luba@arm.com
2021-06-17thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressureLukasz Luba
The thermal pressure signal gives information to the scheduler about reduced CPU capacity due to thermal. It is based on a value stored in a per-cpu 'thermal_pressure' variable. The online CPUs will get the new value there, while the offline won't. Unfortunately, when the CPU is back online, the value read from per-cpu variable might be wrong (stale data). This might affect the scheduler decisions, since it sees the CPU capacity differently than what is actually available. Fix it by making sure that all online+offline CPUs would get the proper value in their per-cpu variable when thermal framework sets capping. Fixes: f12e4f66ab6a3 ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210614191030.22241-1-lukasz.luba@arm.com
2021-06-17sched/fair: Return early from update_tg_cfs_load() if delta == 0Dietmar Eggemann
In case the _avg delta is 0 there is no need to update se's _avg (level n) nor cfs_rq's _avg (level n-1). These values stay the same. Since cfs_rq's _avg isn't changed, i.e. no load is propagated down, cfs_rq's _sum should stay the same as well. So bail out after se's _sum has been updated. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20210601083616.804229-1-dietmar.eggemann@arm.com
2021-06-17sched/pelt: Check that *_avg are null when *_sum areVincent Guittot
Check that we never break the rule that pelt's avg values are null if pelt's sum are. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Odin Ugedal <odin@uged.al> Link: https://lore.kernel.org/r/20210601155328.19487-1-vincent.guittot@linaro.org
2021-06-17ACPI: APEI: fix synchronous external aborts in user-modeXiaofei Tan
Before commit 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work"), do_sea() would unconditionally signal the affected task from the arch code. Since that change, the GHES driver sends the signals. This exposes a problem as errors the GHES driver doesn't understand or doesn't handle effectively are silently ignored. It will cause the errors get taken again, and circulate endlessly. User-space task get stuck in this loop. Existing firmware on Kunpeng9xx systems reports cache errors with the 'ARM Processor Error' CPER records. Do memory failure handling for ARM Processor Error Section just like for Memory Error Section. Fixes: 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work") Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17hwmon: (ntc_thermistor) Drop unused headers.Jonathan Cameron
The IIO usage in this driver is purely consumer so it should only be including linux/iio/consumer.h Whilst here drop pm_runtime.h as there is no runtime power management in the driver. Found using include-what-you-use and manual inspection of the suggestions. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210611142257.103094-1-jic23@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17MAINTAINERS: Add Delta DPS920AB PSU driverRobert Marko
Add maintainers entry for the Delta DPS920AB PSU driver. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17dt-bindings: trivial-devices: Add Delta DPS920ABRobert Marko
Add trivial device entry for Delta DPS920AB PSU. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://lore.kernel.org/r/20210607103431.2039073-2-robert.marko@sartura.hr Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus) Add driver for Delta DPS-920AB PSURobert Marko
This adds support for the Delta DPS-920AB PSU. Only missing feature is fan control which the PSU supports. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://lore.kernel.org/r/20210607103431.2039073-1-robert.marko@sartura.hr [groeck: Add MODULE_IMPORT_NS(PMBUS);] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus/pim4328) Add documentation for the pim4328 PMBus driverErik Rosen
Add documentation and index link for pim4328 PMBus driver. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus/pim4328) Add PMBus driver for PIM4006, PIM4328 and PIM4820Erik Rosen
Add hardware monitoring support for Flex power interface modules PIM4006, PIM4328 and PIM4820. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus) Allow phase function even if it's not on pageErik Rosen
Allow the use of a phase function even if it does not exist on the associated page. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus) Add support for reading direct mode coefficientsErik Rosen
Add support for reading and decoding direct format coefficients to the PMBus core driver. If the new flag PMBUS_USE_COEFFICIENTS_CMD is set, the driver will use the COEFFICIENTS register together with the information in the pmbus_sensor_attr structs to initialize relevant coefficients for the direct mode format. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> [groeck: Initialize ret with -EINVAL in pmbus_init_coefficients()] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus) Add new pmbus flag NO_WRITE_PROTECTErik Rosen
Some PMBus chips respond with invalid data when reading the WRITE_PROTECT register. For such chips, this flag should be set so that the PMBus core driver doesn't use the WRITE_PROTECT command to determine its behavior. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17docs: hwmon: adm1177.rst: avoid using ReSt :doc:`foo` markupMauro Carvalho Chehab
The :doc:`foo` tag is auto-generated via automarkup.py. So, use the filename at the sources, instead of :doc:`foo`. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/32b0db7e79a3ed0e817213113c607a1b819e3867.1622898327.git.mchehab+huawei@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus_core) Check adapter PEC supportMadhava Reddy Siddareddygari
Currently, for Packet Error Checking (PEC) only the controller is checked for support. This causes problems on the cisco-8000 platform where a SMBUS transaction errors are observed. This is because PEC has to be enabled only if both controller and adapter support it. Added code to check PEC capability for adapter and enable it only if both controller and adapter supports PEC. Signed-off-by: Madhava Reddy Siddareddygari <msiddare@cisco.com> [Upstream from SONiC https://github.com/Azure/sonic-linux-kernel/pull/215] Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/r/20210605052700.541455-1-pmenzel@molgen.mpg.de [groeck: Dropped unnecessary continuation line] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (ina3221) use CVRF only for single-shot conversionNinad Malwade
As per current logic the wait time per conversion is arouns 430ms for 512 samples and around 860ms for 1024 samples for 3 channels considering 140us as the bus voltage and shunt voltage sampling conversion time. This waiting time is a lot for the continuous mode and even for the single shot mode. For continuous mode when moving average is considered the waiting for CVRF bit is not required and the data from the previous conversion is sufficuent. As mentioned in the datasheet the conversion ready bit is provided to help coordinate single-shot conversions, we can restrict the use to single-shot mode only. Also, the conversion time is for the averaged samples, the wait time for the polling can omit the number of samples consideration. Signed-off-by: Ninad Malwade <nmalwade@nvidia.com> Link: https://lore.kernel.org/r/1622789683-30931-1-git-send-email-nmalwade@nvidia.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (max31790) Detect and report zero fan speedGuenter Roeck
If a fan is not running or not connected, of if fan monitoring is disabled, the fan count register returns a fixed value of 0xffe0. So far this is then translated to a RPM value larger than 0. Since this is misleading and does not really make much sense, report a fan RPM of 0 in this situation. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-7-linux@roeck-us.net
2021-06-17hwmon: (max31790) Clear fan fault after reporting itGuenter Roeck
Fault bits in MAX31790 are sticky and have to be cleared explicitly. A write operation into either the 'Target Duty Cycle' register or the 'Target Count' register is necessary to clear a fault. At the same time, we can never clear cached fault status values before reading them because the companion fault status for any given fan is cleared as well when clearing a fault. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-6-linux@roeck-us.net
2021-06-17hwmon: (max31790) Fix pwmX_enable attributesGuenter Roeck
pwmX_enable supports three possible values: 0: Fan control disabled. Duty cycle is fixed to 0% 1: Fan control enabled, pwm mode. Duty cycle is determined by values written into Target Duty Cycle registers. 2: Fan control enabled, rpm mode Duty cycle is adjusted such that fan speed matches the values in Target Count registers The current code does not do this; instead, it mixes pwm control configuration with fan speed monitoring configuration. Worse, it reports that pwm control would be disabled (pwmX_enable==0) when it is in fact enabled in pwm mode. Part of the problem may be that the chip sets the "TACH input enable" bit on its own whenever the mode bit is set to RPM mode, but that doesn't mean that "TACH input enable" accurately reflects the pwm mode. Fix it up and only handle pwm control with the pwmX_enable attributes. In the documentation, clarify that disabling pwm control (pwmX_enable=0) sets the pwm duty cycle to 0%. In the code, explain why TACH_INPUT_EN is set together with RPM_MODE. While at it, only update the configuration register if the configuration has changed, and only update the cached configuration if updating the chip configuration was successful. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-4-linux@roeck-us.net
2021-06-17hwmon: (max31790) Report correct current pwm duty cyclesGuenter Roeck
The MAX31790 has two sets of registers for pwm duty cycles, one to request a duty cycle and one to read the actual current duty cycle. Both do not have to be the same. When reporting the pwm duty cycle to the user, the actual pwm duty cycle from pwm duty cycle registers needs to be reported. When setting it, the pwm target duty cycle needs to be written. Since we don't know the actual pwm duty cycle after a target pwm duty cycle has been written, set the valid flag to false to indicate that actual pwm duty cycle should be read from the chip instead of using cached values. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@ceesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-3-linux@roeck-us.net
2021-06-17hwmon: (max31790) Fix fan speed reporting for fan7..12Guenter Roeck
Fans 7..12 do not have their own set of configuration registers. So far the code ignored that and read beyond the end of the configuration register range to get the tachometer period. This resulted in more or less random fan speed values for those fans. The datasheet is quite vague when it comes to defining the tachometer period for fans 7..12. Experiments confirm that the period is the same for both fans associated with a given set of configuration registers. Fixes: 54187ff9d766 ("hwmon: (max31790) Convert to use new hwmon registration API") Fixes: 195a4b4298a7 ("hwmon: Driver for Maxim MAX31790") Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210526154022.3223012-2-linux@roeck-us.net
2021-06-17hwmon: (sht4x) Fix sht4x_read_values return valueJoe Perches
Kernel doc for sht4x_read_values() shows 0 on success, 1 on failure but the return value on success is actually always positive as it is set to SHT4X_RESPONSE_LENGTH by a successful call to i2c_master_recv(). Miscellanea: o Update the kernel doc for sht4x_read_values to 0 for success or -ERRNO o Remove incorrectly used kernel doc /** header for other _read functions o Typo fix succesfull->successful o Reverse a test to unindent a block and use goto unlock o Declare cmd[SHT4X_CMD_LEN] rather than cmd[] At least for gcc 10.2, object size is reduced a tiny bit. $ size drivers/hwmon/sht4x.o* text data bss dec hex filename 1752 404 256 2412 96c drivers/hwmon/sht4x.o.new 1825 404 256 2485 9b5 drivers/hwmon/sht4x.o.old Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/60eedce497137eb34448c0c77e01ec9d9c972ad7.camel@perches.com Reviewed by: Navin Sankar Velliangiri <navin@linumiz.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: Add sht4x Temperature and Humidity Sensor DriverNavin Sankar Velliangiri
This patch adds a hwmon driver for the SHT4x Temperature and Humidity sensor. Signed-off-by: Navin Sankar Velliangiri <navin@linumiz.com> [groeck: dropped unnecessary empty line and continuation lines] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17docs: hwmon: Add an entry for mp2888Fabio Estevam
The entry for mp2888 is missing and it causes the following 'make htmldocs' build warning: Documentation/hwmon/mp2888.rst: WARNING: document isn't included in any toctree Add the mp2888 entry. Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20210521172218.37592-1-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (adm1275) enable adm1272 temperature reportingChu Lin
adm1272 supports temperature reporting but it is disabled by default. Tested: ls temp1_* temp1_crit temp1_highest temp1_max temp1_crit_alarm temp1_input temp1_max_alarm cat temp1_input 26642 Signed-off-by: Chu Lin <linchuyuan@google.com> Link: https://lore.kernel.org/r/20210512171043.2433694-1-linchuyuan@google.com [groeck: Updated subject to reflect correct driver] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17dt-bindings: Add MP2888 voltage regulator deviceVadim Pasternak
Monolithic Power Systems, Inc. (MPS) dual-loop, digital, multi-phase controller. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210511055619.118104-4-vadimp@nvidia.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17hwmon: (pmbus) Add support for MPS Multi-phase mp2888 controllerVadim Pasternak
Add support for mp2888 device from Monolithic Power Systems, Inc. (MPS) vendor. This is a digital, multi-phase, pulse-width modulation controller. This device supports: - One power rail. - Programmable Multi-Phase up to 10 Phases. - PWM-VID Interface - One pages 0 for telemetry. - Programmable pins for PMBus Address. - Built-In EEPROM to Store Custom Configurations. - Can configured VOUT readout in direct or VID format and allows setting of different formats on rails 1 and 2. For VID the following protocols are available: VR13 mode with 5-mV DAC; VR13 mode with 10-mV DAC, IMVP9 mode with 5-mV DAC. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20210511055619.118104-3-vadimp@nvidia.com [groeck: Add MODULE_IMPORT_NS] Signed-off-by: Guenter Roeck <linux@roeck-us.net>