linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-06-17	nvmet: allow mn change if subsys not discovered	Noam Gottlieb
	Currently, once the subsystem's model_number is set for the first time there is no way to change it. However, as long as no connection was established to nvmf target, there is no reason for such restriction and we should allow to change the subsystem's model_number as many times as needed. In addition, in order to simplfy the changes and make the model number flow more similar to the rest of the attributes in the Identify Controller data structure, we set a default value for the model number at the initiation of the subsystem. Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Noam Gottlieb <ngottlieb@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvmet: make sn stable once connection was established	Noam Gottlieb
	Once some host has connected to the target, make sure that the serial number is stable and cannot be changed. Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Noam Gottlieb <ngottlieb@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvmet: change sn size and check validity	Noam Gottlieb
	According to the NVM specification, the serial_number should be 20 bytes (bytes 23:04 of the Identify Controller data structure), and should contain only ASCII characters. In accordance, the serial_number size is changed to 20 bytes and before any attempt to store a new value in serial_number we check that the input is valid - i.e. contains only ASCII characters, is not empty and does not exceed 20 bytes. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Noam Gottlieb <ngottlieb@nvidia.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvmet-fc: do not check for invalid target port in nvmet_fc_handle_fcp_rqst()	Hannes Reinecke
	When parsing a request in nvmet_fc_handle_fcp_rqst() we should not check for invalid target ports; if we do the command is aborted from the fcp layer, causing the host to assume a transport error. Rather we should still forward this request to the nvmet layer, which will then correctly fail the command with an appropriate error status. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-fabrics: remove memset in connect io q	Chaitanya Kulkarni
	Declare and initialize structure variable to the zero values so that we can get rid of the zeroout memset call. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-fabrics: remove memset in connect admin q	Chaitanya Kulkarni
	Declare and initialize structure variable to the zero values so that we can get rid of the zeroout memset call. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-fabrics: remove memset in nvmf_reg_write32()	Chaitanya Kulkarni
	Declare and initialize structure variable to the zero values so that we can get rid of the zeroout memset call. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-fabrics: remove memset in nvmf_reg_read64()	Chaitanya Kulkarni
	Declare and initialize structure variable to the zero values so that we can get rid of the zeroout memset call. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-tcp: use ctrl sgl check helper	Chaitanya Kulkarni
	Use the helper to check NVMe controller's SGL support. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-pci: use ctrl sgl check helper	Chaitanya Kulkarni
	Use the helper to check NVMe controller's SGL support. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-fc: use ctrl sgl check helper	Chaitanya Kulkarni
	Use the helper to check NVMe controller's SGL support. Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme: add a helper to check ctrl sgl support	Chaitanya Kulkarni
	For various transports such as fc/tcp/pci it is common to check if NVMe SGLs are supported or not by the controller. In this preparation patch we add a helper to avoid the open coding of such checks in the various transport. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-pci: remove trailing lines for helpers	Chaitanya Kulkarni
	Remove the extra white line at the end of the functions. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	nvme-pci: fix var. type for increasing cq_head	JK Kim
	nvmeq->cq_head is compared with nvmeq->q_depth and changed the value and cq_phase for handling the next cq db. but, nvmeq->q_depth's type is u32 and max. value is 0x10000 when CQP.MSQE is 0xffff and io_queue_depth is 0x10000. current temp. variable for comparing with nvmeq->q_depth is overflowed when previous nvmeq->cq_head is 0xffff. in this case, nvmeq->cq_phase is not updated. so, fix data type for temp. variable to u32. Signed-off-by: JK Kim <jongkang.kim2@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-06-17	usb: core: hub: Disable autosuspend for Cypress CY7C65632	Andrew Lunn
	The Cypress CY7C65632 appears to have an issue with auto suspend and detecting devices, not too dissimilar to the SMSC 5534B hub. It is easiest to reproduce by connecting multiple mass storage devices to the hub at the same time. On a Lenovo Yoga, around 1 in 3 attempts result in the devices not being detected. It is however possible to make them appear using lsusb -v. Disabling autosuspend for this hub resolves the issue. Fixes: 1208f9e1d758 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub") Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210614155524.2228800-1-andrew@lunn.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-17	Merge tag 'irqchip-fixes-5.13-2' of ↵	Thomas Gleixner
	git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix GICv3 NMI handling where an IRQ could be mistakenly handled as a NMI, with disatrous effects Link: https://lore.kernel.org/r/20210610171127.2404752-1-maz@kernel.org
2021-06-17	spi: xilinx: convert to yaml	Nobuhiro Iwamatsu
	Convert SPI for Xilinx bindings documentation to YAML schemas. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210605002931.858031-1-iwamatsu@nigauri.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-17	spi: convert Cadence SPI bindings to YAML	Nobuhiro Iwamatsu
	Convert spi for Cadence SPI bindings documentation to YAML. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210605003811.858676-1-iwamatsu@nigauri.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-17	ACPI: PRM: make symbol 'prm_module_list' static	Wei Yongjun
	The sparse tool complains as follows: drivers/acpi/prmt.c:53:1: warning: symbol 'prm_module_list' was not declared. Should it be static? This symbol is not used outside of prmt.c, so marks it static. Fixes: cefc7ca46235 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17	ACPI: NVS: fix doc warnings in nvs.c	Baokun Li
	Fixes the following W=1 kernel build warning(s): drivers/acpi/nvs.c:94: warning: Function parameter or member 'start' not described in 'suspend_nvs_register' drivers/acpi/nvs.c:94: warning: Function parameter or member 'size' not described in 'suspend_nvs_register' Signed-off-by: Baokun Li <libaokun1@huawei.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17	ACPI: sysfs: fix doc warnings in device_sysfs.c	Baokun Li
	Fixes the following W=1 kernel build warning(s): drivers/acpi/device_sysfs.c:278: warning: Function parameter or member 'dev' not described in 'acpi_device_uevent_modalias' drivers/acpi/device_sysfs.c:278: warning: Function parameter or member 'env' not described in 'acpi_device_uevent_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'dev' not described in 'acpi_device_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'buf' not described in 'acpi_device_modalias' drivers/acpi/device_sysfs.c:323: warning: Function parameter or member 'size' not described in 'acpi_device_modalias' Signed-off-by: Baokun Li <libaokun1@huawei.com> [ rjw: Fix spelling: acpi -> ACPI ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17	cpuidle: teo: remove unneeded semicolon in teo_select()	Wan Jiabing
	Fix following coccicheck warning: drivers/cpuidle/governors/teo.c:315:10-11: Unneeded semicolon Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17	perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task	Kan Liang
	The counter value of a perf task may leak to another RDPMC task. For example, a perf stat task as below is running on CPU 0. perf stat -e 'branches,cycles' -- taskset -c 0 ./workload In the meantime, an RDPMC task, which is also running on CPU 0, may read the GP counters periodically. (The RDPMC task creates a fixed event, but read four GP counters.) $./rdpmc_read_all_counters index 0x0 value 0x8001e5970f99 index 0x1 value 0x8005d750edb6 index 0x2 value 0x0 index 0x3 value 0x0 index 0x0 value 0x8002358e48a5 index 0x1 value 0x8006bd1e3bc9 index 0x2 value 0x0 index 0x3 value 0x0 It is a potential security issue. Once the attacker knows what the other thread is counting. The PerfMon counter can be used as a side-channel to attack cryptosystems. The counter value of the perf stat task leaks to the RDPMC task because perf never clears the counter when it's stopped. Three methods were considered to address the issue. - Unconditionally reset the counter in x86_pmu_del(). It can bring extra overhead even when there is no RDPMC task running. - Only reset the un-assigned dirty counters when the RDPMC task is scheduled in via sched_task(). It fails for the below case. Thread A Thread B clone(CLONE_THREAD) ---> set_affine(0) set_affine(1) while (!event-enabled) ; event = perf_event_open() mmap(event) ioctl(event, IOC_ENABLE); ---> RDPMC Counters are still leaked to the thread B. - Only reset the un-assigned dirty counters before updating the CR4.PCE bit. The method is implemented here. The dirty counter is a counter, on which the assigned event has been deleted, but the counter is not reset. To track the dirty counters, add a 'dirty' variable in the struct cpu_hw_events. The security issue can only be found with an RDPMC task. To enable the RDMPC, the CR4.PCE bit has to be updated. Add a perf_clear_dirty_counters() right before updating the CR4.PCE bit to clear the existing dirty counters. Only the current un-assigned dirty counters are reset, because the RDPMC assigned dirty counters will be updated soon. After applying the patch, $ ./rdpmc_read_all_counters index 0x0 value 0x0 index 0x1 value 0x0 index 0x2 value 0x0 index 0x3 value 0x0 index 0x0 value 0x0 index 0x1 value 0x0 index 0x2 value 0x0 index 0x3 value 0x0 Performance The performance of a context switch only be impacted when there are two or more perf users and one of the users must be an RDPMC user. In other cases, there is no performance impact. The worst-case occurs when there are two users: the RDPMC user only uses one counter; while the other user uses all available counters. When the RDPMC task is scheduled in, all the counters, other than the RDPMC assigned one, have to be reset. Test results for the worst-case, using a modified lat_ctx as measured on an Ice Lake platform, which has 8 GP and 3 FP counters (ignoring SLOTS). lat_ctx -s 128K -N 1000 processes 2 Without the patch: The context switch time is 4.97 us With the patch: The context switch time is 5.16 us There is ~4% performance drop for the context switching time in the worst-case. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1623693582-187370-1-git-send-email-kan.liang@linux.intel.com
2021-06-17	sched/fair: Age the average idle time	Peter Zijlstra
	This is a partial forward-port of Peter Ziljstra's work first posted at: https://lore.kernel.org/lkml/20180530142236.667774973@infradead.org/ Currently select_idle_cpu()'s proportional scheme uses the average idle time for when we are idle, that is temporally challenged. When a CPU is not at all idle, we'll happily continue using whatever value we did see when the CPU goes idle. To fix this, introduce a separate average idle and age it (the existing value still makes sense for things like new-idle balancing, which happens when we do go idle). The overall goal is to not spend more time scanning for idle CPUs than we're idle for. Otherwise we're inhibiting work. This means that we need to consider the cost over all the wake-ups between consecutive idle periods. To track this, the scan cost is subtracted from the estimated average idle time. The impact of this patch is related to workloads that have domains that are fully busy or overloaded. Without the patch, the scan depth may be too high because a CPU is not reaching idle. Due to the nature of the patch, this is a regression magnet. It potentially wins when domains are almost fully busy or overloaded -- at that point searches are likely to fail but idle is not being aged as CPUs are active so search depth is too large and useless. It will potentially show regressions when there are idle CPUs and a deep search is beneficial. This tbench result on a 2-socket broadwell machine partially illustates the problem 5.13.0-rc2 5.13.0-rc2 vanilla sched-avgidle-v1r5 Hmean 1 445.02 ( 0.00%) 451.36 * 1.42%* Hmean 2 830.69 ( 0.00%) 846.03 * 1.85%* Hmean 4 1350.80 ( 0.00%) 1505.56 * 11.46%* Hmean 8 2888.88 ( 0.00%) 2586.40 * -10.47%* Hmean 16 5248.18 ( 0.00%) 5305.26 * 1.09%* Hmean 32 8914.03 ( 0.00%) 9191.35 * 3.11%* Hmean 64 10663.10 ( 0.00%) 10192.65 * -4.41%* Hmean 128 18043.89 ( 0.00%) 18478.92 * 2.41%* Hmean 256 16530.89 ( 0.00%) 17637.16 * 6.69%* Hmean 320 16451.13 ( 0.00%) 17270.97 * 4.98%* Note that 8 was a regression point where a deeper search would have helped but it gains for high thread counts when searches are useless. Hackbench is a more extreme example although not perfect as the tasks idle rapidly hackbench-process-pipes 5.13.0-rc2 5.13.0-rc2 vanilla sched-avgidle-v1r5 Amean 1 0.3950 ( 0.00%) 0.3887 ( 1.60%) Amean 4 0.9450 ( 0.00%) 0.9677 ( -2.40%) Amean 7 1.4737 ( 0.00%) 1.4890 ( -1.04%) Amean 12 2.3507 ( 0.00%) 2.3360 * 0.62%* Amean 21 4.0807 ( 0.00%) 4.0993 * -0.46%* Amean 30 5.6820 ( 0.00%) 5.7510 * -1.21%* Amean 48 8.7913 ( 0.00%) 8.7383 ( 0.60%) Amean 79 14.3880 ( 0.00%) 13.9343 * 3.15%* Amean 110 21.2233 ( 0.00%) 19.4263 * 8.47%* Amean 141 28.2930 ( 0.00%) 25.1003 * 11.28%* Amean 172 34.7570 ( 0.00%) 30.7527 * 11.52%* Amean 203 41.0083 ( 0.00%) 36.4267 * 11.17%* Amean 234 47.7133 ( 0.00%) 42.0623 * 11.84%* Amean 265 53.0353 ( 0.00%) 47.7720 * 9.92%* Amean 296 60.0170 ( 0.00%) 53.4273 * 10.98%* Stddev 1 0.0052 ( 0.00%) 0.0025 ( 51.57%) Stddev 4 0.0357 ( 0.00%) 0.0370 ( -3.75%) Stddev 7 0.0190 ( 0.00%) 0.0298 ( -56.64%) Stddev 12 0.0064 ( 0.00%) 0.0095 ( -48.38%) Stddev 21 0.0065 ( 0.00%) 0.0097 ( -49.28%) Stddev 30 0.0185 ( 0.00%) 0.0295 ( -59.54%) Stddev 48 0.0559 ( 0.00%) 0.0168 ( 69.92%) Stddev 79 0.1559 ( 0.00%) 0.0278 ( 82.17%) Stddev 110 1.1728 ( 0.00%) 0.0532 ( 95.47%) Stddev 141 0.7867 ( 0.00%) 0.0968 ( 87.69%) Stddev 172 1.0255 ( 0.00%) 0.0420 ( 95.91%) Stddev 203 0.8106 ( 0.00%) 0.1384 ( 82.92%) Stddev 234 1.1949 ( 0.00%) 0.1328 ( 88.89%) Stddev 265 0.9231 ( 0.00%) 0.0820 ( 91.11%) Stddev 296 1.0456 ( 0.00%) 0.1327 ( 87.31%) Again, higher thread counts benefit and the standard deviation shows that results are also a lot more stable when the idle time is aged. The patch potentially matters when a socket was multiple LLCs as the maximum search depth is lower. However, some of the test results were suspiciously good (e.g. specjbb2005 gaining 50% on a Zen1 machine) and other results were not dramatically different to other mcahines. Given the nature of the patch, Peter's full series is not being forward ported as each part should stand on its own. Preferably they would be merged at different times to reduce the risk of false bisections. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210615111611.GH30378@techsingularity.net
2021-06-17	sched/cpufreq: Consider reduced CPU capacity in energy calculation	Lukasz Luba
	Energy Aware Scheduling (EAS) needs to predict the decisions made by SchedUtil. The map_util_freq() exists to do that. There are corner cases where the max allowed frequency might be reduced (due to thermal). SchedUtil as a CPUFreq governor, is aware of that but EAS is not. This patch aims to address it. SchedUtil stores the maximum allowed frequency in 'sugov_policy::next_freq' field. EAS has to predict that value, which is the real used frequency. That value is made after a call to cpufreq_driver_resolve_freq() which clamps to the CPUFreq policy limits. In the existing code EAS is not able to predict that real frequency. This leads to energy estimation errors. To avoid wrong energy estimation in EAS (due to frequency miss prediction) make sure that the step which calculates Performance Domain frequency, is also aware of the allowed CPU capacity. Furthermore, modify map_util_freq() to not extend the frequency value. Instead, use map_util_perf() to extend the util value in both places: SchedUtil and EAS, but for EAS clamp it to max allowed CPU capacity. In the end, we achieve the same desirable behavior for both subsystems and alignment in regards to the real CPU frequency. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (For the schedutil part) Link: https://lore.kernel.org/r/20210614191238.23224-1-lukasz.luba@arm.com
2021-06-17	sched/fair: Take thermal pressure into account while estimating energy	Lukasz Luba
	Energy Aware Scheduling (EAS) needs to be able to predict the frequency requests made by the SchedUtil governor to properly estimate energy used in the future. It has to take into account CPUs utilization and forecast Performance Domain (PD) frequency. There is a corner case when the max allowed frequency might be reduced due to thermal. SchedUtil is aware of that reduced frequency, so it should be taken into account also in EAS estimations. SchedUtil, as a CPUFreq governor, knows the maximum allowed frequency of a CPU, thanks to cpufreq_driver_resolve_freq() and internal clamping to 'policy::max'. SchedUtil is responsible to respect that upper limit while setting the frequency through CPUFreq drivers. This effective frequency is stored internally in 'sugov_policy::next_freq' and EAS has to predict that value. In the existing code the raw value of arch_scale_cpu_capacity() is used for clamping the returned CPU utilization from effective_cpu_util(). This patch fixes issue with too big single CPU utilization, by introducing clamping to the allowed CPU capacity. The allowed CPU capacity is a CPU capacity reduced by thermal pressure raw value. Thanks to knowledge about allowed CPU capacity, we don't get too big value for a single CPU utilization, which is then added to the util sum. The util sum is used as a source of information for estimating whole PD energy. To avoid wrong energy estimation in EAS (due to capped frequency), make sure that the calculation of util sum is aware of allowed CPU capacity. This thermal pressure might be visible in scenarios where the CPUs are not heavily loaded, but some other component (like GPU) drastically reduced available power budget and increased the SoC temperature. Thus, we still use EAS for task placement and CPUs are not over-utilized. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210614191128.22735-1-lukasz.luba@arm.com
2021-06-17	thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure	Lukasz Luba
	The thermal pressure signal gives information to the scheduler about reduced CPU capacity due to thermal. It is based on a value stored in a per-cpu 'thermal_pressure' variable. The online CPUs will get the new value there, while the offline won't. Unfortunately, when the CPU is back online, the value read from per-cpu variable might be wrong (stale data). This might affect the scheduler decisions, since it sees the CPU capacity differently than what is actually available. Fix it by making sure that all online+offline CPUs would get the proper value in their per-cpu variable when thermal framework sets capping. Fixes: f12e4f66ab6a3 ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210614191030.22241-1-lukasz.luba@arm.com
2021-06-17	sched/fair: Return early from update_tg_cfs_load() if delta == 0	Dietmar Eggemann
	In case the _avg delta is 0 there is no need to update se's _avg (level n) nor cfs_rq's _avg (level n-1). These values stay the same. Since cfs_rq's _avg isn't changed, i.e. no load is propagated down, cfs_rq's _sum should stay the same as well. So bail out after se's _sum has been updated. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20210601083616.804229-1-dietmar.eggemann@arm.com
2021-06-17	sched/pelt: Check that _avg are null when _sum are	Vincent Guittot
	Check that we never break the rule that pelt's avg values are null if pelt's sum are. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Odin Ugedal <odin@uged.al> Link: https://lore.kernel.org/r/20210601155328.19487-1-vincent.guittot@linaro.org
2021-06-17	ACPI: APEI: fix synchronous external aborts in user-mode	Xiaofei Tan
	Before commit 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work"), do_sea() would unconditionally signal the affected task from the arch code. Since that change, the GHES driver sends the signals. This exposes a problem as errors the GHES driver doesn't understand or doesn't handle effectively are silently ignored. It will cause the errors get taken again, and circulate endlessly. User-space task get stuck in this loop. Existing firmware on Kunpeng9xx systems reports cache errors with the 'ARM Processor Error' CPER records. Do memory failure handling for ARM Processor Error Section just like for Memory Error Section. Fixes: 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work") Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17	extcon: extcon-max8997: Simplify driver using devm	Matti Vaittinen
	Simplify driver by switching to use the resource managed IRQ requesting and resource managed work-queue initialization. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Link: https://lore.kernel.org/r/61190cc280a63baeb05ec570282bb3677bee8e7b.1623146580.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17	extcon: extcon-max8997: Fix IRQ freeing at error path	Matti Vaittinen
	If reading MAX8997_MUIC_REG_STATUS1 fails at probe the driver exits without freeing the requested IRQs. Free the IRQs prior returning if reading the status fails. Fixes: 3e34c8198960 ("extcon: max8997: Avoid forcing UART path on drive probe") Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Link: https://lore.kernel.org/r/27ee4a48ee775c3f8c9d90459c18b6f2b15edc76.1623146580.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17	extcon: extcon-max77693.c: Fix potential work-queue cancellation race	Matti Vaittinen
	The extcon IRQ schedules a work item. IRQ is requested using devm while WQ is cancelld at remove(). This mixing of devm and manual unwinding has potential case where the WQ has been emptied (.remove() was ran) but devm unwinding of IRQ was not yet done. It may be possible the IRQ is triggered at this point scheduling new work item to the already flushed queue. According to the input documentation the input device allocated by devm_input_allocate_device() does not need to be explicitly unregistered. Use the new devm_work_autocancel() and remove the remove() to simplify the code. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Link: https://lore.kernel.org/r/cbe8205eed8276f6e6db5003cfe51b8b0d4ac966.1623146580.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-17	hwmon: (ntc_thermistor) Drop unused headers.	Jonathan Cameron
	The IIO usage in this driver is purely consumer so it should only be including linux/iio/consumer.h Whilst here drop pm_runtime.h as there is no runtime power management in the driver. Found using include-what-you-use and manual inspection of the suggestions. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210611142257.103094-1-jic23@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	MAINTAINERS: Add Delta DPS920AB PSU driver	Robert Marko
	Add maintainers entry for the Delta DPS920AB PSU driver. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	dt-bindings: trivial-devices: Add Delta DPS920AB	Robert Marko
	Add trivial device entry for Delta DPS920AB PSU. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://lore.kernel.org/r/20210607103431.2039073-2-robert.marko@sartura.hr Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus) Add driver for Delta DPS-920AB PSU	Robert Marko
	This adds support for the Delta DPS-920AB PSU. Only missing feature is fan control which the PSU supports. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://lore.kernel.org/r/20210607103431.2039073-1-robert.marko@sartura.hr [groeck: Add MODULE_IMPORT_NS(PMBUS);] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus/pim4328) Add documentation for the pim4328 PMBus driver	Erik Rosen
	Add documentation and index link for pim4328 PMBus driver. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus/pim4328) Add PMBus driver for PIM4006, PIM4328 and PIM4820	Erik Rosen
	Add hardware monitoring support for Flex power interface modules PIM4006, PIM4328 and PIM4820. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus) Allow phase function even if it's not on page	Erik Rosen
	Allow the use of a phase function even if it does not exist on the associated page. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus) Add support for reading direct mode coefficients	Erik Rosen
	Add support for reading and decoding direct format coefficients to the PMBus core driver. If the new flag PMBUS_USE_COEFFICIENTS_CMD is set, the driver will use the COEFFICIENTS register together with the information in the pmbus_sensor_attr structs to initialize relevant coefficients for the direct mode format. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> [groeck: Initialize ret with -EINVAL in pmbus_init_coefficients()] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus) Add new pmbus flag NO_WRITE_PROTECT	Erik Rosen
	Some PMBus chips respond with invalid data when reading the WRITE_PROTECT register. For such chips, this flag should be set so that the PMBus core driver doesn't use the WRITE_PROTECT command to determine its behavior. Signed-off-by: Erik Rosen <erik.rosen@metormote.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	docs: hwmon: adm1177.rst: avoid using ReSt :doc:`foo` markup	Mauro Carvalho Chehab
	The :doc:`foo` tag is auto-generated via automarkup.py. So, use the filename at the sources, instead of :doc:`foo`. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/32b0db7e79a3ed0e817213113c607a1b819e3867.1622898327.git.mchehab+huawei@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (pmbus_core) Check adapter PEC support	Madhava Reddy Siddareddygari
	Currently, for Packet Error Checking (PEC) only the controller is checked for support. This causes problems on the cisco-8000 platform where a SMBUS transaction errors are observed. This is because PEC has to be enabled only if both controller and adapter support it. Added code to check PEC capability for adapter and enable it only if both controller and adapter supports PEC. Signed-off-by: Madhava Reddy Siddareddygari <msiddare@cisco.com> [Upstream from SONiC https://github.com/Azure/sonic-linux-kernel/pull/215] Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/r/20210605052700.541455-1-pmenzel@molgen.mpg.de [groeck: Dropped unnecessary continuation line] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (ina3221) use CVRF only for single-shot conversion	Ninad Malwade
	As per current logic the wait time per conversion is arouns 430ms for 512 samples and around 860ms for 1024 samples for 3 channels considering 140us as the bus voltage and shunt voltage sampling conversion time. This waiting time is a lot for the continuous mode and even for the single shot mode. For continuous mode when moving average is considered the waiting for CVRF bit is not required and the data from the previous conversion is sufficuent. As mentioned in the datasheet the conversion ready bit is provided to help coordinate single-shot conversions, we can restrict the use to single-shot mode only. Also, the conversion time is for the averaged samples, the wait time for the polling can omit the number of samples consideration. Signed-off-by: Ninad Malwade <nmalwade@nvidia.com> Link: https://lore.kernel.org/r/1622789683-30931-1-git-send-email-nmalwade@nvidia.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-17	hwmon: (max31790) Detect and report zero fan speed	Guenter Roeck
	If a fan is not running or not connected, of if fan monitoring is disabled, the fan count register returns a fixed value of 0xffe0. So far this is then translated to a RPM value larger than 0. Since this is misleading and does not really make much sense, report a fan RPM of 0 in this situation. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-7-linux@roeck-us.net
2021-06-17	hwmon: (max31790) Clear fan fault after reporting it	Guenter Roeck
	Fault bits in MAX31790 are sticky and have to be cleared explicitly. A write operation into either the 'Target Duty Cycle' register or the 'Target Count' register is necessary to clear a fault. At the same time, we can never clear cached fault status values before reading them because the companion fault status for any given fan is cleared as well when clearing a fault. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-6-linux@roeck-us.net
2021-06-17	hwmon: (max31790) Fix pwmX_enable attributes	Guenter Roeck
	pwmX_enable supports three possible values: 0: Fan control disabled. Duty cycle is fixed to 0% 1: Fan control enabled, pwm mode. Duty cycle is determined by values written into Target Duty Cycle registers. 2: Fan control enabled, rpm mode Duty cycle is adjusted such that fan speed matches the values in Target Count registers The current code does not do this; instead, it mixes pwm control configuration with fan speed monitoring configuration. Worse, it reports that pwm control would be disabled (pwmX_enable==0) when it is in fact enabled in pwm mode. Part of the problem may be that the chip sets the "TACH input enable" bit on its own whenever the mode bit is set to RPM mode, but that doesn't mean that "TACH input enable" accurately reflects the pwm mode. Fix it up and only handle pwm control with the pwmX_enable attributes. In the documentation, clarify that disabling pwm control (pwmX_enable=0) sets the pwm duty cycle to 0%. In the code, explain why TACH_INPUT_EN is set together with RPM_MODE. While at it, only update the configuration register if the configuration has changed, and only update the cached configuration if updating the chip configuration was successful. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-4-linux@roeck-us.net
2021-06-17	hwmon: (max31790) Report correct current pwm duty cycles	Guenter Roeck
	The MAX31790 has two sets of registers for pwm duty cycles, one to request a duty cycle and one to read the actual current duty cycle. Both do not have to be the same. When reporting the pwm duty cycle to the user, the actual pwm duty cycle from pwm duty cycle registers needs to be reported. When setting it, the pwm target duty cycle needs to be written. Since we don't know the actual pwm duty cycle after a target pwm duty cycle has been written, set the valid flag to false to indicate that actual pwm duty cycle should be read from the chip instead of using cached values. Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Václav Kubernát <kubernat@ceesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Link: https://lore.kernel.org/r/20210526154022.3223012-3-linux@roeck-us.net
2021-06-17	hwmon: (max31790) Fix fan speed reporting for fan7..12	Guenter Roeck
	Fans 7..12 do not have their own set of configuration registers. So far the code ignored that and read beyond the end of the configuration register range to get the tachometer period. This resulted in more or less random fan speed values for those fans. The datasheet is quite vague when it comes to defining the tachometer period for fans 7..12. Experiments confirm that the period is the same for both fans associated with a given set of configuration registers. Fixes: 54187ff9d766 ("hwmon: (max31790) Convert to use new hwmon registration API") Fixes: 195a4b4298a7 ("hwmon: Driver for Maxim MAX31790") Cc: Jan Kundrát <jan.kundrat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Cc: Václav Kubernát <kubernat@cesnet.cz> Reviewed-by: Jan Kundrát <jan.kundrat@cesnet.cz> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210526154022.3223012-2-linux@roeck-us.net