summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-23software node: Handle software node injection to an existing device properlyHeikki Krogerus
The function software_node_notify() - the function that creates and removes the symlinks between the node and the device - was called unconditionally in device_add_software_node() and device_remove_software_node(), but it needs to be called in those functions only in the special case where the node is added to a device that has already been registered. This fixes NULL pointer dereference that happens if device_remove_software_node() is used with device that was never registered. Fixes: b622b24519f5 ("software node: Allow node addition to already existing device") Reported-and-tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-23Merge tag 'pm-5.13-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Revert a recent PCI power management commit that causes initialization issues to appear on some systems" * tag 'pm-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "PCI: PM: Do not read power state in pci_enable_device_flags()"
2021-06-23perf: Fix task context PMU for HeteroPeter Zijlstra
On HETEROGENEOUS hardware (ARM big.Little, Intel Alderlake etc.) each CPU might have a different hardware PMU. Since each such PMU is represented by a different struct pmu, but we only have a single HW task context. That means that the task context needs to switch PMU type when it switches CPUs. Not doing this means that ctx->pmu calls (pmu_{dis,en}able(), {start,commit,cancel}_txn() etc.) are called against the wrong PMU and things will go wobbly. Fixes: f83d2f91d259 ("perf/x86/intel: Add Alder Lake Hybrid support") Reported-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/YMsy7BuGT8nBTspT@hirez.programming.kicks-ass.net
2021-06-23perf/x86/intel: Fix instructions:ppp support in Sapphire RapidsKan Liang
Perf errors out when sampling instructions:ppp. $ perf record -e instructions:ppp -- true Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (instructions:ppp). The instruction PDIR is only available on the fixed counter 0. The event constraint has been updated to fixed0_constraint in icl_get_event_constraints(). The Sapphire Rapids codes unconditionally error out for the event which is not available on the GP counter 0. Make the instructions:ppp an exception. Fixes: 61b985e3e775 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") Reported-by: Yasin, Ahmad <ahmad.yasin@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1624029174-122219-4-git-send-email-kan.liang@linux.intel.com
2021-06-23perf/x86/intel: Add more events requires FRONTEND MSR on Sapphire RapidsKan Liang
On Sapphire Rapids, there are two more events 0x40ad and 0x04c2 which rely on the FRONTEND MSR. If the FRONTEND MSR is not set correctly, the count value is not correct. Update intel_spr_extra_regs[] to support them. Fixes: 61b985e3e775 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1624029174-122219-3-git-send-email-kan.liang@linux.intel.com
2021-06-23perf/x86/intel: Fix fixed counter check warning for some Alder LakeKan Liang
For some Alder Lake machine, the below fixed counter check warning may be triggered. [ 2.010766] hw perf events fixed 5 > max(4), clipping! Current perf unconditionally increases the number of the GP counters and the fixed counters for a big core PMU on an Alder Lake system, because the number enumerated in the CPUID only reflects the common counters. The big core may has more counters. However, Alder Lake may have an alternative configuration. With that configuration, the X86_FEATURE_HYBRID_CPU is not set. The number of the GP counters and fixed counters enumerated in the CPUID is accurate. Perf mistakenly increases the number of counters. The warning is triggered. Directly use the enumerated value on the system with the alternative configuration. Fixes: f83d2f91d259 ("perf/x86/intel: Add Alder Lake Hybrid support") Reported-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1624029174-122219-2-git-send-email-kan.liang@linux.intel.com
2021-06-23perf/x86/intel: Fix PEBS-via-PT reload base value for Extended PEBSLike Xu
If we use the "PEBS-via-PT" feature on a platform that supports extended PBES, like this: perf record -c 10000 \ -e '{intel_pt/branch=0/,branch-instructions/aux-output/p}' uname we will encounter the following call trace: [ 250.906542] unchecked MSR access error: WRMSR to 0x14e1 (tried to write 0x0000000000000000) at rIP: 0xffffffff88073624 (native_write_msr+0x4/0x20) [ 250.920779] Call Trace: [ 250.923508] intel_pmu_pebs_enable+0x12c/0x190 [ 250.928359] intel_pmu_enable_event+0x346/0x390 [ 250.933300] x86_pmu_start+0x64/0x80 [ 250.937231] x86_pmu_enable+0x16a/0x2f0 [ 250.941434] perf_event_exec+0x144/0x4c0 [ 250.945731] begin_new_exec+0x650/0xbf0 [ 250.949933] load_elf_binary+0x13e/0x1700 [ 250.954321] ? lock_acquire+0xc2/0x390 [ 250.958430] ? bprm_execve+0x34f/0x8a0 [ 250.962544] ? lock_is_held_type+0xa7/0x120 [ 250.967118] ? find_held_lock+0x32/0x90 [ 250.971321] ? sched_clock_cpu+0xc/0xb0 [ 250.975527] bprm_execve+0x33d/0x8a0 [ 250.979452] do_execveat_common.isra.0+0x161/0x1d0 [ 250.984673] __x64_sys_execve+0x33/0x40 [ 250.988877] do_syscall_64+0x3d/0x80 [ 250.992806] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 250.998302] RIP: 0033:0x7fbc971d82fb [ 251.002235] Code: Unable to access opcode bytes at RIP 0x7fbc971d82d1. [ 251.009303] RSP: 002b:00007fffb8aed808 EFLAGS: 00000202 ORIG_RAX: 000000000000003b [ 251.017478] RAX: ffffffffffffffda RBX: 00007fffb8af2f00 RCX: 00007fbc971d82fb [ 251.025187] RDX: 00005574792aac50 RSI: 00007fffb8af2f00 RDI: 00007fffb8aed810 [ 251.032901] RBP: 00007fffb8aed970 R08: 0000000000000020 R09: 00007fbc9725c8b0 [ 251.040613] R10: 6d6c61632f6d6f63 R11: 0000000000000202 R12: 00005574792aac50 [ 251.048327] R13: 00007fffb8af35f0 R14: 00005574792aafdf R15: 00005574792aafe7 This is because the target reload msr address is calculated based on the wrong base msr and the target reload msr value is accessed from ds->pebs_event_reset[] with the wrong offset. According to Intel SDM Table 2-14, for extended PBES feature, the reload msr for MSR_IA32_FIXED_CTRx should be based on MSR_RELOAD_FIXED_CTRx. For fixed counters, let's fix it by overriding the reload msr address and its value, thus avoiding out-of-bounds access. Fixes: 42880f726c66("perf/x86/intel: Support PEBS output to PT") Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210621034710.31107-1-likexu@tencent.com
2021-06-23Merge branch 'stable/for-linus-5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fix from Konrad Rzeszutek Wilk: "A fix for the regression for the DMA operations where the offset was ignored and corruptions would appear. Going forward there will be a cleanups to make the offset and alignment logic more clearer and better test-cases to help with this" * 'stable/for-linus-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: manipulate orig_addr when tlb_addr has offset
2021-06-23Merge remote-tracking branch 'regulator/for-5.14' into regulator-nextMark Brown
2021-06-23Merge remote-tracking branch 'regulator/for-5.13' into regulator-linusMark Brown
2021-06-23Merge series "Support ROCKCHIP SPI new feature" from Jon Lin ↵Mark Brown
<jon.lin@rock-chips.com>: Changes in v10: - The internal CS inactive function is only supported after VER 0x00110002 Changes in v9: - Conver to use CS GPIO description Changes in v8: - There is a problem with the version 7 mail format. resend it Changes in v7: - Fall back "rockchip,rv1126-spi" to "rockchip,rk3066-spi" Changes in v6: - Consider to compatibility, the "rockchip,rk3568-spi" is removed in Series-changes v5, so the commit massage should also remove the corresponding information Changes in v5: - Change to leave one compatible id rv1126, and rk3568 is compatible with rv1126 Changes in v4: - Adjust the order patches - Simply commit massage like redundancy "application" content Changes in v3: - Fix compile error which is find by Sascha in [v2,2/8] Jon Lin (6): dt-bindings: spi: spi-rockchip: add description for rv1126 spi: rockchip: add compatible string for rv1126 spi: rockchip: Set rx_fifo interrupt waterline base on transfer item spi: rockchip: Wait for STB status in slave mode tx_xfer spi: rockchip: Support cs-gpio spi: rockchip: Support SPI_CS_HIGH .../devicetree/bindings/spi/spi-rockchip.yaml | 1 + drivers/spi/spi-rockchip.c | 55 ++++++++++++++----- 2 files changed, 41 insertions(+), 15 deletions(-) -- 2.17.1
2021-06-23spi: spi-sh-msiof: : use proper DMAENGINE API for terminationWolfram Sang
dmaengine_terminate_all() is deprecated in favor of explicitly saying if it should be sync or async. Here, we want dmaengine_terminate_sync() because there is no other synchronization code in the driver to handle an async case. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20210623095843.3228-3-wsa+renesas@sang-engineering.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: spi-rspi: : use proper DMAENGINE API for terminationWolfram Sang
dmaengine_terminate_all() is deprecated in favor of explicitly saying if it should be sync or async. Here, we want dmaengine_terminate_sync() because there is no other synchronization code in the driver to handle an async case. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20210623095843.3228-2-wsa+renesas@sang-engineering.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23arm64: tlb: fix the TTL value of tlb_get_levelZhenyu Ye
The TTL field indicates the level of page table walk holding the *leaf* entry for the address being invalidated. But currently, the TTL field may be set to an incorrent value in the following stack: pte_free_tlb __pte_free_tlb tlb_remove_table tlb_table_invalidate tlb_flush_mmu_tlbonly tlb_flush In this case, we just want to flush a PTE page, but the tlb->cleared_pmds is set and we get tlb_level = 2 in the tlb_get_level() function. This may cause some unexpected problems. This patch set the TTL field to 0 if tlb->freed_tables is set. The tlb->freed_tables indicates page table pages are freed, not the leaf entry. Cc: <stable@vger.kernel.org> # 5.9.x Fixes: c4ab2cbc1d87 ("arm64: tlb: Set the TTL field in flush_tlb_range") Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: ZhuRui <zhurui3@huawei.com> Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/b80ead47-1f88-3a00-18e1-cacc22f54cc4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-23spi: spi-rockchip: add description for rv1126Jon Lin
The description below will be used for rv1126.dtsi or compatible one in the future Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104800.19088-2-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: rockchip: Support SPI_CS_HIGHJon Lin
1.Add standard spi-cs-high support 2.Refer to spi-controller.yaml for details Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104848.19539-2-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: rockchip: Support cs-gpioJon Lin
1.Add standard cs-gpio support 2.Refer to spi-controller.yaml for details Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104848.19539-1-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: rockchip: Wait for STB status in slave mode tx_xferJon Lin
After ROCKCHIP_SPI_VER2_TYPE2, SR->STB is a more accurate judgment bit for spi slave transmition. Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104800.19088-5-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: rockchip: Set rx_fifo interrupt waterline base on transfer itemJon Lin
The error here is to calculate the width as 8 bits. In fact, 16 bits should be considered. Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104800.19088-4-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23spi: rockchip: add compatible string for rv1126Jon Lin
Add compatible string for rv1126 for potential applications. Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://lore.kernel.org/r/20210621104800.19088-3-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23regulator: bd9576: Fix uninitializes variable may_have_irqsColin Ian King
The boolean variable may_have_irqs is not ininitialized and is only being set to true in the case where chip is ROHM_CHIP_TYPE_BD9576. Fix this by ininitialized may_have_irqs to false. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: e7bf1fa58c46 ("regulator: bd9576: Support error reporting") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/20210622144730.22821-1-colin.king@canonical.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23regulator: max8893: Select REGMAP_I2C to fix build errorAxel Lin
Fix build error if REGMAP_I2C is not set. Signed-off-by: Axel Lin <axel.lin@ingics.com> Link: https://lore.kernel.org/r/20210622141526.472175-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23regulator: da9052: Ensure enough delay time for .set_voltage_time_selAxel Lin
Use DIV_ROUND_UP to prevent truncation by integer division issue. This ensures we return enough delay time. Also fix returning negative value when new_sel < old_sel. Signed-off-by: Axel Lin <axel.lin@ingics.com> Link: https://lore.kernel.org/r/20210618141412.4014912-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23Merge branch 'topic/ppc-kvm' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes
2021-06-23spi: spi-sun6i: Fix chipselect/clock bugMirko Vogt
The current sun6i SPI implementation initializes the transfer too early, resulting in SCK going high before the transfer. When using an additional (gpio) chipselect with sun6i, the chipselect is asserted at a time when clock is high, making the SPI transfer fail. This is due to SUN6I_GBL_CTL_BUS_ENABLE being written into SUN6I_GBL_CTL_REG at an early stage. Moving that to the transfer function, hence, right before the transfer starts, mitigates that problem. Fixes: 3558fe900e8af (spi: sunxi: Add Allwinner A31 SPI controller driver) Signed-off-by: Mirko Vogt <mirko-dev|linux@nanl.de> Signed-off-by: Ralf Schlatterbeck <rsc@runtux.com> Link: https://lore.kernel.org/r/20210614144507.y3udezjfbko7eavv@runtux.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23regulator: mt6358: Fix vdram2 .vsel_maskHsin-Hsiung Wang
The valid vsel value are 0 and 12, so the .vsel_mask should be 0xf. Signed-off-by: Hsin-Hsiung Wang <hsin-hsiung.wang@mediatek.com> Reviewed-by: Axel Lin <axel.lin@ingics.com> Link: https://lore.kernel.org/r/1624424169-510-1-git-send-email-hsin-hsiung.wang@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-23x86/sev: Use "SEV: " prefix for messages from sev.cJoerg Roedel
The source file has been renamed froms sev-es.c to sev.c, but the messages are still prefixed with "SEV-ES: ". Change that to "SEV: " to make it consistent. Fixes: e759959fe3b8 ("x86/sev-es: Rename sev-es.{ch} to sev.{ch}") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210622144825.27588-4-joro@8bytes.org
2021-06-23x86/sev: Add defines for GHCB version 2 MSR protocol requestsBrijesh Singh
Add the necessary defines for supporting the GHCB version 2 protocol. This includes defines for: - MSR-based AP hlt request/response - Hypervisor Feature request/response This is the bare minimum of requests that need to be supported by a GHCB version 2 implementation. There are more requests in the specification, but those depend on Secure Nested Paging support being available. These defines are shared between SEV host and guest support. [ bp: Fold in https://lkml.kernel.org/r/20210622144825.27588-2-joro@8bytes.org too. Simplify the brewing macro maze into readability. ] Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YNLXQIZ5e1wjkshG@8bytes.org
2021-06-23KVM: s390: allow facility 192 (vector-packed-decimal-enhancement facility 2)Christian Borntraeger
pass through newer vector instructions if vector support is enabled. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-06-23KVM: s390: gen_facilities: allow facilities 165, 193, 194 and 196Christian Borntraeger
This enables the NNPA, BEAR enhancement,reset DAT protection and processor activity counter facilities via the cpu model. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-06-23KVM: s390: get rid of register asm usageHeiko Carstens
Using register asm statements has been proven to be very error prone, especially when using code instrumentation where gcc may add function calls, which clobbers register contents in an unexpected way. Therefore get rid of register asm statements in kvm code, even though there is currently nothing wrong with them. This way we know for sure that this bug class won't be introduced here. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20210621140356.1210771-1-hca@linux.ibm.com [borntraeger@de.ibm.com: checkpatch strict fix] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-06-22scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART)Christoph Hellwig
While the disk state has nothing to do with partitions, BLKRRPART is used to force a full revalidate after things like a disk format for historical reasons. Restore that behavior. Link: https://lore.kernel.org/r/20210617115504.1732350-1-hch@lst.de Fixes: 471bd0af544b ("sd: use bdev_check_media_change") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-22module: limit enabling module.sig_enforceMimi Zohar
Irrespective as to whether CONFIG_MODULE_SIG is configured, specifying "module.sig_enforce=1" on the boot command line sets "sig_enforce". Only allow "sig_enforce" to be set when CONFIG_MODULE_SIG is configured. This patch makes the presence of /sys/module/module/parameters/sig_enforce dependent on CONFIG_MODULE_SIG=y. Fixes: fda784e50aac ("module: export module signature enforcement status") Reported-by: Nayna Jain <nayna@linux.ibm.com> Tested-by: Mimi Zohar <zohar@linux.ibm.com> Tested-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-22btrfs: remove unused btrfs_fs_info::total_pinnedNikolay Borisov
This got added 14 years ago in 324ae4df00fd ("Btrfs: Add block group pinned accounting back") but it was not ever used. Subsequently its usage got gradually removed in 8790d502e440 ("Btrfs: Add support for mirroring across drives") and 11833d66be94 ("Btrfs: improve async block group caching"). Let's remove it for good! Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-22drm/kmb: Fix error return code in kmb_hw_init()Zhen Lei
When the call to platform_get_irq() to obtain the IRQ of the lcd fails, the returned error code should be propagated. However, we currently do not explicitly assign this error code to 'ret'. As a result, 0 was incorrectly returned. Fixes: 7f7b96a8a0a1 ("drm/kmb: Add support for KeemBay Display") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210513134639.6541-1-thunder.leizhen@huawei.com
2021-06-22Revert "PCI: PM: Do not read power state in pci_enable_device_flags()"Rafael J. Wysocki
Revert commit 4514d991d992 ("PCI: PM: Do not read power state in pci_enable_device_flags()") that is reported to cause PCI device initialization issues on some systems. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213481 Link: https://lore.kernel.org/linux-acpi/YNDoGICcg0V8HhpQ@eldamar.lan Reported-by: Michael <phyre@rogers.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Fixes: 4514d991d992 ("PCI: PM: Do not read power state in pci_enable_device_flags()") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-22locking/lockdep: Correct the description error for check_redundant()Xiongwei Song
If there is no matched result, check_redundant() will return BFS_RNOMATCH. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lkml.kernel.org/r/20210618130230.123249-1-sxwjean@me.com
2021-06-22futex: Provide FUTEX_LOCK_PI2 to support clock selectionThomas Gleixner
The FUTEX_LOCK_PI futex operand uses a CLOCK_REALTIME based absolute timeout since it was implemented, but it does not require that the FUTEX_CLOCK_REALTIME flag is set, because that was introduced later. In theory as none of the user space implementations can set the FUTEX_CLOCK_REALTIME flag on this operand, it would be possible to creatively abuse it and make the meaning invers, i.e. select CLOCK_REALTIME when not set and CLOCK_MONOTONIC when set. But that's a nasty hackery. Another option would be to have a new FUTEX_CLOCK_MONOTONIC flag only for FUTEX_LOCK_PI, but that's also awkward because it does not allow libraries to handle the timeout clock selection consistently. So provide a new FUTEX_LOCK_PI2 operand which implements the timeout semantics which the other operands use and leave FUTEX_LOCK_PI alone. Reported-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210422194705.440773992@linutronix.de
2021-06-22futex: Prepare futex_lock_pi() for runtime clock selectionThomas Gleixner
futex_lock_pi() is the only futex operation which cannot select the clock for timeouts (CLOCK_MONOTONIC/CLOCK_REALTIME). That's inconsistent and there is no particular reason why this cannot be supported. This was overlooked when CLOCK_REALTIME_FLAG was introduced and unfortunately not reported when the inconsistency was discovered in glibc. Prepare the function and enforce the CLOCK_REALTIME_FLAG on FUTEX_LOCK_PI so that a new FUTEX_LOCK_PI2 can implement it correctly. Reported-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210422194705.338657741@linutronix.de
2021-06-22lockdep/selftest: Remove wait-type RCU_CALLBACK testsPeter Zijlstra
The problem is that rcu_callback_map doesn't have wait_types defined, and doing so would make it indistinguishable from SOFTIRQ in any case. Remove it. Fixes: 9271a40d2a14 ("lockdep/selftest: Add wait context selftests") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210617190313.384290291@infradead.org
2021-06-22lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTINGPeter Zijlstra
When PROVE_RAW_LOCK_NESTING=y many of the selftests FAILED because HARDIRQ context is out-of-bounds for spinlocks. Instead make the default hardware context the threaded hardirq context, which preserves the old locking rules. The wait-type specific locking selftests will have a non-threaded HARDIRQ variant. Fixes: de8f5e4f2dc1 ("lockdep: Introduce wait-type checks") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210617190313.322096283@infradead.org
2021-06-22lockdep: Fix wait-type for empty stackPeter Zijlstra
Even the very first lock can violate the wait-context check, consider the various IRQ contexts. Fixes: de8f5e4f2dc1 ("lockdep: Introduce wait-type checks") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210617190313.256987481@infradead.org
2021-06-22locking/selftests: Add a selftest for check_irq_usage()Boqun Feng
Johannes Berg reported a lockdep problem which could be reproduced by the special test case introduced in this patch, so add it. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-5-boqun.feng@gmail.com
2021-06-22lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()Boqun Feng
In the step #3 of check_irq_usage(), we seach backwards to find a lock whose usage conflicts the usage of @target_entry1 on safe/unsafe. However, we should only keep the irq-unsafe usage of @target_entry1 into consideration, because it could be a case where a lock is hardirq-unsafe but soft-safe, and in check_irq_usage() we find it because its hardirq-unsafe could result into a hardirq-safe-unsafe deadlock, but currently since we don't filter out the other usage bits, so we may find a lock dependency path softirq-unsafe -> softirq-safe, which in fact doesn't cause a deadlock. And this may cause misleading lockdep splats. Fix this by only keeping LOCKF_ENABLED_IRQ_ALL bits when we try the backwards search. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-4-boqun.feng@gmail.com
2021-06-22locking/lockdep: Remove the unnecessary trace savingBoqun Feng
In print_bad_irq_dependency(), save_trace() is called to set the ->trace for @prev_root as the current call trace, however @prev_root corresponds to the the held lock, which may not be acquired in current call trace, therefore it's wrong to use save_trace() to set ->trace of @prev_root. Moreover, with our adjustment of printing backwards dependency path, the ->trace of @prev_root is unncessary, so remove it. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-3-boqun.feng@gmail.com
2021-06-22locking/lockdep: Fix the dep path printing for backwards BFSBoqun Feng
We use the same code to print backwards lock dependency path as the forwards lock dependency path, and this could result into incorrect printing because for a backwards lock_list ->trace is not the call trace where the lock of ->class is acquired. Fix this by introducing a separate function on printing the backwards dependency path. Also add a few comments about the printing while we are at it. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-2-boqun.feng@gmail.com
2021-06-22sched/uclamp: Fix uclamp_tg_restrict()Qais Yousef
Now cpu.uclamp.min acts as a protection, we need to make sure that the uclamp request of the task is within the allowed range of the cgroup, that is it is clamp()'ed correctly by tg->uclamp[UCLAMP_MIN] and tg->uclamp[UCLAMP_MAX]. As reported by Xuewen [1] we can have some corner cases where there's inversion between uclamp requested by task (p) and the uclamp values of the taskgroup it's attached to (tg). Following table demonstrates 2 corner cases: | p | tg | effective -----------+-----+------+----------- CASE 1 -----------+-----+------+----------- uclamp_min | 60% | 0% | 60% -----------+-----+------+----------- uclamp_max | 80% | 50% | 50% -----------+-----+------+----------- CASE 2 -----------+-----+------+----------- uclamp_min | 0% | 30% | 30% -----------+-----+------+----------- uclamp_max | 20% | 50% | 20% -----------+-----+------+----------- With this fix we get: | p | tg | effective -----------+-----+------+----------- CASE 1 -----------+-----+------+----------- uclamp_min | 60% | 0% | 50% -----------+-----+------+----------- uclamp_max | 80% | 50% | 50% -----------+-----+------+----------- CASE 2 -----------+-----+------+----------- uclamp_min | 0% | 30% | 30% -----------+-----+------+----------- uclamp_max | 20% | 50% | 30% -----------+-----+------+----------- Additionally uclamp_update_active_tasks() must now unconditionally update both UCLAMP_MIN/MAX because changing the tg's UCLAMP_MAX for instance could have an impact on the effective UCLAMP_MIN of the tasks. | p | tg | effective -----------+-----+------+----------- old -----------+-----+------+----------- uclamp_min | 60% | 0% | 50% -----------+-----+------+----------- uclamp_max | 80% | 50% | 50% -----------+-----+------+----------- *new* -----------+-----+------+----------- uclamp_min | 60% | 0% | *60%* -----------+-----+------+----------- uclamp_max | 80% |*70%* | *70%* -----------+-----+------+----------- [1] https://lore.kernel.org/lkml/CAB8ipk_a6VFNjiEnHRHkUMBKbA+qzPQvhtNjJ_YNzQhqV_o8Zw@mail.gmail.com/ Fixes: 0c18f2ecfcc2 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min") Reported-by: Xuewen Yan <xuewen.yan94@gmail.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210617165155.3774110-1-qais.yousef@arm.com
2021-06-22sched/rt: Fix Deadline utilization tracking during policy changeVincent Donnefort
DL keeps track of the utilization on a per-rq basis with the structure avg_dl. This utilization is updated during task_tick_dl(), put_prev_task_dl() and set_next_task_dl(). However, when the current running task changes its policy, set_next_task_dl() which would usually take care of updating the utilization when the rq starts running DL tasks, will not see a such change, leaving the avg_dl structure outdated. When that very same task will be dequeued later, put_prev_task_dl() will then update the utilization, based on a wrong last_update_time, leading to a huge spike in the DL utilization signal. The signal would eventually recover from this issue after few ms. Even if no DL tasks are run, avg_dl is also updated in __update_blocked_others(). But as the CPU capacity depends partly on the avg_dl, this issue has nonetheless a significant impact on the scheduler. Fix this issue by ensuring a load update when a running task changes its policy to DL. Fixes: 3727e0e ("sched/dl: Add dl_rq utilization tracking") Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/1624271872-211872-3-git-send-email-vincent.donnefort@arm.com
2021-06-22sched/rt: Fix RT utilization tracking during policy changeVincent Donnefort
RT keeps track of the utilization on a per-rq basis with the structure avg_rt. This utilization is updated during task_tick_rt(), put_prev_task_rt() and set_next_task_rt(). However, when the current running task changes its policy, set_next_task_rt() which would usually take care of updating the utilization when the rq starts running RT tasks, will not see a such change, leaving the avg_rt structure outdated. When that very same task will be dequeued later, put_prev_task_rt() will then update the utilization, based on a wrong last_update_time, leading to a huge spike in the RT utilization signal. The signal would eventually recover from this issue after few ms. Even if no RT tasks are run, avg_rt is also updated in __update_blocked_others(). But as the CPU capacity depends partly on the avg_rt, this issue has nonetheless a significant impact on the scheduler. Fix this issue by ensuring a load update when a running task changes its policy to RT. Fixes: 371bf427 ("sched/rt: Add rt_rq utilization tracking") Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/1624271872-211872-2-git-send-email-vincent.donnefort@arm.com
2021-06-23KVM: PPC: Book3S HV: Workaround high stack usage with clangNathan Chancellor
LLVM does not emit optimal byteswap assembly, which results in high stack usage in kvmhv_enter_nested_guest() due to the inlining of byteswap_pt_regs(). With LLVM 12.0.0: arch/powerpc/kvm/book3s_hv_nested.c:289:6: error: stack frame size of 2512 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=] long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) ^ 1 error generated. While this gets fixed in LLVM, mark byteswap_pt_regs() as noinline_for_stack so that it does not get inlined and break the build due to -Werror by default in arch/powerpc/. Not inlining saves approximately 800 bytes with LLVM 12.0.0: arch/powerpc/kvm/book3s_hv_nested.c:290:6: warning: stack frame size of 1728 bytes in function 'kvmhv_enter_nested_guest' [-Wframe-larger-than=] long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) ^ 1 warning generated. Cc: stable@vger.kernel.org # v4.20+ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/1292 Link: https://bugs.llvm.org/show_bug.cgi?id=49610 Link: https://lore.kernel.org/r/202104031853.vDT0Qjqj-lkp@intel.com/ Link: https://gist.github.com/ba710e3703bf45043a31e2806c843ffd Link: https://lore.kernel.org/r/20210621182440.990242-1-nathan@kernel.org