summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-11-11KVM: x86/mmu: Properly dereference rcu-protected TDP MMU sptep iteratorSean Christopherson
Wrap the read of iter->sptep in tdp_mmu_map_handle_target_level() with rcu_dereference(). Shadow pages in the TDP MMU, and thus their SPTEs, are protected by rcu. This fixes a Sparse warning at tdp_mmu.c:900:51: warning: incorrect type in argument 1 (different address spaces) expected unsigned long long [usertype] *sptep got unsigned long long [noderef] [usertype] __rcu *[usertype] sptep Fixes: 7158bee4b475 ("KVM: MMU: pass kvm_mmu_page struct to make_spte") Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211103161833.3769487-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ activeMaxim Levitsky
KVM_GUESTDBG_BLOCKIRQ relies on interrupts being injected using standard kvm's inject_pending_event, and not via APICv/AVIC. Since this is a debug feature, just inhibit APICv/AVIC while KVM_GUESTDBG_BLOCKIRQ is in use on at least one vCPU. Fixes: 61e5f69ef0837 ("KVM: x86: implement KVM_GUESTDBG_BLOCKIRQ") Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211108090245.166408-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11kvm: x86: Convert return type of *is_valid_rdpmc_ecx() to boolJim Mattson
These function names sound like predicates, and they have siblings, *is_valid_msr(), which _are_ predicates. Moreover, there are comments that essentially warn that these functions behave unexpectedly. Flip the polarity of the return values, so that they become predicates, and convert the boolean result to a success/failure code at the outer call site. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211105202058.1048757-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: x86: Fix recording of guest steal time / preempted statusDavid Woodhouse
In commit b043138246a4 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed") we switched to using a gfn_to_pfn_cache for accessing the guest steal time structure in order to allow for an atomic xchg of the preempted field. This has a couple of problems. Firstly, kvm_map_gfn() doesn't work at all for IOMEM pages when the atomic flag is set, which it is in kvm_steal_time_set_preempted(). So a guest vCPU using an IOMEM page for its steal time would never have its preempted field set. Secondly, the gfn_to_pfn_cache is not invalidated in all cases where it should have been. There are two stages to the GFN->PFN conversion; first the GFN is converted to a userspace HVA, and then that HVA is looked up in the process page tables to find the underlying host PFN. Correct invalidation of the latter would require being hooked up to the MMU notifiers, but that doesn't happen---so it just keeps mapping and unmapping the *wrong* PFN after the userspace page tables change. In the !IOMEM case at least the stale page *is* pinned all the time it's cached, so it won't be freed and reused by anyone else while still receiving the steal time updates. The map/unmap dance only takes care of the KVM administrivia such as marking the page dirty. Until the gfn_to_pfn cache handles the remapping automatically by integrating with the MMU notifiers, we might as well not get a kernel mapping of it, and use the perfectly serviceable userspace HVA that we already have. We just need to implement the atomic xchg on the userspace address with appropriate exception handling, which is fairly trivial. Cc: stable@vger.kernel.org Fixes: b043138246a4 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <3645b9b889dac6438394194bb5586a46b68d581f.camel@infradead.org> [I didn't entirely agree with David's assessment of the usefulness of the gfn_to_pfn cache, and integrated the outcome of the discussion in the above commit message. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11selftest: KVM: Add intra host migration testsPeter Gonda
Adds testcases for intra host migration for SEV and SEV-ES. Also adds locking test to confirm no deadlock exists. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-6-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11selftest: KVM: Add open sev dev helperPeter Gonda
Refactors out open path support from open_kvm_dev_path_or_exit() and adds new helper for SEV device path. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-5-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: SEV: Add support for SEV-ES intra host migrationPeter Gonda
For SEV-ES to work with intra host migration the VMSAs, GHCB metadata, and other SEV-ES info needs to be preserved along with the guest's memory. Signed-off-by: Peter Gonda <pgonda@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-4-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: SEV: Add support for SEV intra host migrationPeter Gonda
For SEV to work with intra host migration, contents of the SEV info struct such as the ASID (used to index the encryption key in the AMD SP) and the list of memory regions need to be transferred to the target VM. This change adds a commands for a target VMM to get a source SEV VM's sev info. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-3-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: SEV: provide helpers to charge/uncharge misc_cgPaolo Bonzini
Avoid code duplication across all callers of misc_cg_try_charge and misc_cg_uncharge. The resource type for KVM is always derived from sev->es_active, and the quantity is always 1. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: generalize "bugged" VM to "dead" VMPaolo Bonzini
Generalize KVM_REQ_VM_BUGGED so that it can be called even in cases where it is by design that the VM cannot be operated upon. In this case any KVM_BUG_ON should still warn, so introduce a new flag kvm->vm_dead that is separate from kvm->vm_bugged. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: SEV: Refactor out sev_es_state structPeter Gonda
Move SEV-ES vCPU metadata into new sev_es_state struct from vcpu_svm. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-2-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11ALSA: fireworks: add support for Loud Onyx 1200f quirkTakashi Sakamoto
Loud Technologies shipped Onyx 1200f 2008 in its Mackie brand and already discontinued. The model uses component of Fireworks board module as its communication and DSP function. The latest firmware (v4.6.0) has a quirk that tx packet includes wrong value (0x1f) in its dbs field at middle and higher sampling transfer frequency. It brings ALSA fireworks driver discontinuity of data block counter. This commit fixes it by assuming it as a quirk of firmware version 4.6.0. $ cd linux-firewire-tools/src $ python crpp < /sys/bus/firewire/devices/fw1/config_rom ROM header and bus information block ----------------------------------------------------------------- 400 0404b9ef bus_info_length 4, crc_length 4, crc 47599 404 31333934 bus_name "1394" 408 e064a212 irmc 1, cmc 1, isc 1, bmc 0, pmc 0, cyc_clk_acc 100, max_rec 10 (2048), max_rom 2, gen 1, spd 2 (S400) 40c 000ff209 company_id 000ff2 | 410 62550ce0 device_id 0962550ce0 | EUI-64 000ff20962550ce0 root directory ----------------------------------------------------------------- 414 0008088e directory_length 8, crc 2190 418 03000ff2 vendor 41c 8100000f --> descriptor leaf at 458 420 1701200f model 424 81000018 --> descriptor leaf at 484 428 0c008380 node capabilities 42c 8d000003 --> eui-64 leaf at 438 430 d1000005 --> unit directory at 444 434 08000ff2 (immediate value) eui-64 leaf at 438 ----------------------------------------------------------------- 438 000281ae leaf_length 2, crc 33198 43c 000ff209 company_id 000ff2 | 440 62550ce0 device_id 0962550ce0 | EUI-64 000ff20962550ce0 unit directory at 444 ----------------------------------------------------------------- 444 00045d94 directory_length 4, crc 23956 448 1200a02d specifier id: 1394 TA 44c 13010000 version 450 1701200f model 454 8100000c --> descriptor leaf at 484 descriptor leaf at 458 ----------------------------------------------------------------- 458 000a199d leaf_length 10, crc 6557 45c 00000000 textual descriptor 460 00000000 minimal ASCII 464 4d61636b "Mack" 468 69650000 "ie" 46c 00000000 470 00000000 474 00000000 478 00000000 47c 00000000 480 00000000 descriptor leaf at 484 ----------------------------------------------------------------- 484 000a0964 leaf_length 10, crc 2404 488 00000000 textual descriptor 48c 00000000 minimal ASCII 490 4f6e7978 "Onyx" 494 20313230 " 120" 498 30460000 "0F" 49c 00000000 4a0 00000000 4a4 00000000 4a8 00000000 4ac 00000000 Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20211111103015.7498-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-11-11Merge branch 'kvm-guest-sev-migration' into kvm-masterPaolo Bonzini
Add guest api and guest kernel support for SEV live migration. Introduces a new hypercall to notify the host of changes to the page encryption status. If the page is encrypted then it must be migrated through the SEV firmware or a helper VM sharing the key. If page is not encrypted then it can be migrated normally by userspace. This new hypercall is invoked using paravirt_ops. Conflicts: sev_active() replaced by cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT).
2021-11-11x86/kvm: Add kexec support for SEV Live Migration.Ashish Kalra
Reset the host's shared pages list related to kernel specific page encryption status settings before we load a new kernel by kexec. We cannot reset the complete shared pages list here as we need to retain the UEFI/OVMF firmware specific settings. The host's shared pages list is maintained for the guest to keep track of all unencrypted guest memory regions, therefore we need to explicitly mark all shared pages as encrypted again before rebooting into the new guest kernel. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Steve Rutherford <srutherford@google.com> Message-Id: <3e051424ab839ea470f88333273d7a185006754f.1629726117.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11x86/kvm: Add guest support for detecting and enabling SEV Live Migration ↵Ashish Kalra
feature. The guest support for detecting and enabling SEV Live migration feature uses the following logic : - kvm_init_plaform() checks if its booted under the EFI - If not EFI, i) if kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL), issue a wrmsrl() to enable the SEV live migration support - If EFI, i) If kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL), read the UEFI variable which indicates OVMF support for live migration ii) the variable indicates live migration is supported, issue a wrmsrl() to enable the SEV live migration support The EFI live migration check is done using a late_initcall() callback. Also, ensure that _bss_decrypted section is marked as decrypted in the hypervisor's guest page encryption status tracking. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Steve Rutherford <srutherford@google.com> Message-Id: <b4453e4c87103ebef12217d2505ea99a1c3e0f0f.1629726117.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11EFI: Introduce the new AMD Memory Encryption GUID.Ashish Kalra
Introduce a new AMD Memory Encryption GUID which is currently used for defining a new UEFI environment variable which indicates UEFI/OVMF support for the SEV live migration feature. This variable is setup when UEFI/OVMF detects host/hypervisor support for SEV live migration and later this variable is read by the kernel using EFI runtime services to verify if OVMF supports the live migration feature. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Message-Id: <1cea22976d2208f34d47e0c1ce0ecac816c13111.1629726117.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11mm: x86: Invoke hypercall when page encryption status is changedBrijesh Singh
Invoke a hypercall when a memory region is changed from encrypted -> decrypted and vice versa. Hypervisor needs to know the page encryption status during the guest migration. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Borislav Petkov <bp@suse.de> Message-Id: <0a237d5bb08793916c7790a3e653a2cbe7485761.1629726117.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11x86/kvm: Add AMD SEV specific Hypercall3Brijesh Singh
KVM hypercall framework relies on alternative framework to patch the VMCALL -> VMMCALL on AMD platform. If a hypercall is made before apply_alternative() is called then it defaults to VMCALL. The approach works fine on non SEV guest. A VMCALL would causes #UD, and hypervisor will be able to decode the instruction and do the right things. But when SEV is active, guest memory is encrypted with guest key and hypervisor will not be able to decode the instruction bytes. To highlight the need to provide this interface, capturing the flow of apply_alternatives() : setup_arch() call init_hypervisor_platform() which detects the hypervisor platform the kernel is running under and then the hypervisor specific initialization code can make early hypercalls. For example, KVM specific initialization in case of SEV will try to mark the "__bss_decrypted" section's encryption state via early page encryption status hypercalls. Now, apply_alternatives() is called much later when setup_arch() calls check_bugs(), so we do need some kind of an early, pre-alternatives hypercall interface. Other cases of pre-alternatives hypercalls include marking per-cpu GHCB pages as decrypted on SEV-ES and per-cpu apf_reason, steal_time and kvm_apic_eoi as decrypted for SEV generally. Add SEV specific hypercall3, it unconditionally uses VMMCALL. The hypercall will be used by the SEV guest to notify encrypted pages to the hypervisor. This kvm_sev_hypercall3() function is abstracted and used as follows : All these early hypercalls are made through early_set_memory_XX() interfaces, which in turn invoke pv_ops (paravirt_ops). This early_set_memory_XX() -> pv_ops.mmu.notify_page_enc_status_changed() is a generic interface and can easily have SEV, TDX and any other future platform specific abstractions added to it. Currently, pv_ops.mmu.notify_page_enc_status_changed() callback is setup to invoke kvm_sev_hypercall3() in case of SEV. Similarly, in case of TDX, pv_ops.mmu.notify_page_enc_status_changed() can be setup to a TDX specific callback. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <6fd25c749205dd0b1eb492c60d41b124760cc6ae.1629726117.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11selftests/net: udpgso_bench_rx: fix port argumentWillem de Bruijn
The below commit added optional support for passing a bind address. It configures the sockaddr bind arguments before parsing options and reconfigures on options -b and -4. This broke support for passing port (-p) on its own. Configure sockaddr after parsing all arguments. Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11net: wwan: iosm: fix compilation warningM Chetan Kumar
curr_phase is unused. Removed the dead code. Fixes: 8d9be0634181 ("net: wwan: iosm: transport layer support for fw flashing/cd") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11cxgb4: fix eeprom len when diagnostics not implementedRahul Lakkireddy
Ensure diagnostics monitoring support is implemented for the SFF 8472 compliant port module and set the correct length for ethtool port module eeprom read. Fixes: f56ec6766dcf ("cxgb4: Add support for ethtool i2c dump") Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-11drm/ttm: Double check mem_type of BO while evictionxinhui pan
BO might sit in a wrong lru list as there is a small period of memory moving and lru list updating. Lets skip eviction if we hit such mismatch. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211110043149.57554-2-xinhui.pan@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-11-11ata: sata_highbank: Remove unnecessary print function dev_err()Xu Wang
The print function dev_err() is redundant because platform_get_irq() already prints an error. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-11-11libata: fix read log timeout valueDamien Le Moal
Some ATA drives are very slow to respond to READ_LOG_EXT and READ_LOG_DMA_EXT commands issued from ata_dev_configure() when the device is revalidated right after resuming a system or inserting the ATA adapter driver (e.g. ahci). The default 5s timeout (ATA_EH_CMD_DFL_TIMEOUT) used for these commands is too short, causing errors during the device configuration. Ex: ... ata9: SATA max UDMA/133 abar m524288@0x9d200000 port 0x9d200400 irq 209 ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata9.00: ATA-9: XXX XXXXXXXXXXXXXXX, XXXXXXXX, max UDMA/133 ata9.00: qc timeout (cmd 0x2f) ata9.00: Read log page 0x00 failed, Emask 0x4 ata9.00: Read log page 0x00 failed, Emask 0x40 ata9.00: NCQ Send/Recv Log not supported ata9.00: Read log page 0x08 failed, Emask 0x40 ata9.00: 27344764928 sectors, multi 16: LBA48 NCQ (depth 32), AA ata9.00: Read log page 0x00 failed, Emask 0x40 ata9.00: ATA Identify Device Log not supported ata9.00: failed to set xfermode (err_mask=0x40) ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300) ata9.00: configured for UDMA/133 ... The timeout error causes a soft reset of the drive link, followed in most cases by a successful revalidation as that give enough time to the drive to become fully ready to quickly process the read log commands. However, in some cases, this also fails resulting in the device being dropped. Fix this by using adding the ata_eh_revalidate_timeouts entries for the READ_LOG_EXT and READ_LOG_DMA_EXT commands. This defines a timeout increased to 15s, retriable one time. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2021-11-10net: fix premature exit from NAPI state polling in napi_disable()Alexander Lobakin
Commit 719c57197010 ("net: make napi_disable() symmetric with enable") accidentally introduced a bug sometimes leading to a kernel BUG when bringing an iface up/down under heavy traffic load. Prior to this commit, napi_disable() was polling n->state until none of (NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC) is set and then always flip them. Now there's a possibility to get away with the NAPIF_STATE_SCHE unset as 'continue' drops us to the cmpxchg() call with an uninitialized variable, rather than straight to another round of the state check. Error path looks like: napi_disable(): unsigned long val, new; /* new is uninitialized */ do { val = READ_ONCE(n->state); /* NAPIF_STATE_NPSVC and/or NAPIF_STATE_SCHED is set */ if (val & (NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC)) { /* true */ usleep_range(20, 200); continue; /* go straight to the condition check */ } new = val | <...> } while (cmpxchg(&n->state, val, new) != val); /* state == val, cmpxchg() writes garbage */ napi_enable(): do { val = READ_ONCE(n->state); BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)); /* 50/50 boom */ <...> while the typical BUG splat is like: [ 172.652461] ------------[ cut here ]------------ [ 172.652462] kernel BUG at net/core/dev.c:6937! [ 172.656914] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 172.661966] CPU: 36 PID: 2829 Comm: xdp_redirect_cp Tainted: G I 5.15.0 #42 [ 172.670222] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021 [ 172.680646] RIP: 0010:napi_enable+0x5a/0xd0 [ 172.684832] Code: 07 49 81 cc 00 01 00 00 4c 89 e2 48 89 d8 80 e6 fb f0 48 0f b1 55 10 48 39 c3 74 10 48 8b 5d 10 f6 c7 04 75 3d f6 c3 01 75 b4 <0f> 0b 5b 5d 41 5c c3 65 ff 05 b8 e5 61 53 48 c7 c6 c0 f3 34 ad 48 [ 172.703578] RSP: 0018:ffffa3c9497477a8 EFLAGS: 00010246 [ 172.708803] RAX: ffffa3c96615a014 RBX: 0000000000000000 RCX: ffff8a4b575301a0 < snip > [ 172.782403] Call Trace: [ 172.784857] <TASK> [ 172.786963] ice_up_complete+0x6f/0x210 [ice] [ 172.791349] ice_xdp+0x136/0x320 [ice] [ 172.795108] ? ice_change_mtu+0x180/0x180 [ice] [ 172.799648] dev_xdp_install+0x61/0xe0 [ 172.803401] dev_xdp_attach+0x1e0/0x550 [ 172.807240] dev_change_xdp_fd+0x1e6/0x220 [ 172.811338] do_setlink+0xee8/0x1010 [ 172.814917] rtnl_setlink+0xe5/0x170 [ 172.818499] ? bpf_lsm_binder_set_context_mgr+0x10/0x10 [ 172.823732] ? security_capable+0x36/0x50 < snip > Fix this by replacing 'do { } while (cmpxchg())' with an "infinite" for-loop with an explicit break. From v1 [0]: - just use a for-loop to simplify both the fix and the existing code (Eric). [0] https://lore.kernel.org/netdev/20211110191126.1214-1-alexandr.lobakin@intel.com Fixes: 719c57197010 ("net: make napi_disable() symmetric with enable") Suggested-by: Eric Dumazet <edumazet@google.com> # for-loop Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211110195605.1304-1-alexandr.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-10Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Only bug fixes and cleanups for ext4 this merge window. Of note are fixes for the combination of the inline_data and fast_commit fixes, and more accurately calculating when to schedule additional lazy inode table init, especially when CONFIG_HZ is 100HZ" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix error code saved on super block during file system abort ext4: inline data inode fast commit replay fixes ext4: commit inline data during fast commit ext4: scope ret locally in ext4_try_to_trim_range() ext4: remove an unused variable warning with CONFIG_QUOTA=n ext4: fix boolreturn.cocci warnings in fs/ext4/name.c ext4: prevent getting empty inode buffer ext4: move ext4_fill_raw_inode() related functions ext4: factor out ext4_fill_raw_inode() ext4: prevent partial update of the extent blocks ext4: check for inconsistent extents between index and leaf block ext4: check for out-of-order index extents in ext4_valid_extent_entries() ext4: convert from atomic_t to refcount_t on ext4_io_end->count ext4: refresh the ext4_ext_path struct after dropping i_data_sem. ext4: ensure enough credits in ext4_ext_shift_path_extents ext4: correct the left/middle/right debug message for binsearch ext4: fix lazy initialization next schedule time computation in more granular unit Revert "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
2021-11-10Merge tag 'for-5.16-deadlock-fix-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "Fix for a deadlock when direct/buffered IO is done on a mmaped file and a fault happens (details in the patch). There's a fstest generic/647 that triggers the problem and makes testing hard" * tag 'for-5.16-deadlock-fix-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix deadlock due to page faults during direct IO reads and writes
2021-11-10Merge tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping support for a filehandle format deprecated 20 years ago, and further xdr-related cleanup from Chuck" * tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linux: (26 commits) nfsd4: remove obselete comment nfsd: document server-to-server-copy parameters NFSD:fix boolreturn.cocci warning nfsd: update create verifier comment SUNRPC: Change return value type of .pc_encode SUNRPC: Replace the "__be32 *p" parameter to .pc_encode NFSD: Save location of NFSv4 COMPOUND status SUNRPC: Change return value type of .pc_decode SUNRPC: Replace the "__be32 *p" parameter to .pc_decode SUNRPC: De-duplicate .pc_release() call sites SUNRPC: Simplify the SVC dispatch code path SUNRPC: Capture value of xdr_buf::page_base SUNRPC: Add trace event when alloc_pages_bulk() makes no progress svcrdma: Split svcrmda_wc_{read,write} tracepoints svcrdma: Split the svcrdma_wc_send() tracepoint svcrdma: Split the svcrdma_wc_receive() tracepoint NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment() SUNRPC: xdr_stream_subsegment() must handle non-zero page_bases NFSD: Initialize pointer ni with NULL and not plain integer 0 NFSD: simplify struct nfsfh ...
2021-11-10Merge tag 'nfs-for-5.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Features: - NFSv4.1 can always retrieve and cache the ACCESS mode on OPEN - Optimisations for READDIR and the 'ls -l' style workload - Further replacements of dprintk() with tracepoints and other tracing improvements - Ensure we re-probe NFSv4 server capabilities when the user does a "mount -o remount" Bugfixes: - Fix an Oops in pnfs_mark_request_commit() - Fix up deadlocks in the commit code - Fix regressions in NFSv2/v3 attribute revalidation due to the change_attr_type optimisations - Fix some dentry verifier races - Fix some missing dentry verifier settings - Fix a performance regression in nfs_set_open_stateid_locked() - SUNRPC was sending multiple SYN calls when re-establishing a TCP connection. - Fix multiple NFSv4 issues due to missing sanity checking of server return values - Fix a potential Oops when FREE_STATEID races with an unmount Cleanups: - Clean up the labelled NFS code - Remove unused header <linux/pnfs_osd_xdr.h>" * tag 'nfs-for-5.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (84 commits) NFSv4: Sanity check the parameters in nfs41_update_target_slotid() NFS: Remove the nfs4_label argument from decode_getattr_*() functions NFS: Remove the nfs4_label argument from nfs_setsecurity NFS: Remove the nfs4_label argument from nfs_fhget() NFS: Remove the nfs4_label argument from nfs_add_or_obtain() NFS: Remove the nfs4_label argument from nfs_instantiate() NFS: Remove the nfs4_label from the nfs_setattrres NFS: Remove the nfs4_label from the nfs4_getattr_res NFS: Remove the f_label from the nfs4_opendata and nfs_openres NFS: Remove the nfs4_label from the nfs4_lookupp_res struct NFS: Remove the label from the nfs4_lookup_res struct NFS: Remove the nfs4_label from the nfs4_link_res struct NFS: Remove the nfs4_label from the nfs4_create_res struct NFS: Remove the nfs4_label from the nfs_entry struct NFS: Create a new nfs_alloc_fattr_with_label() function NFS: Always initialise fattr->label in nfs_fattr_alloc() NFSv4.2: alloc_file_pseudo() takes an open flag, not an f_mode NFS: Don't allocate nfs_fattr on the stack in __nfs42_ssc_open() NFSv4: Remove unnecessary 'minor version' check NFSv4: Fix potential Oops in decode_op_map() ...
2021-11-10Merge branch 'exit-cleanups-for-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exit cleanups from Eric Biederman: "While looking at some issues related to the exit path in the kernel I found several instances where the code is not using the existing abstractions properly. This set of changes introduces force_fatal_sig a way of sending a signal and not allowing it to be caught, and corrects the misuse of the existing abstractions that I found. A lot of the misuse of the existing abstractions are silly things such as doing something after calling a no return function, rolling BUG by hand, doing more work than necessary to terminate a kernel thread, or calling do_exit(SIGKILL) instead of calling force_sig(SIGKILL). In the review a deficiency in force_fatal_sig and force_sig_seccomp where ptrace or sigaction could prevent the delivery of the signal was found. I have added a change that adds SA_IMMUTABLE to change that makes it impossible to interrupt the delivery of those signals, and allows backporting to fix force_sig_seccomp And Arnd found an issue where a function passed to kthread_run had the wrong prototype, and after my cleanup was failing to build." * 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) soc: ti: fix wkup_m3_rproc_boot_thread return type signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV) exit/r8188eu: Replace the macro thread_exit with a simple return 0 exit/rtl8712: Replace the macro thread_exit with a simple return 0 exit/rtl8723bs: Replace the macro thread_exit with a simple return 0 signal/x86: In emulate_vsyscall force a signal instead of calling do_exit signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails exit/syscall_user_dispatch: Send ordinary signals on failure signal: Implement force_fatal_sig exit/kthread: Have kernel threads return instead of calling do_exit signal/s390: Use force_sigsegv in default_trap_handler signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved. signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON signal/sparc: In setup_tsb_params convert open coded BUG into BUG signal/powerpc: On swapcontext failure force SIGSEGV signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL) signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT signal/sparc32: Remove unreachable do_exit in do_sparc_fault ...
2021-11-11Merge tag 'amd-drm-fixes-5.16-2021-11-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-5.16-2021-11-10: amdgpu: - Don't allow partial copy from user for DC debugfs - SRIOV fixes - GFX9 CSB pin count fix - Various IP version check fixes - DP 2.0 fixes - Limit DCN1 MPO fix to DCN1 amdkfd: - SVM fixes - Reset fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211110222536.7527-1-alexander.deucher@amd.com
2021-11-10Merge tag 'kernel.sys.v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull prctl updates from Christian Brauner: "This contains the missing prctl uapi pieces for PR_SCHED_CORE. In order to activate core scheduling the caller is expected to specify the scope of the new core scheduling domain. For example, passing 2 in the 4th argument of prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, <pid>, 2, 0); would indicate that the new core scheduling domain encompasses all tasks in the process group of <pid>. Specifying 0 would only create a core scheduling domain for the thread identified by <pid> and 2 would encompass the whole thread-group of <pid>. Note, the values 0, 1, and 2 correspond to PIDTYPE_PID, PIDTYPE_TGID, and PIDTYPE_PGID. A first version tried to expose those values directly to which I objected because: - PIDTYPE_* is an enum that is kernel internal which we should not expose to userspace directly. - PIDTYPE_* indicates what a given struct pid is used for it doesn't express a scope. But what the 4th argument of PR_SCHED_CORE prctl() expresses is the scope of the operation, i.e. the scope of the core scheduling domain at creation time. So Eugene's patch now simply introduces three new defines PR_SCHED_CORE_SCOPE_THREAD, PR_SCHED_CORE_SCOPE_THREAD_GROUP, and PR_SCHED_CORE_SCOPE_PROCESS_GROUP. They simply express what happens. This has been on the mailing list for quite a while with all relevant scheduler folks Cced. I announced multiple times that I'd pick this up if I don't see or her anyone else doing it. None of this touches proper scheduler code but only concerns uapi so I think this is fine. With core scheduling being quite common now for vm managers (e.g. moving individual vcpu threads into their own core scheduling domain) and container managers (e.g. moving the init process into its own core scheduling domain and letting all created children inherit it) having to rely on raw numbers passed as the 4th argument in prctl() is a bit annoying and everyone is starting to come up with their own defines" * tag 'kernel.sys.v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: uapi/linux/prctl: provide macro definitions for the PR_SCHED_CORE type argument
2021-11-10Merge tag 'pidfd.v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull pidfd updates from Christian Brauner: "Various places in the kernel have picked up pidfds. The two most recent additions have probably been the ability to use pidfds in bpf maps and the usage of pidfds in mm-based syscalls such as process_mrelease() and process_madvise(). The same pattern to turn a pidfd into a struct task exists in two places. One of those places used PIDTYPE_TGID while the other one used PIDTYPE_PID even though it is clearly documented in all pidfd-helpers that pidfds __currently__ only refer to thread-group leaders (subject to change in the future if need be). This isn't a bug per se but has the potential to be one if we allow pidfds to refer to individual threads. If that happens we want to audit all codepaths that make use of them to ensure they can deal with pidfds refering to individual threads. This adds a simple helper to turn a pidfd into a struct task making it easy to grep for such places. Plus, it gets rid of code-duplication" * tag 'pidfd.v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: mm: use pidfd_get_task() pid: add pidfd_get_task() helper
2021-11-10smb3: remove trivial dfs compile warningSteve French
Fix warning caused by recent changes to the dfs code: symbol 'tree_connect_dfs_target' was not declared. Should it be static? Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10cifs: support nested dfs links over reconnectPaulo Alcantara
Mounting a dfs link that has nested links was already supported at mount(2), so make it work over reconnect as well. Make the following case work: * mount //root/dfs/link /mnt -o ... - final share: /server/share * in server settings - change target folder of /root/dfs/link3 to /server/share2 - change target folder of /root/dfs/link2 to /root/dfs/link3 - change target folder of /root/dfs/link to /root/dfs/link2 * mount -o remount,... /mnt - refresh all dfs referrals - mark current connection for failover - cifs_reconnect() reconnects to root server - tree_connect() * checks that /root/dfs/link2 is a link, then chase it * checks that root/dfs/link3 is a link, then chase it * finally tree connect to /server/share2 If the mounted share is no longer accessible and a reconnect had been triggered, the client will retry it from both last referral path (/root/dfs/link3) and original referral path (/root/dfs/link). Any new referral paths found while chasing dfs links over reconnect, it will be updated to TCP_Server_Info::leaf_fullpath, accordingly. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-10smb3: do not error on fsync when readonlySteve French
Linux allows doing a flush/fsync on a file open for read-only, but the protocol does not allow that. If the file passed in on the flush is read-only try to find a writeable handle for the same inode, if that is not possible skip sending the fsync call to the server to avoid breaking the apps. Reported-by: Julian Sikorski <belegdol@gmail.com> Tested-by: Julian Sikorski <belegdol@gmail.com> Suggested-by: Jeremy Allison <jra@samba.org> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-11-11Merge tag 'drm-misc-next-fixes-2021-11-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next Removed the TTM Huge Page functionnality to address a crash, a timeout fix for udl, CONFIG_FB dependency improvements, a fix for a circular locking depency in imx, a NULL pointer dereference fix for virtio, and a naming collision fix for drm/locking. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour
2021-11-10ALSA: hda: fix general protection fault in azx_runtime_idleKai Vehmanen
Fix a corner case between PCI device driver remove callback and runtime PM idle callback. Following sequence of events can happen: - at azx_create, context is allocated with devm_kzalloc() and stored as pci_set_drvdata() - user-space requests to unbind audio driver - dd.c:__device_release_driver() calls PCI remove - pci-driver.c:pci_device_remove() calls the audio driver azx_remove() callback and this is completed - pci-driver.c:pm_runtime_put_sync() leads to a call to rpm_idle() which again calls azx_runtime_idle() - the azx context object, as returned by dev_get_drvdata(), is no longer valid -> access fault in azx_runtime_idle when executing struct snd_card *card = dev_get_drvdata(dev); chip = card->private_data; if (chip->disabled || hda->init_failed) This was discovered by i915_module_load test with 5.15.0 based linux-next tree. Example log caught by i915_module_load test with linux-next https://intel-gfx-ci.01.org/tree/linux-next/ <4> [264.038232] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b73f0: 0000 [#1] PREEMPT SMP NOPTI <4> [264.038248] CPU: 0 PID: 5374 Comm: i915_module_loa Not tainted 5.15.0-next-20211109-gc8109c2ba35e-next-20211109 #1 [...] <4> [264.038267] RIP: 0010:azx_runtime_idle+0x12/0x60 [snd_hda_intel] [...] <4> [264.038355] Call Trace: <4> [264.038359] <TASK> <4> [264.038362] __rpm_callback+0x3d/0x110 <4> [264.038371] rpm_idle+0x27f/0x380 <4> [264.038376] __pm_runtime_idle+0x3b/0x100 <4> [264.038382] pci_device_remove+0x6d/0xa0 <4> [264.038388] device_release_driver_internal+0xef/0x1e0 <4> [264.038395] unbind_store+0xeb/0x120 <4> [264.038400] kernfs_fop_write_iter+0x11a/0x1c0 Fix the issue by setting drvdata to NULL at end of azx_remove(). Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20211110210307.1172004-1-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-11-10afs: Use folios in directory handlingDavid Howells
Convert the AFS directory handling code to use folios. With these changes, afs passes -g quick xfstests. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Jeff Layton <jlayton@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/162877312172.3085614.992850861791211206.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162981154845.1901565.2078707403143240098.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005746215.2472992.8321380998443828308.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584190457.4023316.10544419117563104940.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/CAH2r5mtECQA6K_OGgU=_G8qLY3G-6-jo1odVyF9EK+O2-EWLFg@mail.gmail.com/ # v3 Link: https://lore.kernel.org/r/163649330345.309189.11182522282723655658.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657854055.834781.5800946340537517009.stgit@warthog.procyon.org.uk/ # v5
2021-11-10netfs, 9p, afs, ceph: Use foliosDavid Howells
Convert the netfs helper library to use folios throughout, convert the 9p and afs filesystems to use folios in their file I/O paths and convert the ceph filesystem to use just enough folios to compile. With these changes, afs passes -g quick xfstests. Changes ======= ver #5: - Got rid of folio_end{io,_read,_write}() and inlined the stuff it does instead (Willy decided he didn't want this after all). ver #4: - Fixed a bug in afs_redirty_page() whereby it didn't set the next page index in the loop and returned too early. - Simplified a check in v9fs_vfs_write_folio_locked()[1]. - Undid a change to afs_symlink_readpage()[1]. - Used offset_in_folio() in afs_write_end()[1]. - Changed from using page_endio() to folio_end{io,_read,_write}()[1]. ver #2: - Add 9p foliation. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: kafs-testing@auristor.com cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/YYKa3bfQZxK5/wDN@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/2408234.1628687271@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/162877311459.3085614.10601478228012245108.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162981153551.1901565.3124454657133703341.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005745264.2472992.9852048135392188995.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584187452.4023316.500389675405550116.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/163649328026.309189.1124218109373941936.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657852454.834781.9265101983152100556.stgit@warthog.procyon.org.uk/ # v5
2021-11-10folio: Add a function to get the host inode for a folioDavid Howells
Add a convenience function, folio_inode() that will get the host inode from a folio's mapping. Changes: ver #3: - Fix mistake in function description[2]. ver #2: - Fix contradiction between doc and implementation by disallowing use with swap caches[1]. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: kafs-testing@auristor.com Link: https://lore.kernel.org/r/YST8OcVNy02Rivbm@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/YYKLkBwQdtn4ja+i@casper.infradead.org/ [2] Link: https://lore.kernel.org/r/162880453171.3369675.3704943108660112470.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/162981151155.1901565.7010079316994382707.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005744370.2472992.18324470937328925723.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584184628.4023316.9386282630968981869.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/163649325519.309189.15072332908703129455.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657850401.834781.1031963517399283294.stgit@warthog.procyon.org.uk/ # v5
2021-11-10folio: Add a function to change the private data attached to a folioDavid Howells
Add a function, folio_change_private(), that will change the private data attached to a folio, without the need to twiddle the private bit or the refcount. It assumes that folio_add_private() has already been called on the page. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: kafs-testing@auristor.com Link: https://lore.kernel.org/r/162981149911.1901565.17776700811659843340.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005743485.2472992.5100702469503007023.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584180781.4023316.5037526301198034310.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/163649324326.309189.17817587229450840783.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657848531.834781.14269656212269187893.stgit@warthog.procyon.org.uk/ # v5
2021-11-10Merge tag 'thermal-5.16-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more thermal control updates from Rafael Wysocki: "These fix two issues in the thermal core and one in the int340x thermal driver. Specifics: - Replace pr_warn() with pr_warn_once() in user_space_bind() to reduce kernel log noise (Rafael Wysocki). - Extend the RFIM mailbox interface in the int340x thermal driver to return 64 bit values to allow all values returned by the hardware to be handled correctly (Srinivas Pandruvada). - Fix possible NULL pointer dereferences in the of_thermal_ family of functions (Subbaraman Narayanamurthy)" * tag 'thermal-5.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: Replace pr_warn() with pr_warn_once() in user_space_bind() thermal: Fix NULL pointer dereferences in of_thermal_ functions thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses
2021-11-10Merge tag 'pm-5.16-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These fix three intel_pstate driver regressions, fix locking in the core code suspending and resuming devices during system PM transitions, fix the handling of cpuidle drivers based on runtime PM during system-wide suspend, fix two issues in the operating performance points (OPP) framework and resource-managed helpers to it. Specifics: - Fix two intel_pstate driver regressions related to the HWP interrupt handling added recently (Srinivas Pandruvada). - Fix intel_pstate driver regression introduced during the 5.11 cycle and causing HWP desired performance to be mishandled in some cases when switching driver modes and during system suspend and shutdown (Rafael Wysocki). - Fix system-wide device suspend and resume locking to avoid deadlocks when device objects are deleted during a system-wide PM transition (Rafael Wysocki). - Modify system-wide suspend of devices to prevent cpuidle drivers based on runtime PM from misbehaving during the "no IRQ" phase of it (Ulf Hansson). - Fix return value of _opp_add_static_v2() helper (YueHaibing). - Fix required-opp handle count (Pavankumar Kondeti). - Add resource managed OPP helpers, update dev_pm_opp_attach_genpd(), update their devfreq users, and make minor DT binding change (Dmitry Osipenko)" * tag 'pm-5.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: Avoid calling put_device() under dpm_list_mtx cpufreq: intel_pstate: Clear HWP Status during HWP Interrupt enable cpufreq: intel_pstate: Fix unchecked MSR 0x773 access cpufreq: intel_pstate: Clear HWP desired on suspend/shutdown and offline PM: sleep: Fix runtime PM based cpuidle support dt-bindings: opp: Allow multi-worded OPP entry name opp: Fix return in _opp_add_static_v2() PM / devfreq: tegra30: Check whether clk_round_rate() returns zero rate PM / devfreq: tegra30: Use resource-managed helpers PM / devfreq: Add devm_devfreq_add_governor() opp: Add more resource-managed variants of dev_pm_opp_of_add_table() opp: Change type of dev_pm_opp_attach_genpd(names) argument opp: Fix required-opps phandle array count check
2021-11-10Merge tag 'acpi-5.16-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "These add support for a new ACPI device configuration object called _DSC, fix some issues including one recent regression, add two new items to quirk lists and clean up assorted pieces of code. Specifics: - Add support for new ACPI device configuration object called _DSC ("Deepest State for Configuration") to allow certain devices to be probed without changing their power states, document it and make two drivers use it (Sakari Ailus, Rajmohan Mani). - Fix device wakeup power reference counting broken recently by mistake (Rafael Wysocki). - Drop unused symbol and macros depending on it from acgcc.h (Rafael Wysocki). - Add HP ZHAN 66 Pro to the "no EC wakeup" quirk list (Binbin Zhou). - Add Xiaomi Mi Pad 2 to the backlight quirk list and drop an unused piece of data from all of the list entries (Hans de Goede). - Fix register read accesses handling in the Intel PMIC operation region driver (Hans de Goede). - Clean up static variables initialization in the EC driver (wangzhitong)" * tag 'acpi-5.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Documentation: ACPI: Fix non-D0 probe _DSC object example ACPI: Drop ACPI_USE_BUILTIN_STDARG ifdef from acgcc.h ACPI: PM: Fix device wakeup power reference counting error ACPI: video: use platform backlight driver on Xiaomi Mi Pad 2 ACPI: video: Drop dmi_system_id.ident settings from video_detect_dmi_table[] ACPI: PMIC: Fix intel_pmic_regs_handler() read accesses ACPI: EC: Remove initialization of static variables to false ACPI: EC: Use ec_no_wakeup on HP ZHAN 66 Pro at24: Support probing while in non-zero ACPI D state media: i2c: imx319: Support device probe in non-zero ACPI D state ACPI: Add a convenience function to tell a device is in D0 state Documentation: ACPI: Document _DSC object usage for enum power state i2c: Allow an ACPI driver to manage the device's power state during probe ACPI: scan: Obtain device's desired enumeration power state
2021-11-10Merge tag 'dmaengine-5.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "A bunch of driver updates, no new driver or controller support this time though: - Another pile of idxd updates - pm routines cleanup for at_xdmac driver - Correct handling of callback_result for few drivers - zynqmp_dma driver updates and descriptor management refinement - Hardware handshaking support for dw-axi-dmac - Support for remotely powered controllers in Qcom bam dma - tegra driver updates" * tag 'dmaengine-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (69 commits) dmaengine: ti: k3-udma: Set r/tchan or rflow to NULL if request fail dmaengine: ti: k3-udma: Set bchan to NULL if a channel request fail dmaengine: stm32-dma: avoid 64-bit division in stm32_dma_get_max_width dmaengine: fsl-edma: support edma memcpy dmaengine: idxd: fix resource leak on dmaengine driver disable dmaengine: idxd: cleanup completion record allocation dmaengine: zynqmp_dma: Correctly handle descriptor callbacks dmaengine: xilinx_dma: Correctly handle cyclic descriptor callbacks dmaengine: altera-msgdma: Correctly handle descriptor callbacks dmaengine: at_xdmac: fix compilation warning dmaengine: dw-axi-dmac: Simplify assignment in dma_chan_pause() dmaengine: qcom: bam_dma: Add "powered remotely" mode dt-bindings: dmaengine: bam_dma: Add "powered remotely" mode dmaengine: sa11x0: Mark PM functions as __maybe_unused dmaengine: switch from 'pci_' to 'dma_' API dmaengine: ioat: switch from 'pci_' to 'dma_' API dmaengine: hsu: switch from 'pci_' to 'dma_' API dmaengine: hisi_dma: switch from 'pci_' to 'dma_' API dmaengine: dw: switch from 'pci_' to 'dma_' API dmaengine: dw-edma-pcie: switch from 'pci_' to 'dma_' API ...
2021-11-10ALSA: hda: Free card instance properly at probe errorsTakashi Iwai
The recent change in hda-intel driver to allow repeated probes surfaced a problem that has been hidden until; the probe process in the work calls azx_free() at the error path, and this skips the card free process that eventually releases codec instances. As a result, we get a kernel WARNING like: snd_hda_intel 0000:00:1f.3: Cannot probe codecs, giving up ------------[ cut here ]------------ WARNING: CPU: 14 PID: 186 at sound/hda/hdac_bus.c:73 .... For fixing this, we need to call snd_card_free() instead of azx_free(). Additionally, the device drvdata has to be cleared, as the driver binding itself is still active. Then the PM and other driver callbacks will ignore the procedure. Fixes: c0f1886de7e1 ("ALSA: hda: intel: Allow repeatedly probing on codec configuration errors") Reported-and-tested-by: Scott Branden <scott.branden@broadcom.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/063e2397-7edb-5f48-7b0d-618b938d9dd8@broadcom.com Link: https://lore.kernel.org/r/20211110194633.19098-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-11-10Merge tag 'tag-chrome-platform-for-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform updates from Benson Leung: "cros_ec_typec: - Clean up use of cros_ec_check_features cros_ec_*: - Rename and move cros_ec_pd_command to cros_ec_command, and make changes to cros_ec_typec and cros_ec_proto to use the new common command, reducing duplication. sensorhub: - simplify getting .driver_data in cros_ec_sensors_core and cros_ec_sensorhub misc: - Maintainership change. Enric Balletbo i Serra has moved on from Collabora, so removing him from chrome/platform maintainers. Thanks for all of your hard work maintaining this, Enric, and best of luck to you in your new role! - Add Prashant Malani as driver maintainer for cros_ec_typec.c and cros_usbpd_notify. He was already principal contributor of these drivers" * tag 'tag-chrome-platform-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_proto: Use ec_command for check_features platform/chrome: cros_ec_proto: Use EC struct for features MAINTAINERS: Chrome: Drop Enric Balletbo i Serra platform/chrome: cros_ec_typec: Use cros_ec_command() platform/chrome: cros_ec_proto: Add version for ec_command platform/chrome: cros_ec_proto: Make data pointers void platform/chrome: cros_usbpd_notify: Move ec_command() platform/chrome: cros_usbpd_notify: Rename cros_ec_pd_command() platform/chrome: cros_ec: Fix spelling mistake "responsed" -> "response" platform/chrome: cros_ec_sensorhub: simplify getting .driver_data iio: common: cros_ec_sensors: simplify getting .driver_data platform/chrome: cros-ec-typec: Cleanup use of check_features platform/chrome: cros_ec_proto: Fix check_features ret val MAINTAINERS: Add Prashant's maintainership of cros_ec drivers
2021-11-10Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix double-evaluation of 'pte' macro argument when using 52-bit PAs - Fix signedness of some MTE prctl PR_* constants - Fix kmemleak memory usage by skipping early pgtable allocations - Fix printing of CPU feature register strings - Remove redundant -nostdlib linker flag for vDSO binaries * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: pgtable: make __pte_to_phys/__phys_to_pte_val inline functions arm64: Track no early_pgtable_alloc() for kmemleak arm64: mte: change PR_MTE_TCF_NONE back into an unsigned long arm64: vdso: remove -nostdlib compiler flag arm64: arm64_ftr_reg->name may not be a human-readable string
2021-11-10Merge tag 'arm-fixes-5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "This is one set of fixes for the NXP/FSL DPAA2 drivers, addressing a few minor issues. I received these just after sending out the last v5.15 fixes, and nothing in here seemed urgent enough for a quick follow-up" * tag 'arm-fixes-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: fsl: dpaa2-console: free buffer before returning from dpaa2_console_read soc: fsl: dpio: use the combined functions to protect critical zone soc: fsl: dpio: replace smp_processor_id with raw_smp_processor_id