summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-09-30vfio: simplify iommu group allocation for mediated devicesChristoph Hellwig
Reuse the logic in vfio_noiommu_group_alloc to allocate a fake single-device iommu group for mediated devices by factoring out a common function, and replacing the noiommu boolean field in struct vfio_group with an enum to distinguish the three different kinds of groups. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210924155705.4258-8-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-09-30vfio: Move vfio_iommu_group_get() to vfio_register_group_dev()Jason Gunthorpe
We don't need to hold a reference to the group in the driver as well as obtain a reference to the same group as the first thing vfio_register_group_dev() does. Since the drivers never use the group move this all into the core code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210924155705.4258-2-hch@lst.de Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-09-30sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=yArd Biesheuvel
THREAD_INFO_IN_TASK moved the CPU field out of thread_info, but this causes some issues on architectures that define raw_smp_processor_id() in terms of this field, due to the fact that #include'ing linux/sched.h to get at struct task_struct is problematic in terms of circular dependencies. Given that thread_info and task_struct are the same data structure anyway when THREAD_INFO_IN_TASK=y, let's move it back so that having access to the type definition of struct thread_info is sufficient to reference the CPU number of the current task. Note that this requires THREAD_INFO_IN_TASK's definition of the task_thread_info() helper to be updated, as task_cpu() takes a pointer-to-const, whereas task_thread_info() (which is used to generate lvalues as well), needs a non-const pointer. So make it a macro instead. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2021-09-30kvm: irqfd: avoid update unmodified entries of the routingLongpeng(Mike)
All of the irqfds would to be updated when update the irq routing, it's too expensive if there're too many irqfds. However we can reduce the cost by avoid some unnecessary updates. For irqs of MSI type on X86, the update can be saved if the msi values are not change. The vfio migration could receives benefit from this optimi- zaiton. The test VM has 128 vcpus and 8 VF (with 65 vectors enabled), so the VM has more than 520 irqfds. We mesure the cost of the vfio_msix_enable (in QEMU, it would set routing for each irqfd) for each VF, and we can see the total cost can be significantly reduced. Origin Apply this Patch 1st 8 4 2nd 15 5 3rd 22 6 4th 24 6 5th 36 7 6th 44 7 7th 51 8 8th 58 8 Total 258ms 51ms We're also tring to optimize the QEMU part [1], but it's still worth to optimize the KVM to gain more benefits. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04215.html Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Message-Id: <20210827080003.2689-1-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDSJuergen Gross
KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the number of allowed vcpu-ids. This has already led to confusion, so rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more clear Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210913135745.13944-3-jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30KVM: Make kvm_make_vcpus_request_mask() use pre-allocated cpu_kick_maskVitaly Kuznetsov
kvm_make_vcpus_request_mask() already disables preemption so just like kvm_make_all_cpus_request_except() it can be switched to using pre-allocated per-cpu cpumasks. This allows for improvements for both users of the function: in Hyper-V emulation code 'tlb_flush' can now be dropped from 'struct kvm_vcpu_hv' and kvm_make_scan_ioapic_request_mask() gets rid of dynamic allocation. cpumask_available() checks in kvm_make_vcpu_request() and kvm_kick_many_cpus() can now be dropped as they checks for an impossible condition: kvm_init() makes sure per-cpu masks are allocated. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210903075141.403071-9-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30KVM: Drop 'except' parameter from kvm_make_vcpus_request_mask()Vitaly Kuznetsov
Both remaining callers of kvm_make_vcpus_request_mask() pass 'NULL' for 'except' parameter so it can just be dropped. No functional change intended ©. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210903075141.403071-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30platform/chrome: cros_ec_proto: Add version for ec_commandPrashant Malani
Add a version parameter to cros_ec_command() for callers that may want to specify which version of the host command they would like to use. Signed-off-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20210930022403.3358070-5-pmalani@chromium.org
2021-09-30platform/chrome: cros_ec_proto: Make data pointers voidPrashant Malani
Convert the input and output data pointers for cros_ec_command() to void pointers so that the callers don't have to cast their custom structs to uint8_t *. Signed-off-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20210930022403.3358070-4-pmalani@chromium.org
2021-09-30platform/chrome: cros_usbpd_notify: Move ec_command()Prashant Malani
cros_ec_command() can be used by other modules too. So, move it to a common location and export it. This patch does not introduce any functional changes. Signed-off-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20210930022403.3358070-3-pmalani@chromium.org
2021-09-29arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabledPeter Collingbourne
It is not necessary to write to GCR_EL1 on every kernel entry and exit when HW tag-based KASAN is disabled because the kernel will not execute any IRG instructions in that mode. Since accessing GCR_EL1 can be expensive on some microarchitectures, avoid doing so by moving the access to task switch when HW tag-based KASAN is disabled. Signed-off-by: Peter Collingbourne <pcc@google.com> Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://linux-review.googlesource.com/id/I78e90d60612a94c24344526f476ac4ff216e10d2 Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210924010655.2886918-1-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29swiotlb: Support aligned swiotlb buffersDavid Stevens
Add an argument to swiotlb_tbl_map_single that specifies the desired alignment of the allocated buffer. This is used by dma-iommu to ensure the buffer is aligned to the iova granule size when using swiotlb with untrusted sub-granule mappings. This addresses an issue where adjacent slots could be exposed to the untrusted device if IO_TLB_SIZE < iova granule < PAGE_SIZE. Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210929023300.335969-7-stevensd@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-29net: phy: micrel: Add support for LAN8804 PHYHoratiu Vultur
The LAN8804 PHY has same features as that of LAN8814 PHY except that it doesn't support 1588, SyncE or Q-USGMII. This PHY is found inside the LAN966X switches. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28bpf: Replace callers of BPF_CAST_CALL with proper function typedefKees Cook
In order to keep ahead of cases in the kernel where Control Flow Integrity (CFI) may trip over function call casts, enabling -Wcast-function-type is helpful. To that end, BPF_CAST_CALL causes various warnings and is one of the last places in the kernel triggering this warning. For actual function calls, replace BPF_CAST_CALL() with a typedef, which captures the same details about the given function pointers. This change results in no object code difference. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://github.com/KSPP/linux/issues/20 Link: https://lore.kernel.org/lkml/CAEf4Bzb46=-J5Fxc3mMZ8JQPtK1uoE0q6+g6WPz53Cvx=CBEhw@mail.gmail.com Link: https://lore.kernel.org/bpf/20210928230946.4062144-3-keescook@chromium.org
2021-09-28bpf: Replace "want address" users of BPF_CAST_CALL with BPF_CALL_IMMKees Cook
In order to keep ahead of cases in the kernel where Control Flow Integrity (CFI) may trip over function call casts, enabling -Wcast-function-type is helpful. To that end, BPF_CAST_CALL causes various warnings and is one of the last places in the kernel triggering this warning. Most places using BPF_CAST_CALL actually just want a void * to perform math on. It's not actually performing a call, so just use a different helper to get the void *, by way of the new BPF_CALL_IMM() helper, which can clean up a common copy/paste idiom as well. This change results in no object code difference. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://github.com/KSPP/linux/issues/20 Link: https://lore.kernel.org/lkml/CAEf4Bzb46=-J5Fxc3mMZ8JQPtK1uoE0q6+g6WPz53Cvx=CBEhw@mail.gmail.com Link: https://lore.kernel.org/bpf/20210928230946.4062144-2-keescook@chromium.org
2021-09-28bus/fsl-mc: Add generic implementation for open/reset/close commandsDiana Craciun
The open/reset/close commands format is similar for all objects. Currently there are multiple implementations for these commands scattered through various drivers. The code is cavsi-identical. Create a generic implementation for the open/reset/close commands. One of the consumer will be the VFIO driver which needs to be able to reset a device. Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Reviewed-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Link: https://lore.kernel.org/r/20210922110530.24736-1-diana.craciun@oss.nxp.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-09-28PCI/ACPI: Remove OSC_PCI_SUPPORT_MASKS and OSC_PCI_CONTROL_MASKSJoerg Roedel
These masks are only used internally in the PCI Host Bridge _OSC negotiation code, which already makes sure nothing outside of these masks is set. Remove the masks and simplify the code. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20210824122054.29481-2-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
2021-09-28IB/mlx5: Enable UAR to have DevX UIDMeir Lichtinger
UID field was added to alloc_uar and dealloc_uar PRM command, to specify DevX UID for UAR. This change enables firmware validating user access to its own UAR resources. For the kernel allocated UARs the UID will stay 0 as of today. Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-09-28net/mlx5: Add uid field to UAR allocation structuresMeir Lichtinger
Add uid field to mlx5_ifc_alloc_uar_in_bits and mlx5_ifc_dealloc_uar_out_bits structs. This field will be used by FW to manage UAR according to uid. Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-09-28ieee80211: Add new A-MPDU factor macro for HE 6 GHz peer capsPradeep Kumar Chitrapu
Add IEEE80211_HE_6GHZ_MAX_AMPDU_FACTOR as per IEEE Std 802.11ax-2021, 9.4.2.263 to use for peer max A-MPDU factor in 6 GHz band. Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org> Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210913175510.193005-1-jouni@codeaurora.org
2021-09-28octeontx2-pf: Use hardware register for CQE countGeetha sowjanya
Current driver uses software CQ head pointer to poll on CQE header in memory to determine if CQE is valid. Software needs to make sure, that the reads of the CQE do not get re-ordered so much that it ends up with an inconsistent view of the CQE. To ensure that DMB barrier after read to first CQE cacheline and before reading of the rest of the CQE is needed. But having barrier for every CQE read will impact the performance, instead use hardware CQ head and tail pointers to find the valid number of CQEs. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2021-09-28 The following pull-request contains BPF updates for your *net* tree. We've added 10 non-merge commits during the last 14 day(s) which contain a total of 11 files changed, 139 insertions(+), 53 deletions(-). The main changes are: 1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk. 2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh. 3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann. 4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi. 5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer. 6) Fix return value handling for struct_ops BPF programs, from Hou Tao. 7) Various fixes to BPF selftests, from Jiri Benc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> ,
2021-09-27Merge tag '20210927135559.738-6-srinivas.kandagatla@linaro.org' into ↵Bjorn Andersson
drivers-for-5.16 v5.15-rc1 + 20210927135559.738-[23456]-srinivas.kandagatla@linaro.org This immutable branch is based on v5.15-rc1 and contains the following patches extending the existig APR driver to also implement GPR: 20210927135559.738-2-srinivas.kandagatla@linaro.org 20210927135559.738-3-srinivas.kandagatla@linaro.org 20210927135559.738-4-srinivas.kandagatla@linaro.org 20210927135559.738-5-srinivas.kandagatla@linaro.org 20210927135559.738-6-srinivas.kandagatla@linaro.org
2021-09-27soc: qcom: apr: Add GPR supportSrinivas Kandagatla
Qualcomm Generic Packet router aka GPR is the IPC mechanism found in AudioReach next generation signal processing framework to perform command and response messages between various processors. GPR has concepts of static and dynamic port, all static services like APM (Audio Processing Manager), PRM (Proxy resource manager) have fixed port numbers where as dynamic services like graphs have dynamic port numbers which are allocated at runtime. All GPR packet messages will have source and destination domain and port along with opcode and payload. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210927135559.738-6-srinivas.kandagatla@linaro.org
2021-09-27soc: qcom: apr: make code more reuseableSrinivas Kandagatla
APR and other packet routers like GPR are pretty much same and interact with other drivers in similar way. Ex: GPR ports can be considered as APR services, only difference is they are allocated dynamically. Other difference is packet layout, which should not matter with the apis abstracted. Apart from this the rest of the functionality is pretty much identical across APR and GPR. Make the apr code more reusable by abstracting it service level, rather than device level so that we do not need to write new drivers for other new packet routers like GPR. This patch is in preparation to add GPR support to this driver. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20210927135559.738-4-srinivas.kandagatla@linaro.org
2021-09-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "A bit late... I got sidetracked by back-from-vacation routines and conferences. But most of these patches are already a few weeks old and things look more calm on the mailing list than what this pull request would suggest. x86: - missing TLB flush - nested virtualization fixes for SMM (secure boot on nested hypervisor) and other nested SVM fixes - syscall fuzzing fixes - live migration fix for AMD SEV - mirror VMs now work for SEV-ES too - fixes for reset - possible out-of-bounds access in IOAPIC emulation - fix enlightened VMCS on Windows 2022 ARM: - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms Generic: - KCSAN fixes selftests: - random fixes, mostly for clang compilation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) selftests: KVM: Explicitly use movq to read xmm registers selftests: KVM: Call ucall_init when setting up in rseq_test KVM: Remove tlbs_dirty KVM: X86: Synchronize the shadow pagetable before link it KVM: X86: Fix missed remote tlb flush in rmap_write_protect() KVM: x86: nSVM: don't copy virt_ext from vmcb12 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0 KVM: x86: nSVM: restore int_vector in svm_clear_vintr kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[] KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode KVM: x86: reset pdptrs_from_userspace when exiting smm KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit KVM: nVMX: Filter out all unsupported controls when eVMCS was activated KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs KVM: Clean up benign vcpu->cpu data races when kicking vCPUs ...
2021-09-27soc: qcom: aoss: Expose send for generic usecaseDeepak Kumar Singh
Not all upcoming usecases will have an interface to allow the aoss driver to hook onto. Expose the send api and create a get function to enable drivers to send their own messages to aoss. Signed-off-by: Chris Lew <clew@codeaurora.org> Signed-off-by: Deepak Kumar Singh <deesin@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/1630420228-31075-2-git-send-email-deesin@codeaurora.org
2021-09-27nvdimm/pmem: move dax_attribute_group from dax to pmemChristoph Hellwig
dax_attribute_group is only used by the pmem driver, and can avoid the completely pointless lookup by the disk name if moved there. This leaves just a single caller of dax_get_by_host, so move dax_get_by_host into the same ifdef block as that caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20210922173431.2454024-3-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-27PCI: ACPI: Drop acpi_pci_busRafael J. Wysocki
The acpi_pci_bus structure was used primarily for running acpi_pci_find_companion() during PCI device objects registration, but after commit 375553a93201 ("PCI: Setup ACPI fwnode early and at the same time with OF") that function is called by pci_setup_device() via pci_set_acpi_fwnode(), which happens before calling pci_device_add() on the new PCI device object, so its ACPI companion has been set already when acpi_device_notify() runs and it will never call ->find_companion() from acpi_pci_bus. For this reason, modify acpi_device_notify() and acpi_device_notify_remove() to call pci_acpi_setup() and pci_acpi_cleanup(), respectively, directly on PCI device objects and drop acpi_pci_bus altogether. While at it, notice that pci_acpi_setup() and pci_acpi_cleanup() can obtain the ACPI companion pointer, which is guaranteed to not be NULL, from their callers and modify them to work that way so as to reduce the number of redundant checks somewhat. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Ferry Toth <fntoth@gmail.com>
2021-09-27Merge 5.15-rc3 into tty-nextGreg Kroah-Hartman
We need the tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-27leds: trigger: use RCU to protect the led_cdevs listJohannes Berg
Even with the previous commit 27af8e2c90fb ("leds: trigger: fix potential deadlock with libata") to this file, we still get lockdep unhappy, and Boqun explained the report here: https://lore.kernel.org/r/YNA+d1X4UkoQ7g8a@boqun-archlinux Effectively, this means that the read_lock_irqsave() isn't enough here because another CPU might be trying to do a write lock, and thus block the readers. This is all pretty messy, but it doesn't seem right that the LEDs framework imposes some locking requirements on users, in particular we'd have to make the spinlock in the iwlwifi driver always disable IRQs, even if we don't need that for any other reason, just to avoid this deadlock. Since writes to the led_cdevs list are rare (and are done by userspace), just switch the list to RCU. This costs a synchronize_rcu() at removal time so we can ensure things are correct, but that seems like a small price to pay for getting lock-free iterations and no deadlocks (nor any locking requirements imposed on users.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2021-09-27mm: Add folio_pfn()Matthew Wilcox (Oracle)
This is the folio equivalent of page_to_pfn(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/workingset: Convert workingset_activation to take a folioMatthew Wilcox (Oracle)
This function already assumed it was being passed a head page. No real change here, except that thp_nr_pages() compiles away on kernels with THP compiled out while folio_nr_pages() is always present. Also convert page_memcg_rcu() to folio_memcg_rcu(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Add folio_lruvec_relock_irq() and folio_lruvec_relock_irqsave()Matthew Wilcox (Oracle)
These are the folio equivalents of relock_page_lruvec_irq() and folio_lruvec_relock_irqsave(). Also convert page_matches_lruvec() to folio_matches_lruvec(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Add folio_lruvec_lock() and similar functionsMatthew Wilcox (Oracle)
These are the folio equivalents of lock_page_lruvec() and similar functions. Also convert lruvec_memcg_debug() to take a folio. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Add folio_lruvec()Matthew Wilcox (Oracle)
This replaces mem_cgroup_page_lruvec(). All callers converted. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Add folio_memcg_lock() and folio_memcg_unlock()Matthew Wilcox (Oracle)
These are the folio equivalents of lock_page_memcg() and unlock_page_memcg(). lock_page_memcg() and unlock_page_memcg() have too many callers to be easily replaced in a single patch, so reimplement them as wrappers for now to be cleaned up later when enough callers have been converted to use folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath() to folioMatthew Wilcox (Oracle)
The page was only being used for the memcg and to gather trace information, so this is a simple conversion. The only caller of mem_cgroup_track_foreign_dirty() will be converted to folios in a later patch, so doing this now makes that patch simpler. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Convert mem_cgroup_migrate() to take foliosMatthew Wilcox (Oracle)
Convert all callers of mem_cgroup_migrate() to call page_folio() first. They all look like they're using head pages already, but this proves it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Convert mem_cgroup_uncharge() to take a folioMatthew Wilcox (Oracle)
Convert all the callers to call page_folio(). Most of them were already using a head page, but a few of them I can't prove were, so this may actually fix a bug. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Convert mem_cgroup_charge() to take a folioMatthew Wilcox (Oracle)
Convert all callers of mem_cgroup_charge() to call page_folio() on the page they're currently passing in. Many of them will be converted to use folios themselves soon. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm/memcg: Add folio_memcg() and related functionsMatthew Wilcox (Oracle)
memcg information is only stored in the head page, so the memcg subsystem needs to assure that all accesses are to the head page. The first step is converting page_memcg() to folio_memcg(). The callers of page_memcg() and PageMemcgKmem() are not yet ready to be converted to use folios, so retain them as wrappers around folio_memcg() and folio_memcg_kmem(). They will be converted in a later patch set. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm: Add folio_nid()Matthew Wilcox (Oracle)
This is the folio equivalent of page_to_nid(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-09-27mm: Add folio_mapped()Matthew Wilcox (Oracle)
This function is the equivalent of page_mapped(). It is slightly shorter as we do not need to handle the PageTail() case. Reimplement page_mapped() as a wrapper around folio_mapped(). folio_mapped() is 13 bytes smaller than page_mapped(), but the page_mapped() wrapper is 30 bytes, for a net increase of 17 bytes of text. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
2021-09-27fs/netfs: Add folio fscache functionsMatthew Wilcox (Oracle)
Match the page writeback functions by adding folio_start_fscache(), folio_end_fscache(), folio_wait_fscache() and folio_wait_fscache_killable(). Remove set_page_private_2(). Also rewrite the kernel-doc to describe when to use the function rather than what the function does, and include the kernel-doc in the appropriate rst file. Saves 31 bytes of text in netfs_rreq_unlock() due to set_page_fscache() calling page_folio() once instead of three times. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com>
2021-09-27mm/filemap: Add folio private_2 functionsMatthew Wilcox (Oracle)
end_page_private_2() becomes folio_end_private_2(), wait_on_page_private_2() becomes folio_wait_private_2() and wait_on_page_private_2_killable() becomes folio_wait_private_2_killable(). Adjust the fscache equivalents to call page_folio() before calling these functions to avoid adding wrappers. Ends up costing 1 byte of text in ceph & netfs, but the core shrinks by three calls to page_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
2021-09-27mm/filemap: Convert page wait queues to be foliosMatthew Wilcox (Oracle)
Reinforce that page flags are actually in the head page by changing the type from page to folio. Increases the size of cachefiles by two bytes, but the kernel core is unchanged in size. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: David Howells <dhowells@redhat.com>
2021-09-27mm/filemap: Add folio_wait_bit()Matthew Wilcox (Oracle)
Rename wait_on_page_bit() to folio_wait_bit(). We must always wait on the folio, otherwise we won't be woken up due to the tail page hashing to a different bucket from the head page. This commit shrinks the kernel by 770 bytes, mostly due to moving the page waitqueue lookup into folio_wait_bit_common(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
2021-09-27mm/writeback: Add folio_wait_stable()Matthew Wilcox (Oracle)
Move wait_for_stable_page() into the folio compatibility file. folio_wait_stable() avoids a call to compound_head() and is 14 bytes smaller than wait_for_stable_page() was. The net text size grows by 16 bytes as a result of this patch. We can also remove thp_head() as this was the last user. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: David Howells <dhowells@redhat.com>
2021-09-27mm/writeback: Add folio_wait_writeback()Matthew Wilcox (Oracle)
wait_on_page_writeback_killable() only has one caller, so convert it to call folio_wait_writeback_killable(). For the wait_on_page_writeback() callers, add a compatibility wrapper around folio_wait_writeback(). Turning PageWriteback() into folio_test_writeback() eliminates a call to compound_head() which saves 8 bytes and 15 bytes in the two functions. Unfortunately, that is more than offset by adding the wait_on_page_writeback compatibility wrapper for a net increase in text of 7 bytes. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Layton <jlayton@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Howells <dhowells@redhat.com>