summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-18Merge tag 'printk-for-5.13-fixup' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fixup from Petr Mladek: "Fix misplaced EXPORT_SYMBOL(vsprintf)" * tag 'printk-for-5.13-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: Move EXPORT_SYMBOL() closer to vprintk definition
2021-06-18Merge tag 'pm-5.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Remove recently added frequency invariance support from the CPPC cpufreq driver, because it has turned out to be problematic and it cannot be fixed properly on time for 5.13 (Viresh Kumar)" * tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "cpufreq: CPPC: Add support for frequency invariance"
2021-06-18Merge tag 'usb-5.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are three small USB fixes for reported problems for 5.13-rc7. They include: - disable autosuspend for a cypress USB hub - fix the battery charger detection for the chipidea driver - fix a kernel panic in the dwc3 driver due to a previous change in 5.13-rc1. All have been in linux-next with no reported problems" * tag 'usb-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: core: hub: Disable autosuspend for Cypress CY7C65632 usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection usb: dwc3: core: fix kernel panic when do reboot
2021-06-18Merge branch 'path-lookup' into docs-nextJonathan Corbet
From Fox Chen: The Path lookup is a very complex subject in VFS. The path-lookup document provides a very detailed guidance to help people understand how path lookup works in the kernel. This document was originally written based on three lwn articles five years ago. As times goes by, some of the content is outdated. This patchset is intended to update the document to make it more relevant to current codebase.
2021-06-18x86/mm: Avoid truncating memblocks for SGX memoryFan Du
tl;dr: Several SGX users reported seeing the following message on NUMA systems: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. This turned out to be the memblock code mistakenly throwing away SGX memory. === Full Changelog === The 'max_pfn' variable represents the highest known RAM address. It can be used, for instance, to quickly determine for which physical addresses there is mem_map[] space allocated. The numa_meminfo code makes an effort to throw out ("trim") all memory blocks which are above 'max_pfn'. SGX memory is not considered RAM (it is marked as "Reserved" in the e820) and is not taken into account by max_pfn. Despite this, SGX memory areas have NUMA affinity and are enumerated in the ACPI SRAT table. The existing SGX code uses the numa_meminfo mechanism to look up the NUMA affinity for its memory areas. In cases where SGX memory was above max_pfn (usually just the one EPC section in the last highest NUMA node), the numa_memblock is truncated at 'max_pfn', which is below the SGX memory. When the SGX code tries to look up the affinity of this memory, it fails and produces an error message: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. and assigns the memory to NUMA node 0. Instead of silently truncating the memory block at 'max_pfn' and dropping the SGX memory, add the truncated portion to 'numa_reserved_meminfo'. This allows the SGX code to later determine the NUMA affinity of its 'Reserved' area. Before, numa_meminfo looked like this (from 'crash'): blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } numa_reserved_meminfo is empty. With this, numa_meminfo looks like this: blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } and numa_reserved_meminfo has an entry for node 1's SGX memory: blk = { start = 0x4000000000, end = 0x4080000000, nid = 0x1 } [ daveh: completely rewrote/reworked changelog ] Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility") Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20210617194657.0A99CB22@viggo.jf.intel.com
2021-06-18Merge tag 'drm-fixes-2021-06-18' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Not much happening in fixes land this week only one PR for two amdgpu powergating fixes was waiting for me, maybe something will show up over the weekend, maybe not. amdgpu: - GFX9 and 10 powergating fixes" * tag 'drm-fixes-2021-06-18' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell. drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.
2021-06-18docs: path-lookup: use bare function() rather than literalsFox Chen
As suggested by Matthew Wilcox and Jonathan Corbet, drop ``...`` literals around function names of this patchset. Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-14-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update symlink descriptionFox Chen
instead of lookup_real()/vfs_create(), i_op->lookup() and i_op->create() will be called directly. update vfs_open() logic should_follow_link is merged into lookup_last() or open_last_lookup() which returns symlink name instead of an integer. Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-13-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update get_link() ->follow_link descriptionFox Chen
get_link() is merged into pick_link(). i_op->follow_link is replaced with i_op->get_link(). get_link() can return ERR_PTR(0) which equals NULL. Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-12-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update WALK_GET, WALK_PUT descFox Chen
WALK_GET is changed to WALK_TRAILING with a different meaning. Here it should be WALK_NOFOLLOW. WALK_PUT dosn't exist, we have WALK_MORE. WALK_PUT == !WALK_MORE And there is not should_follow_link(). Related commits: commit 8c4efe22e7c4 ("namei: invert the meaning of WALK_FOLLOW") commit 1c4ff1a87e46 ("namei: invert WALK_PUT logics") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> [jc: applied language tweaks suggested by Neil] Link: https://lore.kernel.org/r/20210527091618.287093-11-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: no get_link()Fox Chen
no get_link() anymore. we have step_into() and pick_link(). walk_component() will call step_into(), in turn call pick_link, and return symlink name. Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-10-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update i_op->put_link and cookie descriptionFox Chen
No inode->put_link operation anymore. We use delayed_call to deal with link destruction. Cookie has been replaced with struct delayed_call. Related commit: commit fceef393a538 ("switch ->get_link() to delayed_call, kill ->put_link()") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-9-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: i_op->follow_link replaced with i_op->get_linkFox Chen
follow_link has been replaced by get_link() which can be called in RCU mode. see commit: commit 6b2553918d8b ("replace ->follow_link() with new method that could stay in RCU mode") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-8-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: Add macro name to symlink limit descriptionFox Chen
Add macro name MAXSYMLINKS to the symlink limit description, so that it is consistent with path name length description above. Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-7-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: remove filename_mountpointFox Chen
No filename_mountpoint any more see commit: commit 161aff1d93ab ("LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()") Without filename_mountpoint and path_mountpoint(), the numbers should be four & three: "These four correspond roughly to the three path_*() functions" Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-6-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update do_last() partFox Chen
traling_symlink() was merged into lookup_last, do_last(). do_last() has later been split into open_last_lookups() and do_open(). see related commit: commit c5971b8c6354 ("take post-lookup part of do_last() out of loop") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-5-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update path_mountpoint() partFox Chen
path_mountpoint() doesn't exist anymore. Have been folded into path_lookup_at when flag is set with LOOKUP_MOUNTPOINT. Check commit: commit 161aff1d93abf0e ("LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-4-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update path_to_nameidata() partFox Chen
No path_to_namei() anymore, step_into() will be called. Related commit: commit c99687a03a78 ("fold path_to_nameidata() into its only remaining caller") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-3-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: path-lookup: update follow_managed() partFox Chen
No follow_managed() anymore, handle_mounts(), traverse_mounts(), will do the job. see commit 9deed3ebca24 ("new helper: traverse_mounts()") Signed-off-by: Fox Chen <foxhlchen@gmail.com> Reviewed-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20210527091618.287093-2-foxhlchen@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18docs: Makefile: Use CONFIG_SHELL not SHELLKees Cook
Fix think-o about which variable to find the Kbuild-configured shell. This has accidentally worked due to most shells setting $SHELL by default. Fixes: 51e46c7a4007 ("docs, parallelism: Rearrange how jobserver reservations are made") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210617225808.3907377-1-keescook@chromium.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-06-18Merge branch arm64/for-next/caches into kvmarm-master/nextMarc Zyngier
arm64 cache management function cleanup from Fuad Tabba, shared with the arm64 tree. * arm64/for-next/caches: arm64: Rename arm64-internal cache maintenance functions arm64: Fix cache maintenance function comments arm64: sync_icache_aliases to take end parameter instead of size arm64: __clean_dcache_area_pou to take end parameter instead of size arm64: __clean_dcache_area_pop to take end parameter instead of size arm64: __clean_dcache_area_poc to take end parameter instead of size arm64: __flush_dcache_area to take end parameter instead of size arm64: dcache_by_line_op to take end parameter instead of size arm64: __inval_dcache_area to take end parameter instead of size arm64: Fix comments to refer to correct function __flush_icache_range arm64: Move documentation of dcache_by_line_op arm64: assembler: remove user_alt arm64: Downgrade flush_icache_range to invalidate arm64: Do not enable uaccess for invalidate_icache_range arm64: Do not enable uaccess for flush_icache_range arm64: Apply errata to swsusp_arch_suspend_exit arm64: assembler: add conditional cache fixups arm64: assembler: replace `kaddr` with `addr` Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-18PCI: aardvark: Fix kernel panic during PIO transferPali Rohár
Trying to start a new PIO transfer by writing value 0 in PIO_START register when previous transfer has not yet completed (which is indicated by value 1 in PIO_START) causes an External Abort on CPU, which results in kernel panic: SError Interrupt on CPU0, code 0xbf000002 -- SError Kernel panic - not syncing: Asynchronous SError Interrupt To prevent kernel panic, it is required to reject a new PIO transfer when previous one has not finished yet. If previous PIO transfer is not finished yet, the kernel may issue a new PIO request only if the previous PIO transfer timed out. In the past the root cause of this issue was incorrectly identified (as it often happens during link retraining or after link down event) and special hack was implemented in Trusted Firmware to catch all SError events in EL3, to ignore errors with code 0xbf000002 and not forwarding any other errors to kernel and instead throw panic from EL3 Trusted Firmware handler. Links to discussion and patches about this issue: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50 https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/ https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen.fr/ https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541 But the real cause was the fact that during link retraining or after link down event the PIO transfer may take longer time, up to the 1.44s until it times out. This increased probability that a new PIO transfer would be issued by kernel while previous one has not finished yet. After applying this change into the kernel, it is possible to revert the mentioned TF-A hack and SError events do not have to be caught in TF-A EL3. Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marek Behún <kabel@kernel.org> Cc: stable@vger.kernel.org # 7fbcb5da811b ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
2021-06-18PCI: Add AMD RS690 quirk to enable 64-bit DMAMikel Rychliski
Although the AMD RS690 chipset has 64-bit DMA support, BIOS implementations sometimes fail to configure the memory limit registers correctly. The Acer F690GVM mainboard uses this chipset and a Marvell 88E8056 NIC. The sky2 driver programs the NIC to use 64-bit DMA, which will not work: sky2 0000:02:00.0: error interrupt status=0x8 sky2 0000:02:00.0 eth0: tx timeout sky2 0000:02:00.0 eth0: transmit ring 0 .. 22 report=0 done=0 Other drivers required by this mainboard either don't support 64-bit DMA, or have it disabled using driver specific quirks. For example, the ahci driver has quirks to enable or disable 64-bit DMA depending on the BIOS version (see ahci_sb600_enable_64bit() in ahci.c). This ahci quirk matches against the SB600 SATA controller, but the real issue is almost certainly with the RS690 PCI host that it was commonly attached to. To avoid this issue in all drivers with 64-bit DMA support, fix the configuration of the PCI host. If the kernel is aware of physical memory above 4GB, but the BIOS never configured the PCI host with this information, update the registers with our values. [bhelgaas: drop PCI_DEVICE_ID_ATI_RS690 definition] Link: https://lore.kernel.org/r/20210611214823.4898-1-mikel@mikelr.com Signed-off-by: Mikel Rychliski <mikel@mikelr.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-18PCI: Add ACS quirk for Broadcom BCM57414 NICSriharsha Basavapatna
The Broadcom BCM57414 NIC may be a multi-function device. While it does not advertise an ACS capability, peer-to-peer transactions are not possible between the individual functions, so it is safe to treat them as fully isolated. Add an ACS quirk for this device so the functions can be in independent IOMMU groups and attached individually to userspace applications using VFIO. [bhelgaas: commit log] Link: https://lore.kernel.org/r/1621645997-16251-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2021-06-18PCI: Mark AMD Navi14 GPU ATS as brokenEvan Quan
Observed unexpected GPU hang during runpm stress test on 0x7341 rev 0x00. Further debugging shows broken ATS is related. Disable ATS on this part. Similar issues on other devices: a2da5d8cc0b0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms") 45beb31d3afb ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken") 5e89cd303e3a ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken") Suggested-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20210602021255.939090-1-evan.quan@amd.com Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Cc: stable@vger.kernel.org
2021-06-18PCI: Work around Huawei Intelligent NIC VF FLR erratumChiqijun
pcie_flr() starts a Function Level Reset (FLR), waits 100ms (the maximum time allowed for FLR completion by PCIe r5.0, sec 6.6.2), and waits for the FLR to complete. It assumes the FLR is complete when a config read returns valid data. When we do an FLR on several Huawei Intelligent NIC VFs at the same time, firmware on the NIC processes them serially. The VF may respond to config reads before the firmware has completed its reset processing. If we bind a driver to the VF (e.g., by assigning the VF to a virtual machine) in the interval between the successful config read and completion of the firmware reset processing, the NIC VF driver may fail to load. Prevent this driver failure by waiting for the NIC firmware to complete its reset processing. Not all NIC firmware supports this feature. [bhelgaas: commit log] Link: https://support.huawei.com/enterprise/en/doc/EDOC1100063073/87950645/vm-oss-occasionally-fail-to-load-the-in200-driver-when-the-vf-performs-flr Link: https://lore.kernel.org/r/20210414132301.1793-1-chiqijun@huawei.com Signed-off-by: Chiqijun <chiqijun@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org
2021-06-18PCI: Mark some NVIDIA GPUs to avoid bus resetShanker Donthineni
Some NVIDIA GPU devices do not work with SBR. Triggering SBR leaves the device inoperable for the current system boot. It requires a system hard-reboot to get the GPU device back to normal operating condition post-SBR. For the affected devices, enable NO_BUS_RESET quirk to avoid the issue. This issue will be fixed in the next generation of hardware. Link: https://lore.kernel.org/r/20210608054857.18963-8-ameynarkhede03@gmail.com Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Cc: stable@vger.kernel.org
2021-06-18PCI: Mark TI C667X to avoid bus resetAntti Järvinen
Some TI KeyStone C667X devices do not support bus/hot reset. The PCIESS automatically disables LTSSM when Secondary Bus Reset is received and device stops working. Prevent bus reset for these devices. With this change, the device can be assigned to VMs with VFIO, but it will leak state between VMs. Reference: https://e2e.ti.com/support/processors/f/791/t/954382 Link: https://lore.kernel.org/r/20210315102606.17153-1-antti.jarvinen@gmail.com Signed-off-by: Antti Järvinen <antti.jarvinen@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com> Cc: stable@vger.kernel.org
2021-06-18PCI: tegra194: Fix MCFG quirk build regressionsJon Hunter
7f100744749e ("PCI: tegra: Add Tegra194 MCFG quirks for ECAM errata") caused a few build regressions: - 7f100744749e removed the Makefile rule for CONFIG_PCIE_TEGRA194, so pcie-tegra.c can no longer be built as a module. Restore that rule. - 7f100744749e added "#ifdef CONFIG_PCIE_TEGRA194" around the native driver, but that's only set when the driver is built-in (for a module, CONFIG_PCIE_TEGRA194_MODULE is defined). The ACPI quirk is completely independent of the rest of the native driver, so move the quirk to its own file and remove the #ifdef in the native driver. - 7f100744749e added symbols that are always defined but used only when CONFIG_PCIEASPM, which causes warnings when CONFIG_PCIEASPM is not set: drivers/pci/controller/dwc/pcie-tegra194.c:259:18: warning: ‘event_cntr_data_offset’ defined but not used [-Wunused-const-variable=] drivers/pci/controller/dwc/pcie-tegra194.c:250:18: warning: ‘event_cntr_ctrl_offset’ defined but not used [-Wunused-const-variable=] drivers/pci/controller/dwc/pcie-tegra194.c:243:27: warning: ‘pcie_gen_freq’ defined but not used [-Wunused-const-variable=] Fixes: 7f100744749e ("PCI: tegra: Add Tegra194 MCFG quirks for ECAM errata") Link: https://lore.kernel.org/r/20210610064134.336781-1-jonathanh@nvidia.com Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thierry Reding <treding@nvidia.com>
2021-06-18PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GBPunit Agrawal
Alexandru and Qu reported this resource allocation failure on ROCKPro64 v2 and ROCK Pi 4B, both based on the RK3399: pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff 64bit] pci 0000:00:00.0: PCI bridge to [bus 01] pci 0000:00:00.0: BAR 14: no space for [mem size 0x00100000] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit] "BAR 14" is the PCI bridge's 32-bit non-prefetchable window, and our PCI allocation code isn't smart enough to allocate it in a host bridge window marked as 64-bit, even though this should work fine. A DT host bridge description includes the windows from the CPU address space to the PCI bus space. On a few architectures (microblaze, powerpc, sparc), the DT may also describe PCI devices themselves, including their BARs. Before 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses"), of_bus_pci_get_flags() ignored the fact that some DT addresses described 64-bit windows and BARs. That was a problem because the virtio virtual NIC has a 32-bit BAR and a 64-bit BAR, and the driver couldn't distinguish them. 9d57e61bf723 set IORESOURCE_MEM_64 for those 64-bit DT ranges, which fixed the virtio driver. But it also set IORESOURCE_MEM_64 for host bridge windows, which exposed the fact that the PCI allocator isn't smart enough to put 32-bit resources in those 64-bit windows. Clear IORESOURCE_MEM_64 from host bridge windows since we don't need that information. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Fixes: 9d57e61bf723 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses") Link: https://lore.kernel.org/r/20210614230457.752811-1-punitagrawal@gmail.com Reported-at: https://lore.kernel.org/lkml/7a1e2ebc-f7d8-8431-d844-41a9c36a8911@arm.com/ Reported-at: https://lore.kernel.org/lkml/YMyTUv7Jsd89PGci@m4/T/#u Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Reported-by: Qu Wenruo <wqu@suse.com> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Domenico Andreoli <domenico.andreoli@linux.com> Signed-off-by: Punit Agrawal <punitagrawal@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org>
2021-06-18Merge branch kvm-arm64/pmu-fixes into kvmarm-master/nextMarc Zyngier
Fixes for the PMUv3 emulation of PMCR_EL0: - Don't spuriously reset the cycle counter when resetting other counters - Force PMCR_EL0 to become effective after having restored it * kvm-arm64/pmu-fixes: KVM: arm64: Restore PMU configuration on first run KVM: arm64: Don't zero the cycle count register when PMCR_EL0.P is set
2021-06-18KVM: arm64: Restore PMU configuration on first runMarc Zyngier
Restoring a guest with an active virtual PMU results in no perf counters being instanciated on the host side. Not quite what you'd expect from a restore. In order to fix this, force a writeback of PMCR_EL0 on the first run of a vcpu (using a new request so that it happens once the vcpu has been loaded). This will in turn create all the host-side counters that were missing. Reported-by: Jinank Jain <jinankj@amazon.de> Tested-by: Jinank Jain <jinankj@amazon.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/87wnrbylxv.wl-maz@kernel.org Link: https://lore.kernel.org/r/b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de
2021-06-18tracing: Do no increment trace_clock_global() by oneSteven Rostedt (VMware)
The trace_clock_global() tries to make sure the events between CPUs is somewhat in order. A global value is used and updated by the latest read of a clock. If one CPU is ahead by a little, and is read by another CPU, a lock is taken, and if the timestamp of the other CPU is behind, it will simply use the other CPUs timestamp. The lock is also only taken with a "trylock" due to tracing, and strange recursions can happen. The lock is not taken at all in NMI context. In the case where the lock is not able to be taken, the non synced timestamp is returned. But it will not be less than the saved global timestamp. The problem arises because when the time goes "backwards" the time returned is the saved timestamp plus 1. If the lock is not taken, and the plus one to the timestamp is returned, there's a small race that can cause the time to go backwards! CPU0 CPU1 ---- ---- trace_clock_global() { ts = clock() [ 1000 ] trylock(clock_lock) [ success ] global_ts = ts; [ 1000 ] <interrupted by NMI> trace_clock_global() { ts = clock() [ 999 ] if (ts < global_ts) ts = global_ts + 1 [ 1001 ] trylock(clock_lock) [ fail ] return ts [ 1001] } unlock(clock_lock); return ts; [ 1000 ] } trace_clock_global() { ts = clock() [ 1000 ] if (ts < global_ts) [ false 1000 == 1000 ] trylock(clock_lock) [ success ] global_ts = ts; [ 1000 ] unlock(clock_lock) return ts; [ 1000 ] } The above case shows to reads of trace_clock_global() on the same CPU, but the second read returns one less than the first read. That is, time when backwards, and this is not what is allowed by trace_clock_global(). This was triggered by heavy tracing and the ring buffer checker that tests for the clock going backwards: Ring buffer clock went backwards: 20613921464 -> 20613921463 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0 Modules linked in: [..] [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463 [20613915818] PAGE TIME STAMP [20613915818] delta:0 [20613915819] delta:1 [20613916035] delta:216 [20613916465] delta:430 [20613916575] delta:110 [20613916749] delta:174 [20613917248] delta:499 [20613917333] delta:85 [20613917775] delta:442 [20613917921] delta:146 [20613918321] delta:400 [20613918568] delta:247 [20613918768] delta:200 [20613919306] delta:538 [20613919353] delta:47 [20613919980] delta:627 [20613920296] delta:316 [20613920571] delta:275 [20613920862] delta:291 [20613921152] delta:290 [20613921464] delta:312 [20613921464] delta:0 TIME EXTEND [20613921464] delta:0 This happened more than once, and always for an off by one result. It also started happening after commit aafe104aa9096 was added. Cc: stable@vger.kernel.org Fixes: aafe104aa9096 ("tracing: Restructure trace_clock_global() to never block") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18tracing: Do not stop recording comms if the trace file is being readSteven Rostedt (VMware)
A while ago, when the "trace" file was opened, tracing was stopped, and code was added to stop recording the comms to saved_cmdlines, for mapping of the pids to the task name. Code has been added that only records the comm if a trace event occurred, and there's no reason to not trace it if the trace file is opened. Cc: stable@vger.kernel.org Fixes: 7ffbd48d5cab2 ("tracing: Cache comms only after an event occurred") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18tracing: Do not stop recording cmdlines when tracing is offSteven Rostedt (VMware)
The saved_cmdlines is used to map pids to the task name, such that the output of the tracing does not just show pids, but also gives a human readable name for the task. If the name is not mapped, the output looks like this: <...>-1316 [005] ...2 132.044039: ... Instead of this: gnome-shell-1316 [005] ...2 132.044039: ... The names are updated when tracing is running, but are skipped if tracing is stopped. Unfortunately, this stops the recording of the names if the top level tracer is stopped, and not if there's other tracers active. The recording of a name only happens when a new event is written into a ring buffer, so there is no need to test if tracing is on or not. If tracing is off, then no event is written and no need to test if tracing is off or not. Remove the check, as it hides the names of tasks for events in the instance buffers. Cc: stable@vger.kernel.org Fixes: 7ffbd48d5cab2 ("tracing: Cache comms only after an event occurred") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18recordmcount: Correct st_shndx handlingPeter Zijlstra
One should only use st_shndx when >SHN_UNDEF and <SHN_LORESERVE. When SHN_XINDEX, then use .symtab_shndx. Otherwise use 0. This handles the case: st_shndx >= SHN_LORESERVE && st_shndx != SHN_XINDEX. Link: https://lore.kernel.org/lkml/20210607023839.26387-1-mark-pk.tsai@mediatek.com/ Link: https://lkml.kernel.org/r/20210616154126.2794-1-mark-pk.tsai@mediatek.com Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [handle endianness of sym->st_shndx] Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18pinctrl: stm32: fix the reported number of GPIO lines per bankFabien Dessenne
Each GPIO bank supports a variable number of lines which is usually 16, but is less in some cases : this is specified by the last argument of the "gpio-ranges" bank node property. Report to the framework, the actual number of lines, so the libgpiod gpioinfo command lists the actually existing GPIO lines. Fixes: 1dc9d289154b ("pinctrl: stm32: add possibility to use gpio-ranges to declare bank range") Signed-off-by: Fabien Dessenne <fabien.dessenne@foss.st.com> Link: https://lore.kernel.org/r/20210617144629.2557693-1-fabien.dessenne@foss.st.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-18KVM: arm64: Don't zero the cycle count register when PMCR_EL0.P is setAlexandru Elisei
According to ARM DDI 0487G.a, page D13-3895, setting the PMCR_EL0.P bit to 1 has the following effect: "Reset all event counters accessible in the current Exception level, not including PMCCNTR_EL0, to zero." Similar behaviour is described for AArch32 on page G8-7022. Make it so. Fixes: c01d6a18023b ("KVM: arm64: pmu: Only handle supported event counters") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210618105139.83795-1-alexandru.elisei@arm.com
2021-06-18Merge branch kvm-arm64/mmu/stage2-cmos into kvmarm-master/nextMarc Zyngier
Cache maintenance updates from Yanan Wang, moving the CMOs down into the page-table code. This ensures that we only issue them when actually performing a mapping rather than upfront. * kvm-arm64/mmu/stage2-cmos: KVM: arm64: Move guest CMOs to the fault handlers KVM: arm64: Tweak parameters of guest cache maintenance functions KVM: arm64: Introduce mm_ops member for structure stage2_attr_data KVM: arm64: Introduce two cache maintenance callbacks
2021-06-18KVM: arm64: Move guest CMOs to the fault handlersYanan Wang
We currently uniformly perform CMOs of D-cache and I-cache in function user_mem_abort before calling the fault handlers. If we get concurrent guest faults(e.g. translation faults, permission faults) or some really unnecessary guest faults caused by BBM, CMOs for the first vcpu are necessary while the others later are not. By moving CMOs to the fault handlers, we can easily identify conditions where they are really needed and avoid the unnecessary ones. As it's a time consuming process to perform CMOs especially when flushing a block range, so this solution reduces much load of kvm and improve efficiency of the stage-2 page table code. We can imagine two specific scenarios which will gain much benefit: 1) In a normal VM startup, this solution will improve the efficiency of handling guest page faults incurred by vCPUs, when initially populating stage-2 page tables. 2) After live migration, the heavy workload will be resumed on the destination VM, however all the stage-2 page tables need to be rebuilt at the moment. So this solution will ease the performance drop during resuming stage. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-5-wangyanan55@huawei.com
2021-06-18KVM: arm64: Tweak parameters of guest cache maintenance functionsYanan Wang
Adjust the parameter "kvm_pfn_t pfn" of __clean_dcache_guest_page and __invalidate_icache_guest_page to "void *va", which paves the way for converting these two guest CMO functions into callbacks in structure kvm_pgtable_mm_ops. No functional change. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-4-wangyanan55@huawei.com
2021-06-18KVM: arm64: Introduce mm_ops member for structure stage2_attr_dataYanan Wang
Also add a mm_ops member for structure stage2_attr_data, since we will move I-cache maintenance for guest stage-2 to the permission path and as a result will need mm_ops for some callbacks. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-3-wangyanan55@huawei.com
2021-06-18KVM: arm64: Introduce two cache maintenance callbacksYanan Wang
To prepare for performing CMOs for guest stage-2 in the fault handlers in pgtable.c, here introduce two cache maintenance callbacks in struct kvm_pgtable_mm_ops. We also adjust the comment alignment for the existing part but make no real content change at all. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> [maz: fixed up comments and renamed callbacks] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-2-wangyanan55@huawei.com
2021-06-18mac80211: handle various extensible elements correctlyJohannes Berg
Various elements are parsed with a requirement to have an exact size, when really we should only check that they have the minimum size that we need. Check only that and therefore ignore any additional data that they might carry. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.cd101f8040a4.Iadf0e9b37b100c6c6e79c7b298cc657c2be9151a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18mac80211: reset profile_periodicity/ema_apJohannes Berg
Apparently we never clear these values, so they'll remain set since the setting of them is conditional. Clear the values in the relevant other cases. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.316e32d136a9.I2a12e51814258e1e1b526103894f4b9f19a91c8d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18cfg80211: avoid double free of PMSR requestAvraham Stern
If cfg80211_pmsr_process_abort() moves all the PMSR requests that need to be freed into a local list before aborting and freeing them. As a result, it is possible that cfg80211_pmsr_complete() will run in parallel and free the same PMSR request. Fix it by freeing the request in cfg80211_pmsr_complete() only if it is still in the original pmsr list. Cc: stable@vger.kernel.org Fixes: 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API") Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.1fbef57e269a.I00294bebdb0680b892f8d1d5c871fd9dbe785a5e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18cfg80211: make certificate generation more robustJohannes Berg
If all net/wireless/certs/*.hex files are deleted, the build will hang at this point since the 'cat' command will have no arguments. Do "echo | cat - ..." so that even if the "..." part is empty, the whole thing won't hang. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210618133832.c989056c3664.Ic3b77531d00b30b26dcd69c64e55ae2f60c3f31e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18KVM: x86/mmu: Remove redundant root_hpa checksDavid Matlack
The root_hpa checks below the top-level check in kvm_mmu_page_fault are theoretically redundant since there is no longer a way for the root_hpa to be reset during a page fault. The details of why are described in commit ddce6208217c ("KVM: x86/mmu: Move root_hpa validity checks to top of page fault handler") __direct_map, kvm_tdp_mmu_map, and get_mmio_spte are all only reachable through kvm_mmu_page_fault, therefore their root_hpa checks are redundant. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210617231948.2591431-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-18KVM: x86/mmu: Refactor is_tdp_mmu_root into is_tdp_mmuDavid Matlack
This change simplifies the call sites slightly and also abstracts away the implementation detail of looking at root_hpa as the mechanism for determining if the mmu is the TDP MMU. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210617231948.2591431-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-18KVM: x86/mmu: Remove redundant is_tdp_mmu_enabled checkDavid Matlack
This check is redundant because the root shadow page will only be a TDP MMU page if is_tdp_mmu_enabled() returns true, and is_tdp_mmu_enabled() never changes for the lifetime of a VM. It's possible that this check was added for performance reasons but it is unlikely that it is useful in practice since to_shadow_page() is cheap. That being said, this patch also caches the return value of is_tdp_mmu_root() in direct_page_fault() since there's no reason to duplicate the call so many times, so performance is not a concern. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210617231948.2591431-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>