2024-11-11  empty include/asm-generic/vga.h  (Al Viro)

All places that use anything defined in it (vgacon, mdacon and vga16fb)
are built only on architectures that have all that stuff in their native
asm/vga.h. This also allows killing the stub asm/vga.h on sh, while we
are at it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-11  sparc: get rid of asm/vga.h  (Al Viro)

The only thing we are using it for on sparc is telling vt_buffer.h to
pick what it would pick by default anyway - we are not accessing any
VRAM here...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-11  asm/vga.h: don't bother with scr_mem{cpy,move}v() unless we need to  (Al Viro)

... if they are identical to fallbacks, just leave them alone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-11  vt_buffer.h: get rid of dead code in default scr_...() instances  (Al Viro)

Only 4 architectures define VT_BUF_HAVE_RW (alpha, mips, powerpc, sparc)
and all of them define VT_BUF_HAVE_MEM{SET,CPY,MOVE}W. In other words,
the code under #ifdef VT_BUF_HAVE_RW in default scr_mem...w() instances
won't be compiled anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
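[Editor's note: a sketch of the dead branch being removed, modeled on
the default instances in include/linux/vt_buffer.h; the exact code may
differ slightly.]

    #ifndef VT_BUF_HAVE_MEMCPYW
    static inline void scr_memcpyw(u16 *d, const u16 *s, unsigned int count)
    {
    #ifdef VT_BUF_HAVE_RW
        /* Dead code: every architecture that defines VT_BUF_HAVE_RW
         * also defines VT_BUF_HAVE_MEMCPYW, so this default instance
         * is never compiled with VT_BUF_HAVE_RW set. */
        count /= 2;
        while (count--)
            scr_writew(scr_readw(s++), d++);
    #else
        memcpy(d, s, count);
    #endif
    }
    #endif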
2024-11-11  PCI: Unexport pci_walk_bus_locked()  (Keith Busch)

There's only one user of pci_walk_bus_locked(), and it's internal to the
PCI core. Unexport it and make it private to drivers/pci/.

Link: https://lore.kernel.org/r/20241022224851.340648-6-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: move decl to drivers/pci/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-11-11  PCI: Abstract LBMS seen check into pcie_lbms_seen()  (Ilpo Järvinen)

The Target Speed quirk in pcie_failed_link_retrain() uses the presence
of the LBMS bit as one of its triggering conditions, effectively
monopolizing the use of that bit. An upcoming change will introduce a
PCIe bandwidth controller which sets up an interrupt to track LBMS. As
LBMS will be cleared by the interrupt handler, the Target Speed quirk
will no longer be able to observe LBMS directly.

As a preparatory step for that change, extract the LBMS seen check from
pcie_failed_link_retrain() into a new function, pcie_lbms_seen().

Link: https://lore.kernel.org/r/20241018144755.7875-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
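[Editor's note: a minimal sketch of the extracted helper, assuming it
takes the device and a previously read Link Status register value; the
exact signature is an assumption.]

    /* Sketch: did the Link Bandwidth Management Status bit fire? */
    static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
    {
        return lnksta & PCI_EXP_LNKSTA_LBMS;
    }

The quirk would then read PCI_EXP_LNKSTA with pcie_capability_read_word()
and call pcie_lbms_seen() instead of testing the bit inline.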
2024-11-11  PCI: Refactor pcie_update_link_speed()  (Ilpo Järvinen)

pcie_update_link_speed() is passed the Link Status register, but not all
callers have that value at hand, nor do they all need it. Refactor
pcie_update_link_speed() to read the Link Status register itself, and
create __pcie_update_link_speed(), which can be used by the hotplug code
that already has the register value at hand (and needs the value for
other purposes).

Link: https://lore.kernel.org/r/20241018144755.7875-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
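[Editor's note: a sketch of the resulting split, assuming the
pcie_link_speed[] lookup table used elsewhere in the PCI core; details
are illustrative.]

    /* Callers that already hold the Link Status value use this directly. */
    void __pcie_update_link_speed(struct pci_bus *bus, u16 linksta)
    {
        bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
    }

    /* Everyone else lets the helper read the register itself. */
    void pcie_update_link_speed(struct pci_bus *bus)
    {
        u16 linksta;

        pcie_capability_read_word(bus->self, PCI_EXP_LNKSTA, &linksta);
        __pcie_update_link_speed(bus, linksta);
    }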
2024-11-11  PCI: Store all PCIe Supported Link Speeds  (Ilpo Järvinen)

The PCIe bandwidth controller added by a subsequent commit will require
selecting PCIe Link Speeds that are lower than the Maximum Link Speed.
The struct pci_bus only stores max_bus_speed. Even though PCIe r6.1 sec
8.2.1 currently disallows gaps in supported Link Speeds, the
Implementation Note in PCIe r6.1 sec 7.5.3.18 recommends determining
supported Link Speeds using the Supported Link Speeds Vector in the Link
Capabilities 2 Register (when available) to "avoid software being
confused if a future specification defines Links that do not require
support for all slower speeds."

Reuse code in pcie_get_speed_cap() to add pcie_get_supported_speeds()
for querying the Supported Link Speeds Vector of a PCIe device. The
value is taken directly from the Supported Link Speeds Vector, or
synthesized from the Max Link Speed in the Link Capabilities Register
when the Link Capabilities 2 Register is not available.

The Supported Link Speeds Vector in the Link Capabilities 2 Register
corresponds to the bus below on Root Ports and Downstream Ports, whereas
it corresponds to the bus above on Upstream Ports and Endpoints (PCIe
r6.1 sec 7.5.3.18):

  Supported Link Speeds Vector - This field indicates the supported
  Link speed(s) of the associated Port.

Add a supported_speeds field into struct pci_dev that caches the
Supported Link Speeds Vector. supported_speeds contains a set of Link
Speeds only in the case where the PCIe Link Speed can be determined.
Root Complex Integrated Endpoints do not have a well-defined Link Speed
because they do not implement either of the Link Capabilities Registers,
which is allowed by PCIe r6.1 sec 7.5.3 (the same limitation applies to
determining cur_bus_speed and max_bus_speed, which are PCI_SPEED_UNKNOWN
in such a case). This is of no concern from the PCIe bandwidth
controller's point of view, because such devices are not attached to a
PCIe Root Port that could be controlled.

The supported_speeds field keeps the extra reserved zero at the least
significant bit to match the Link Capabilities 2 Register layout.

An attempt was made to store the supported_speeds field in struct
pci_bus as an intersection of both ends of the Link; however, the
subordinate struct pci_bus is not available early enough. The Target
Speed quirk (in pcie_failed_link_retrain()) can run either during the
initial scan or later, requiring it to use the API provided by the PCIe
bandwidth controller to set the Target Link Speed in order to co-exist
with the bandwidth controller. When the Target Speed quirk calls the
bandwidth controller during the initial scan, the struct pci_bus is not
yet initialized. As such, storing supported_speeds in struct pci_bus is
not viable.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/20241018144755.7875-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: move pcie_get_supported_speeds() decl to drivers/pci/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
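[Editor's note: a hedged sketch following the commit's description;
register macros are from include/uapi/linux/pci_regs.h, and the
synthesis branch below is simplified to two speeds.]

    /* Sketch: return the Supported Link Speeds Vector of @dev, or
     * synthesize one from Max Link Speed when the Link Capabilities 2
     * Register is not implemented. Bit 0 stays reserved-zero to match
     * the register layout. */
    u8 pcie_get_supported_speeds(struct pci_dev *dev)
    {
        u32 lnkcap2, lnkcap;
        u8 speeds;

        pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
        speeds = lnkcap2 & PCI_EXP_LNKCAP2_SLS;
        if (speeds)
            return speeds;

        /* No Link Capabilities 2: derive the set from Max Link Speed. */
        pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
        if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
            return PCI_EXP_LNKCAP2_SLS_5_0GB | PCI_EXP_LNKCAP2_SLS_2_5GB;

        return PCI_EXP_LNKCAP2_SLS_2_5GB;
    }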
2024-11-11  Merge branch kvm-arm64/vgic-its-fixes into kvmarm/next  (Oliver Upton)

* kvm-arm64/vgic-its-fixes:
  : Fixes for vgic-its save/restore, courtesy of Kunkun Jiang and Jing Zhang
  :
  : Address bugs where restoring an ITS consumes a stale DTE/ITE, which
  : may lead to either garbage mappings in the ITS or the overall restore
  : ioctl failing. The fix in both cases is to zero a DTE/ITE when its
  : translation has been invalidated by the guest.
  KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE
  KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device
  KVM: arm64: vgic-its: Add a data length check in vgic_its_save_*

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE  (Kunkun Jiang)

When DISCARD frees an ITE, it does not invalidate the corresponding
entry in guest memory. Across repeated saves and restores, this can lead
to a situation where an ITE that was never saved gets restored anyway.
This is unreasonable and may cause the restore to fail. This patch
clears the corresponding ITE when DISCARD frees an ITE.

Cc: stable@vger.kernel.org
Fixes: eff484e0298d ("KVM: arm64: vgic-its: ITT save and restore")
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-6-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
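[Editor's note: a sketch of the fix's shape, overwriting the freed
ITE's slot in guest memory with a zeroed (invalid) entry. The helper
name is illustrative; vgic_write_guest_lock() is the writer mentioned
in the data-length-check patch below.]

    /* Sketch: invalidate the guest-memory copy of a freed ITE.
     * @gpa is the guest physical address of the entry, @esz its size. */
    static int vgic_its_invalidate_ite(struct kvm *kvm, gpa_t gpa, int esz)
    {
        u64 zero = 0;

        return vgic_write_guest_lock(kvm, gpa, &zero, esz);
    }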
2024-11-11  KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device  (Kunkun Jiang)

vgic_its_save_device_tables() traverses its->device_list to save a DTE
for each device. vgic_its_restore_device_tables() traverses each entry
of the device table, checks whether it is valid, and restores it if so.
But when MAPD unmaps a device, it does not invalidate the corresponding
DTE. Across repeated saves and restores, this can lead to a situation
where a device's DTE that was never saved gets restored anyway. This is
unreasonable and may cause the restore to fail. This patch clears the
corresponding DTE when MAPD unmaps a device.

Cc: stable@vger.kernel.org
Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore")
Co-developed-by: Shusen Li <lishusen2@huawei.com>
Signed-off-by: Shusen Li <lishusen2@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
[Jing: Update with entry write helper]
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-5-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: arm64: vgic-its: Add a data length check in vgic_its_save_*  (Jing Zhang)

None of the vgic_its_save_*() functions check whether the data length is
8 bytes before calling vgic_write_guest_lock(). This patch adds the
check. To avoid bringing the whole kernel down when the check fails,
KVM_BUG_ON() is used, and the other BUG_ON()s are replaced with it as
well.

Cc: stable@vger.kernel.org
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
[Jing: Update with the new entry read/write helpers]
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20241107214137.428439-4-jingzhangos@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
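[Editor's note: a sketch of the added check. KVM_BUG_ON() marks only
the affected VM as bugged and evaluates to the condition, so the save
path can fail gracefully; surrounding variable names are illustrative.]

    /* Refuse to write a mis-sized entry instead of crashing the kernel. */
    if (KVM_BUG_ON(esz != sizeof(u64), kvm))
        return -EINVAL;

    ret = vgic_write_guest_lock(kvm, gpa, &val, esz);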
2024-11-11  PCI: Convert __pci_walk_bus() to be recursive  (Keith Busch)

The original implementation of __pci_walk_bus() chose a non-recursive
walk, presumably as a precaution on stack use. We do recursive bus
walking in other places though. For example:

  pci_bus_resettable()
  pci_stop_bus_device()
  pci_remove_bus_device()
  pci_bus_allocate_dev_resources()

So recursive PCI bus walking is well tested and safe, and is easier to
follow. Convert __pci_walk_bus() to be recursive to make it easier to
introduce finer-grained locking in the future.

Link: https://lore.kernel.org/r/20241022224851.340648-5-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
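[Editor's note: a minimal sketch of the recursive form. Callback
semantics follow pci_walk_bus(): a non-zero return from @cb stops the
walk; details are illustrative.]

    /* Walk all devices on @top and its subordinate buses, depth first. */
    static bool __pci_walk_bus(struct pci_bus *top,
                               int (*cb)(struct pci_dev *, void *),
                               void *userdata)
    {
        struct pci_dev *dev;

        list_for_each_entry(dev, &top->devices, bus_list) {
            if (cb(dev, userdata))
                return false;
            if (dev->subordinate &&
                !__pci_walk_bus(dev->subordinate, cb, userdata))
                return false;
        }
        return true;
    }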
2024-11-11  PCI: Move __pci_walk_bus() mutex to where we need it  (Keith Busch)

Simplify __pci_walk_bus() by moving the pci_bus_sem mutex into
pci_walk_bus(), the only place it is needed, and removing the parameter
that told __pci_walk_bus() whether to acquire the mutex.

Link: https://lore.kernel.org/r/20241022224851.340648-4-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
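[Editor's note: after the move, the exported entry point is the only
place that takes the lock, roughly as sketched here.]

    /* Holding pci_bus_sem keeps the device lists stable for the whole
     * walk; __pci_walk_bus() itself is now lock-free. */
    void pci_walk_bus(struct pci_bus *top,
                      int (*cb)(struct pci_dev *, void *), void *userdata)
    {
        down_read(&pci_bus_sem);
        __pci_walk_bus(top, cb, userdata);
        up_read(&pci_bus_sem);
    }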
2024-11-11  PCI: Make pci_destroy_dev() concurrent safe  (Keith Busch)

Use an atomic flag instead of the racy check against the device's kobj
parent. We shouldn't be poking into device implementation details at
this level anyway.

Link: https://lore.kernel.org/r/20241022224851.340648-3-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-11  drm/amdgpu/mes12: correct kiq unmap latency  (Jack Xiao)

Correct kiq unmap queue timeout value.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cfe98204a06329b6b7fce1b828b7d620473181ff)
Cc: stable@vger.kernel.org # 6.11.x
2024-11-11  PCI: Make pci_stop_dev() concurrent safe  (Keith Busch)

Use the atomic ADDED flag to ensure concurrent callers can't attempt to
stop the device multiple times. Callers should currently all be holding
the pci_rescan_remove_lock, so there shouldn't be an existing race. But
that global lock can cause lock dependency issues, so this prepares to
reduce reliance on that lock by using the existing atomic bit ops.

Link: https://lore.kernel.org/r/20241022224851.340648-2-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: squash https://lore.kernel.org/r/20241111180659.3321671-1-kbusch@meta.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
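[Editor's note: the idea, sketched with the PCI_DEV_ADDED bit in
dev->priv_flags from drivers/pci/pci.h; the helper name is an
assumption.]

    /* Atomically claim the transition out of the ADDED state; only
     * the winner proceeds to stop the device. */
    static inline bool pci_dev_test_and_clear_added(struct pci_dev *dev)
    {
        return test_and_clear_bit(PCI_DEV_ADDED, &dev->priv_flags);
    }

    static void pci_stop_dev(struct pci_dev *dev)
    {
        if (!pci_dev_test_and_clear_added(dev))
            return;

        /* ... unbind the driver, remove sysfs entries, etc. ... */
    }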
2024-11-11  drm/amdgpu: fix check in gmc_v9_0_get_vm_pte()  (Christian König)

The coherency flags can only be determined when the BO is locked and
that in turn is only guaranteed when the mapping is validated. Fix the
check, move the resource check into the function and add an assert that
the BO is locked.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1b4ca8546f5b5c482717bedb8e031227b1541539)
Cc: stable@vger.kernel.org
2024-11-11  drm/amd/pm: print pp_dpm_mclk in ascending order on SMU v14.0.0  (Tim Huang)

Currently, the pp_dpm_mclk values are reported in descending order on
SMU IP v14.0.0/1/4. Adjust to ascending order for consistency with
other clock interfaces.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d4be16ccfd5bf822176740a51ff2306679a2247e)
Cc: stable@vger.kernel.org
2024-11-11  drm/amdgpu: Fix video caps for H264 and HEVC encode maximum size  (David Rosca)

H264 supports 4096x4096 starting from Polaris. HEVC also supports
4096x4096; with VCN 3 and newer, 8192x4352 is supported.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 69e9a9e65b1ea542d07e3fdd4222b46e9f5a3a29)
Cc: stable@vger.kernel.org
2024-11-11  drm/amd/display: Adjust VSDB parser for replay feature  (Rodrigo Siqueira)

At some point, the IEEE ID identification for the replay check in the
AMD EDID was added. However, this check causes the following
out-of-bounds issues when using KASAN:

[   27.804016] BUG: KASAN: slab-out-of-bounds in amdgpu_dm_update_freesync_caps+0xefa/0x17a0 [amdgpu]
[   27.804788] Read of size 1 at addr ffff8881647fdb00 by task systemd-udevd/383
...
[   27.821207] Memory state around the buggy address:
[   27.821215]  ffff8881647fda00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821224]  ffff8881647fda80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821234] >ffff8881647fdb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   27.821243]                    ^
[   27.821250]  ffff8881647fdb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   27.821259]  ffff8881647fdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   27.821268] ==================================================================

This is caused by the ID extraction running past the end of the EDID.
This commit addresses the issue by taking the amd_vsdb_block size into
account.

Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7e381b1ccd5e778e3d9c44c669ad38439a861d8)
Cc: stable@vger.kernel.org
2024-11-11  drm/amd/display: Require minimum VBlank size for stutter optimization  (Dillon Varone)

If the nominal VBlank is too small, optimizing for stutter can cause the
prefetch bandwidth to increase drastically, resulting in higher clock
and power requirements. Only optimize if it is >3x the stutter latency.

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 003215f962cdf2265f126a3f4c9ad20917f87fca)
Cc: stable@vger.kernel.org
2024-11-11  drm/amd/display: Handle dml allocation failure to avoid crash  (Ryan Seto)

[Why]
In the case where a dml allocation fails for any reason, the current
state's dml contexts would no longer be valid. Subsequent calls to
dc_state_copy_internal would then shallow-copy invalid memory, and if
the new state was released, a double free would occur.

[How]
Reset the dml pointers in new_state to NULL to avoid the invalid
pointers.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bcafdc61529a48f6f06355d78eb41b3aeda5296c)
Cc: stable@vger.kernel.org
2024-11-11  drm/amd/display: Fix Panel Replay not updating screen correctly  (Tom Chung)

[Why]
In certain use cases, such as the KDE login screen, there is no atomic
commit while doing the frame update. If Panel Replay is enabled, the
screen is not updated and the system looks like it hangs.

[How]
Delay a few atomic commits before enabling Panel Replay, just like PSR.

Fixes: be64336307a6c ("drm/amd/display: Re-enable panel replay feature")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3686
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3682
Tested-By: Corey Hickey <bugfood-c@fatooh.org>
Tested-By: James Courtier-Dutton <james.dutton@gmail.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ca628f0eddd73adfccfcc06b2a55d915bca4a342)
Cc: stable@vger.kernel.org # 6.11+
2024-11-11  drm/amd/display: Change some variable names of psr  (Tom Chung)

The Panel Replay feature may also use the same variables as PSR. Change
the variable names so they are not PSR-specific.

Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c7fafb7a46b38a11a19342d153f505749bf56f3e)
Cc: stable@vger.kernel.org # 6.11+
2024-11-11  Merge branch 'replace-page_frag-with-page_frag_cache-part-1'  (Jakub Kicinski)

Yunsheng Lin says:

====================
Replace page_frag with page_frag_cache (Part-1)

This is part 1 of "Replace page_frag with page_frag_cache", which mainly
contains the refactoring and optimization of the page_frag API
implementation before the replacement.

As discussed in [1], it is better to target the net-next tree to get
more testing, as all the callers of the page_frag API are in networking,
and the chance of conflicting with the MM tree seems low as the
implementation of the page_frag API is quite self-contained.

After [2], there are still two implementations for page frag:

1. mm/page_alloc.c: the net stack seems to be using it in the rx part
   with 'struct page_frag_cache', the main API being
   page_frag_alloc_align().
2. net/core/sock.c: the net stack seems to be using it in the tx part
   with 'struct page_frag', the main API being skb_page_frag_refill().

This patchset tries to unify the page frag implementation by replacing
page_frag with page_frag_cache for sk_page_frag() first.
net_high_order_alloc_disable_key for the implementation in
net/core/sock.c doesn't seem to matter that much now, as pcp is also
supported for high-order pages: commit 44042b449872 ("mm/page_alloc:
allow high-order pages to be stored on the per-cpu lists")

As the related change is mostly related to networking, target net-next;
the rest of page_frag will be replaced in a follow-up patchset.

After this patchset:

1. The page frag implementation is unified by taking the best out of
   the two existing implementations: we are able to save some space for
   the 'page_frag_cache' API user, and avoid 'get_page()' for the old
   'page_frag' API user.
2. Future bug fixes and performance work can be done in one place,
   hence improving the maintainability of page_frag's implementation.

Kernel image change:

Linux Kernel       total |     text       data     bss
--------------------------------------------------------
after           45250307 | 27274279   17209996  766032
before          45254134 | 27278118   17209984  766032
delta              -3827 |    -3839        +12      +0

1. https://lore.kernel.org/all/add10dd4-7f5d-4aa1-aa04-767590f944e0@redhat.com/
2. https://lore.kernel.org/all/20240228093013.8263-1-linyunsheng@huawei.com/
====================

Link: https://patch.msgid.link/20241028115343.3405838-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11  mm: page_frag: use __alloc_pages() to replace alloc_pages_node()  (Yunsheng Lin)

It seems there is about a 24-byte binary size increase for
__page_frag_cache_refill() after refactoring on an arm64 system with
64K PAGE_SIZE. From gdb disassembly, it seems we can get a decrease of
more than 100 bytes in binary size by using __alloc_pages() to replace
alloc_pages_node(), as there seems to be some unnecessary checking for
nid being NUMA_NO_NODE, especially when page_frag is part of the mm
system.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-8-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
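[Editor's note: the substitution, sketched; the NUMA_NO_NODE handling
inside alloc_pages_node() is what the direct call avoids.]

    /* Before: alloc_pages_node() re-validates nid and handles
     * NUMA_NO_NODE on every call. */
    page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);

    /* After: resolve the node once and call the core allocator
     * directly, letting the compiler drop the dead checks. */
    page = __alloc_pages(gfp_mask, order, numa_mem_id(), NULL);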
2024-11-11  mm: page_frag: reuse existing space for 'size' and 'pfmemalloc'  (Yunsheng Lin)

Currently there is one 'struct page_frag' for every 'struct sock' and
'struct task_struct'; we are about to replace the 'struct page_frag'
with 'struct page_frag_cache' for them. Before beginning the
replacement, we need to ensure the size of 'struct page_frag_cache' is
not bigger than the size of 'struct page_frag', as there may be tens of
thousands of 'struct sock' and 'struct task_struct' instances in the
system.

By or'ing the page order & pfmemalloc with the lower bits of 'va'
instead of using 'u16' or 'u32' for the page size and 'u8' for
pfmemalloc, we are able to avoid 3 or 5 bytes of wasted space. And
since the page address & pfmemalloc & order are unchanged for the same
page in the same 'page_frag_cache' instance, it makes sense to fit them
together.

After this patch, the size of 'struct page_frag_cache' should be the
same as the size of 'struct page_frag'.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-7-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
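[Editor's note: a sketch of the packing trick; macro and field names
here are illustrative, not the patch's actual identifiers. Because 'va'
is always page-aligned, its low PAGE_SHIFT bits are free to carry the
order and the pfmemalloc flag.]

    #define FRAG_CACHE_ORDER_MASK       GENMASK(3, 0)  /* page order */
    #define FRAG_CACHE_PFMEMALLOC_BIT   BIT(4)         /* pfmemalloc? */

    static inline unsigned long frag_cache_encode_va(void *va,
                                                     unsigned int order,
                                                     bool pfmemalloc)
    {
        return (unsigned long)va | order |
               (pfmemalloc ? FRAG_CACHE_PFMEMALLOC_BIT : 0);
    }

    static inline void *frag_cache_va(unsigned long encoded)
    {
        return (void *)(encoded & PAGE_MASK);
    }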
2024-11-11  xtensa: remove the get_order() implementation  (Yunsheng Lin)

The get_order() implemented by xtensa using the 'nsau' instruction seems
to be the same as the generic implementation in
include/asm-generic/getorder.h when size is not a constant value, as the
fls*() called by the generic implementation also utilizes the 'nsau'
instruction on xtensa.

So remove the get_order() implemented by xtensa: using the generic
implementation may enable the compiler to do the computation at compile
time when size is a constant value instead of computing it at runtime,
and it enables the use of get_order() in the BUILD_BUG_ON() macro in the
next patch.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-6-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
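[Editor's note: for reference, the generic helper in
include/asm-generic/getorder.h looks roughly like this; the
__builtin_constant_p() branch is what lets the compiler fold
get_order() of a constant, enabling its use in BUILD_BUG_ON().]

    static __always_inline __attribute_const__ int get_order(unsigned long size)
    {
        if (__builtin_constant_p(size)) {
            if (!size)
                return BITS_PER_LONG - PAGE_SHIFT;
            if (size < (1UL << PAGE_SHIFT))
                return 0;
            /* Evaluated entirely at compile time. */
            return ilog2(size - 1) - PAGE_SHIFT + 1;
        }

        size--;
        size >>= PAGE_SHIFT;
    #if BITS_PER_LONG == 32
        return fls(size);    /* fls()/fls64() map to 'nsau' on xtensa */
    #else
        return fls64(size);
    #endif
    }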
2024-11-11  mm: page_frag: avoid caller accessing 'page_frag_cache' directly  (Yunsheng Lin)

Use the appropriate page_frag API instead of callers accessing
'page_frag_cache' directly.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11  mm: page_frag: use initial zero offset for page_frag_alloc_align()  (Yunsheng Lin)

We are about to use the page_frag_alloc_*() API not just to allocate
memory for skb->data, but also to do the memory allocation for skb
frags. Currently the implementation of page_frag in the mm subsystem
runs the offset as a countdown rather than a count-up value; there may
be several advantages to that, as mentioned in [1], but it also has some
disadvantages: for example, it may defeat skb frag coalescing and more
effective cache prefetching.

We have a trade-off to make in order to have a unified implementation
and API for page_frag, so use an initial zero offset in this patch, and
the following patch will try to optimize away the disadvantages as much
as possible.

1. https://lore.kernel.org/all/f4abe71b3439b39d17a6fb2d410180f367cadf5c.camel@gmail.com/

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-4-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
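[Editor's note: a sketch of the count-up scheme this patch switches to;
field names and the exact signature are assumptions.]

    /* The offset now grows from 0 toward the end of the page, so two
     * consecutive allocations are adjacent in memory, which helps skb
     * frag coalescing and cache prefetching. */
    void *page_frag_alloc_align(struct page_frag_cache *nc,
                                unsigned int fragsz, gfp_t gfp_mask,
                                unsigned int align)
    {
        unsigned int offset = ALIGN(nc->offset, align);

        if (offset + fragsz > nc->size) {
            /* ... refill the cache with a fresh page ... */
            offset = 0;
        }

        nc->offset = offset + fragsz;
        return nc->va + offset;
    }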
2024-11-11  mm: move the page fragment allocator from page_alloc into its own file  (Yunsheng Lin)

Inspired by [1], move the page fragment allocator from page_alloc into
its own C file and header file, as we are about to make more changes to
it in order to replace another page_frag implementation in sock.c.

As this patchset is going to replace 'struct page_frag' with 'struct
page_frag_cache' in sched.h, including page_frag_cache.h in sched.h
causes a compiler error due to the interdependence between mm_types.h
and mm.h for asm-offsets.c, see [2]. So avoid the compiler error by
moving 'struct page_frag_cache' to mm_types_task.h as suggested by
Alexander, see [3].

1. https://lore.kernel.org/all/20230411160902.4134381-3-dhowells@redhat.com/
2. https://lore.kernel.org/all/15623dac-9358-4597-b3ee-3694a5956920@gmail.com/
3. https://lore.kernel.org/all/CAKgT0UdH1yD=LSCXFJ=YM_aiA4OomD-2wXykO42bizaWMt_HOA@mail.gmail.com/

CC: David Howells <dhowells@redhat.com>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11  mm: page_frag: add a test module for page_frag  (Yunsheng Lin)

The testing is done by ensuring that a fragment allocated from a
page_frag_cache instance is pushed into a ptr_ring instance by a
kthread bound to a specified CPU, while another kthread, bound to a
different specified CPU, pops the fragment from the ptr_ring and frees
it.

CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-2-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11  Merge branch kvm-arm64/nv-pmu into kvmarm/next  (Oliver Upton)

* kvm-arm64/nv-pmu:
  : Support for vEL2 PMU controls
  :
  : Align the vEL2 PMU support with the current state of non-nested KVM,
  : including:
  :
  : - Trap routing, with the annoying complication of EL2 traps that apply
  :   in Host EL0
  :
  : - PMU emulation, using the correct configuration bits depending on
  :   whether a counter falls in the hypervisor or guest range of PMCs
  :
  : - Perf event swizzling across nested boundaries, as the event filtering
  :   needs to be remapped to cope with vEL2
  KVM: arm64: nv: Reprogram PMU events affected by nested transition
  KVM: arm64: nv: Apply EL2 event filtering when in hyp context
  KVM: arm64: nv: Honor MDCR_EL2.HLP
  KVM: arm64: nv: Honor MDCR_EL2.HPME
  KVM: arm64: Add helpers to determine if PMC counts at a given EL
  KVM: arm64: nv: Adjust range of accessible PMCs according to HPMN
  KVM: arm64: Rename kvm_pmu_valid_counter_mask()
  KVM: arm64: nv: Advertise support for FEAT_HPMN0
  KVM: arm64: nv: Describe trap behaviour of MDCR_EL2.HPMN
  KVM: arm64: nv: Honor MDCR_EL2.{TPM, TPMCR} in Host EL0
  KVM: arm64: nv: Reinject traps that take effect in Host EL0
  KVM: arm64: nv: Rename BEHAVE_FORWARD_ANY
  KVM: arm64: nv: Allow coarse-grained trap combos to use complex traps
  KVM: arm64: Describe RES0/RES1 bits of MDCR_EL2
  arm64: sysreg: Add new definitions for ID_AA64DFR0_EL1
  arm64: sysreg: Migrate MDCR_EL2 definition to table
  arm64: sysreg: Describe ID_AA64DFR2_EL1 fields

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  Merge branch kvm-arm64/mmio-sea into kvmarm/next  (Oliver Upton)

* kvm-arm64/mmio-sea:
  : Fix for SEA injection in response to MMIO
  :
  : Fix + test coverage for SEA injection in response to an unhandled MMIO
  : exit to userspace. Naturally, if userspace decides to abort an MMIO
  : instruction KVM shouldn't continue with instruction emulation...
  KVM: arm64: selftests: Add tests for MMIO external abort injection
  KVM: arm64: selftests: Convert to kernel's ESR terminology
  tools: arm64: Grab a copy of esr.h from kernel
  KVM: arm64: Don't retire aborted MMIO instruction

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  Merge branch kvm-arm64/misc into kvmarm/next  (Oliver Upton)

* kvm-arm64/misc:
  : Miscellaneous updates
  :
  : - Drop useless check against vgic state in ICC_CTLR_EL1.SEIS read
  :   emulation
  :
  : - Fix trap configuration for pKVM
  :
  : - Close the door on initialization bugs surrounding userspace irqchip
  :   static key by removing it.
  KVM: selftests: Don't bother deleting memslots in KVM when freeing VMs
  KVM: arm64: Get rid of userspace_irqchip_in_use
  KVM: arm64: Initialize trap register values in hyp in pKVM
  KVM: arm64: Initialize the hypervisor's VM state at EL2
  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  KVM: arm64: Move pkvm_vcpu_init_traps() to init_pkvm_hyp_vcpu()
  KVM: arm64: Don't map 'kvm_vgic_global_state' at EL2 with pKVM
  KVM: arm64: Just advertise SEIS as 0 when emulating ICC_CTLR_EL1

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  KVM: selftests: Don't bother deleting memslots in KVM when freeing VMs  (Sean Christopherson)

When freeing a VM, don't call into KVM to manually remove each memslot;
simply clean up and free any userspace assets associated with the memory
region. KVM is ultimately responsible for ensuring kernel resources are
freed when the VM is destroyed, deleting memslots one-by-one is
unnecessarily slow, and unless a test is already leaking the VM fd, the
VM will be destroyed when kvm_vm_release() is called. Not deleting KVM's
memslots also allows cleaning up VMs without having to care whether the
to-be-freed VM is dead or alive.

Reported-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/kvmarm/Zy0bcM0m-N18gAZz@google.com/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-11-11  nfsd: have nfsd4_deleg_getattr_conflict pass back write deleg pointer  (Jeff Layton)

Currently we pass back the size and whether it has been modified, but
those just mirror values tracked inside the delegation. In a later
patch, we'll need to get at the timestamps in the delegation too, so
just pass back a reference to the write delegation, and use that to
properly override values in the iattr.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  nfsd: drop the nfsd4_fattr_args "size" field  (Jeff Layton)

We already have a slot for this in the kstat structure. Just overwrite
that instead of keeping a copy.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  nfsd: drop the ncf_cb_bmap field  (Jeff Layton)

This is always the same value, and in a later patch we're going to need
to set bits in WORD2. We can simplify this code and save a little space
in the delegation too. Just hardcode the bitmap in the callback encode
function.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  nfsd: drop inode parameter from nfsd4_change_attribute()  (Jeff Layton)

The inode that nfs4_open_delegation() passes to this function is wrong,
which throws off the result. The inode will end up getting a
directory-style change attr instead of a regular-file-style one. Fix up
nfs4_delegation_stat() to fetch STATX_MODE, and then drop the inode
parameter from nfsd4_change_attribute(), since it's no longer needed.

Fixes: c5967721e106 ("NFSD: handle GETATTR conflict with write delegation")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: emit maxsize macros  (Chuck Lever)

Add "definitions" subcommand logic to emit maxsize macros in generated
code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: Add generator code for XDR width macros  (Chuck Lever)

Introduce logic in the code generators to emit maxsize (XDR width)
definitions. In C, these are pre-processor macros.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for union types  (Chuck Lever)

Not yet complete. The tool doesn't do any math yet. Thus, even though
the maximum XDR width of a union is the width of the union enumerator
plus the width of its largest arm, we're using the sum of all the
elements of the union for the moment. This means that buffer size
requirements are overestimated, and that the generated maxsize macro
cannot yet be used for determining data element alignment in the XDR
buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for pointer types  (Chuck Lever)

The XDR width of a pointer type is the sum of the widths of each of the
struct's fields, except for the last field. The width of the implicit
boolean "value follows" field is added as well.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for struct types  (Chuck Lever)

The XDR width of a struct type is the sum of the widths of each of the
struct's fields.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
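[Editor's note: an illustration with a hypothetical XDR input and
output; widths are counted in 4-byte XDR words, and the macro naming
follows the *_maxsz convention used by existing kernel XDR code.]

    /* For an XDR definition such as:
     *
     *     struct example {
     *         int    a;    // 1 XDR word
     *         hyper  b;    // 2 XDR words
     *     };
     *
     * the generator would emit the sum of the field widths: */
    #define example_maxsz (1 + 2)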
2024-11-11  xdrgen: XDR width for typedef  (Chuck Lever)

The XDR width of a typedef is the same as the width of the base type.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for optional_data type  (Chuck Lever)

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for variable-length array  (Chuck Lever)

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11  xdrgen: XDR width for fixed-length array  (Chuck Lever)

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>