linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-08-06	kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c	Suchit Karunakaran
	strcpy() performs no bounds checking and can lead to buffer overflows if the input string exceeds the destination buffer size. This patch replaces it with strncpy(), and null terminates the input string. Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Reviewed-by: Nicolas Schier <nicolas.schier@linux.dev> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-08-06	kconfig: lxdialog: replace strcpy with snprintf in print_autowrap	Suchit Karunakaran
	strcpy() does not perform bounds checking and can lead to buffer overflows if the source string exceeds the destination buffer size. In print_autowrap(), replace strcpy() with snprintf() to safely copy the prompt string into the fixed-size tempstr buffer. Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-08-05	net: ti: icssg-prueth: Fix skb handling for XDP_PASS	Meghana Malladi
	emac_rx_packet() is a common function for handling traffic for both xdp and non-xdp use cases. Use common logic for handling skb with or without xdp to prevent any incorrect packet processing. This patch fixes ping working with XDP_PASS for icssg driver. Fixes: 62aa3246f4623 ("net: ti: icssg-prueth: Add XDP support") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250803180216.3569139-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	net: Update threaded state in napi config in netif_set_threaded	Samiullah Khawaja
	Commit 2677010e7793 ("Add support to set NAPI threaded for individual NAPI") added support to enable/disable threaded napi using netlink. This also extended the napi config save/restore functionality to set the napi threaded state. This breaks netdev reset for drivers that use napi threaded at device level and also use napi config save/restore on napi_disable/napi_enable. Basically on netdev with napi threaded enabled at device level, a napi_enable call will get stuck trying to stop the napi kthread. This is because the napi->config->threaded is set to disabled when threaded is enabled at device level. The issue can be reproduced on virtio-net device using qemu. To reproduce the issue run following, echo 1 > /sys/class/net/threaded ethtool -L eth0 combined 1 Update the threaded state in napi config in netif_set_threaded and add a new test that verifies this scenario. Tested on qemu with virtio-net: NETIF=eth0 ./tools/testing/selftests/drivers/net/napi_threaded.py TAP version 13 1..2 ok 1 napi_threaded.change_num_queues ok 2 napi_threaded.enable_dev_threaded_disable_napi_threaded # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Fixes: 2677010e7793 ("Add support to set NAPI threaded for individual NAPI") Signed-off-by: Samiullah Khawaja <skhawaja@google.com> Link: https://patch.msgid.link/20250804164457.2494390-1-skhawaja@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	selftests: netdevsim: Xfail nexthop test on slow machines	Ido Schimmel
	A lot of test cases in the file are related to the idle and unbalanced timers of resilient nexthop groups and these tests are reported to be flaky on slow machines running debug kernels. Rather than marking a lot of individual tests with xfail_on_slow(), simply mark all the tests. Note that the test is stable on non-debug machines and that with debug kernels we are mainly interested in the output of various sanitizers in order to determine pass / fail. Before: # make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \ TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \ TEST_GEN_PROGS="" run_tests [...] # TEST: Bucket migration after idle timer (with delete) [FAIL] # Group expected to still be unbalanced [...] not ok 1 selftests: drivers/net/netdevsim: nexthop.sh # exit=1 After: # make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \ TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \ TEST_GEN_PROGS="" run_tests [...] # TEST: Bucket migration after idle timer (with delete) [XFAIL] # Group expected to still be unbalanced [...] ok 1 selftests: drivers/net/netdevsim: nexthop.sh Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20250729160609.02e0f157@kernel.org/ Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250804114320.193203-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	Merge branch 'eth-fbnic-fix-drop-stats-support'	Jakub Kicinski
	Mohsin Bashir says: ==================== eth: fbnic: Fix drop stats support Fix hardware drop stats support on the TX path of fbnic by addressing two issues: ensure that tx_dropped stats are correctly copied to the rtnl_link_stats64 struct, and protect the copying of drop stats from fdb->hw_stats to the local variable with the hw_stats_lock to ensure consistency. ==================== Link: https://patch.msgid.link/20250802024636.679317-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	eth: fbnic: Lock the tx_dropped update	Mohsin Bashir
	Wrap copying of drop stats on TX path from fbd->hw_stats by the hw_stats_lock. Currently, it is being performed outside the lock and another thread accessing fbd->hw_stats can lead to inconsistencies. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-3-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	eth: fbnic: Fix tx_dropped reporting	Mohsin Bashir
	Correctly copy the tx_dropped stats from the fbd->hw_stats to the rtnl_link_stats64 struct. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-2-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	eth: fbnic: remove the debugging trick of super high page bias	Jakub Kicinski
	Alex added page bias of LONG_MAX, which is admittedly quite a clever way of catching overflows of the pp ref count. The page pool code was "optimized" to leave the ref at 1 for freed pages so it can't catch basic bugs by itself any more. (Something we should probably address under DEBUG_NET...) Unfortunately for fbnic since commit f7dc3248dcfb ("skbuff: Optimization of SKB coalescing for page pool") core _may_ actually take two extra pp refcounts, if one of them is returned before driver gives up the bias the ret < 0 check in page_pool_unref_netmem() will trigger. While at it add a FBNIC_ to the name of the driver constant. Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free") Link: https://patch.msgid.link/20250801170754.2439577-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect	Heiner Kallweit
	After the call to phy_disconnect() netdev->phydev is reset to NULL. So fixed_phy_unregister() would be called with a NULL pointer as argument. Therefore cache the phy_device before this call. Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Link: https://patch.msgid.link/2b80a77a-06db-4dd7-85dc-3a8e0de55a1d@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	dt-bindings: net: Replace bouncing Alexandru Tachici emails	Krzysztof Kozlowski
	Emails to alexandru.tachici@analog.com bounce permanently: Remote Server returned '550 5.1.10 RESOLVER.ADR.RecipientNotFound; Recipient not found by SMTP address lookup' so replace him with Marcelo Schmitt from Analog. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Marcelo Schmitt <marcelo.schmitt@analog.com> Link: https://patch.msgid.link/20250724113758.61874-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05	vfio/type1: conditional rescheduling while pinning	Keith Busch
	A large DMA mapping request can loop through dma address pinning for many pages. In cases where THP can not be used, the repeated vmf_insert_pfn can be costly, so let the task reschedule as need to prevent CPU stalls. Failure to do so has potential harmful side effects, like increased memory pressure as unrelated rcu tasks are unable to make their reclaim callbacks and result in OOM conditions. rcu: INFO: rcu_sched self-detected stall on CPU rcu: 36-....: (20999 ticks this GP) idle=b01c/1/0x4000000000000000 softirq=35839/35839 fqs=3538 rcu: hardirqs softirqs csw/system rcu: number: 0 107 0 rcu: cputime: 50 0 10446 ==> 10556(ms) rcu: (t=21075 jiffies g=377761 q=204059 ncpus=384) ... <TASK> ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? walk_system_ram_range+0x63/0x120 ? walk_system_ram_range+0x46/0x120 ? pgprot_writethrough+0x20/0x20 lookup_memtype+0x67/0xf0 track_pfn_insert+0x20/0x40 vmf_insert_pfn_prot+0x88/0x140 vfio_pci_mmap_huge_fault+0xf9/0x1b0 [vfio_pci_core] __do_fault+0x28/0x1b0 handle_mm_fault+0xef1/0x2560 fixup_user_fault+0xf5/0x270 vaddr_get_pfns+0x169/0x2f0 [vfio_iommu_type1] vfio_pin_pages_remote+0x162/0x8e0 [vfio_iommu_type1] vfio_iommu_type1_ioctl+0x1121/0x1810 [vfio_iommu_type1] ? futex_wake+0x1c1/0x260 x64_sys_call+0x234/0x17a0 do_syscall_64+0x63/0x130 ? exc_page_fault+0x63/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20250715184622.3561598-1-kbusch@meta.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05	vfio/qat: add support for intel QAT 6xxx virtual functions	Małgorzata Mielnik
	Extend the qat_vfio_pci variant driver to support QAT 6xxx Virtual Functions (VFs). Add the relevant QAT 6xxx VF device IDs to the driver's probe table, enabling proper detection and initialization of these devices. Update the module description to reflect that the driver now supports all QAT generations. Signed-off-by: Małgorzata Mielnik <malgorzata.mielnik@intel.com> Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Link: https://lore.kernel.org/r/20250715081150.1244466-1-suman.kumar.chakraborty@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05	vfio/qat: Remove myself from VFIO QAT PCI driver maintainers	Xin Zeng
	Remove myself from VFIO QAT PCI driver maintainers as I'm leaving Intel. Signed-off-by: Xin Zeng <xin.zeng@intel.com> Link: https://lore.kernel.org/r/20250715001357.33725-1-xin.zeng@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05	vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD	Jason Gunthorpe
	This was missed during the initial implementation. The VFIO PCI encodes the vf_token inside the device name when opening the device from the group FD, something like: "0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3" This is used to control access to a VF unless there is co-ordination with the owner of the PF. Since we no longer have a device name in the cdev path, pass the token directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field indicated by VFIO_DEVICE_BIND_FLAG_TOKEN. Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD") Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-08-05	Input: add keycode for performance mode key	Marcos Alano
	Alienware calls this key "Performance Boost". Dell calls it "G-Mode". The goal is to have a specific keycode to detect when this key is pressed, so userspace can act upon it and do what have to do, usually starting the power profile for performance. Signed-off-by: Marcos Alano <marcoshalano@gmail.com> Link: https://lore.kernel.org/r/20250509193708.2190586-1-marcoshalano@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-08-06	Merge tag 'drm-intel-next-fixes-2025-08-05' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 fixes for v6.17-rc1: - Fixes around DP LFPS (Low-Frequency Periodic Signaling) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/e1147bede8f219682419d198022cfe8d9d4edc28@intel.com
2025-08-05	selftests/perf_events: Add a mmap() correctness test	Lorenzo Stoakes
	Exercise various mmap(), munmap() and mremap() invocations, which might cause a perf buffer mapping to be split or truncated. To avoid hard coding the perf event and having dependencies on architectures and configuration options, scan through event types in sysfs and try to open them. On success, try to mmap() and if that succeeds try to mmap() the AUX buffer. In case that no AUX buffer supporting event is found, only test the base buffer mapping. If no mappable event is found or permissions are not sufficient, skip the tests. Reserve a PROT_NONE region for both rb and aux tests to allow testing the case where mremap unmaps beyond the end of a mapped VMA to prevent it from unmapping unrelated mappings. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Co-developed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
2025-08-05	perf/core: Prevent VMA split of buffer mappings	Thomas Gleixner
	The perf mmap code is careful about mmap()'ing the user page with the ringbuffer and additionally the auxiliary buffer, when the event supports it. Once the first mapping is established, subsequent mapping have to use the same offset and the same size in both cases. The reference counting for the ringbuffer and the auxiliary buffer depends on this being correct. Though perf does not prevent that a related mapping is split via mmap(2), munmap(2) or mremap(2). A split of a VMA results in perf_mmap_open() calls, which take reference counts, but then the subsequent perf_mmap_close() calls are not longer fulfilling the offset and size checks. This leads to reference count leaks. As perf already has the requirement for subsequent mappings to match the initial mapping, the obvious consequence is that VMA splits, caused by resizing of a mapping or partial unmapping, have to be prevented. Implement the vm_operations_struct::may_split() callback and return unconditionally -EINVAL. That ensures that the mapping offsets and sizes cannot be changed after the fact. Remapping to a different fixed address with the same size is still possible as it takes the references for the new mapping and drops those of the old mapping. Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27504 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: stable@vger.kernel.org
2025-08-05	perf/core: Handle buffer mapping fail correctly in perf_mmap()	Thomas Gleixner
	After successful allocation of a buffer or a successful attachment to an existing buffer perf_mmap() tries to map the buffer read only into the page table. If that fails, the already set up page table entries are zapped, but the other perf specific side effects of that failure are not handled. The calling code just cleans up the VMA and does not invoke perf_mmap_close(). This leaks reference counts, corrupts user->vm accounting and also results in an unbalanced invocation of event::event_mapped(). Cure this by moving the event::event_mapped() invocation before the map_range() call so that on map_range() failure perf_mmap_close() can be invoked without causing an unbalanced event::event_unmapped() call. perf_mmap_close() undoes the reference counts and eventually frees buffers. Fixes: b709eb872e19 ("perf: map pages in advance") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: stable@vger.kernel.org
2025-08-05	perf/core: Exit early on perf_mmap() fail	Thomas Gleixner
	When perf_mmap() fails to allocate a buffer, it still invokes the event_mapped() callback of the related event. On X86 this might increase the perf_rdpmc_allowed reference counter. But nothing undoes this as perf_mmap_close() is never called in this case, which causes another reference count leak. Return early on failure to prevent that. Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: stable@vger.kernel.org
2025-08-05	perf/core: Don't leak AUX buffer refcount on allocation failure	Thomas Gleixner
	Failure of the AUX buffer allocation leaks the reference count. Set the reference count to 1 only when the allocation succeeds. Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: stable@vger.kernel.org
2025-08-05	perf/core: Preserve AUX buffer allocation failure result	Thomas Gleixner
	A recent overhaul sets the return value to 0 unconditionally after the allocations, which causes reference count leaks and corrupts the user->vm accounting. Preserve the AUX buffer allocation failure return value, so that the subsequent code works correctly. Fixes: 0983593f32c4 ("perf/core: Lift event->mmap_mutex in perf_mmap()") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: stable@vger.kernel.org
2025-08-05	RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages	Pedro Falcato
	Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"), we have been doing this: static int siw_tcp_sendpages(struct socket s, struct page page, int offset, size_t size) [...] / Calculate the number of bytes we need to push, for this page * specifically / size_t bytes = min_t(size_t, PAGE_SIZE - offset, size); / If we can't splice it, then copy it in, as normal / if (!sendpage_ok(page[i])) msg.msg_flags &= ~MSG_SPLICE_PAGES; / Set the bvec pointing to the page, with len $bytes / bvec_set_page(&bvec, page[i], bytes, offset); / Set the iter to $size, aka the size of the whole sendpages (!!!) / iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); try_page_again: lock_sock(sk); / Sendmsg with $size size (!!!) / rv = tcp_sendmsg_locked(sk, &msg, size); This means we've been sending oversized iov_iters and tcp_sendmsg calls for a while. This has a been a benign bug because sendpage_ok() always returned true. With the recent slab allocator changes being slowly introduced into next (that disallow sendpage on large kmalloc allocations), we have recently hit out-of-bounds crashes, due to slight differences in iov_iter behavior between the MSG_SPLICE_PAGES and "regular" copy paths: (MSG_SPLICE_PAGES) skb_splice_from_iter iov_iter_extract_pages iov_iter_extract_bvec_pages uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere skb_splice_from_iter gets a "short" read (!MSG_SPLICE_PAGES) skb_copy_to_page_nocache copy=iov_iter_count [...] copy_from_iter / this doesn't help */ if (unlikely(iter->count < len)) len = iter->count; iterate_bvec ... and we run off the bvecs Fix this by properly setting the iov_iter's byte count, plus sending the correct byte count to tcp_sendmsg_locked. Link: https://patch.msgid.link/r/20250729120348.495568-1-pfalcato@suse.de Cc: stable@vger.kernel.org Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Bernard Metzler <bernard.metzler@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-08-05	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux	Linus Torvalds
	Pull ARM update from Russell King: "Just one development update this time: - Finish removing Coresight support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9449/1: coresight: Finish removal of Coresight support in arch/arm/kernel
2025-08-05	Merge tag 'exfat-for-6.17-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Use generic_write_sync instead of vfs_fsync_range in exfat_file_write_iter. It will fix an issue where fdatasync would be set incorrectly. - Fix potential infinite loop by the self-linked chain. * tag 'exfat-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: add cluster chain loop check for dir exfat: fdatasync flag should be same like generic_write_sync()
2025-08-05	Merge tag 'mm-stable-2025-08-03-12-35' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: "Significant patch series in this pull request: - "mseal cleanups" (Lorenzo Stoakes) Some mseal cleaning with no intended functional change. - "Optimizations for khugepaged" (David Hildenbrand) Improve khugepaged throughput by batching PTE operations for large folios. This gain is mainly for arm64. - "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport) A bugfix, additional debug code and cleanups to the execmem code. - "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song) Bugfixes, cleanups and performance improvememnts to the mTHP swapin code" * tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits) mm: mempool: fix crash in mempool_free() for zero-minimum pools mm: correct type for vmalloc vm_flags fields mm/shmem, swap: fix major fault counting mm/shmem, swap: rework swap entry and index calculation for large swapin mm/shmem, swap: simplify swapin path and result handling mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO mm/shmem, swap: tidy up swap entry splitting mm/shmem, swap: tidy up THP swapin checks mm/shmem, swap: avoid redundant Xarray lookup during swapin x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations execmem: drop writable parameter from execmem_fill_trapping_insns() execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP) execmem: move execmem_force_rw() and execmem_restore_rox() before use execmem: rework execmem_cache_free() execmem: introduce execmem_alloc_rw() execmem: drop unused execmem_update_copy() mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped mm/rmap: add anon_vma lifetime debug check mm: remove mm/io-mapping.c ...
2025-08-05	s390/mm: Allocate page table with PAGE_SIZE granularity	Sumanth Korikkar
	Make vmem_pte_alloc() consistent by always allocating page table of PAGE_SIZE granularity, regardless of whether page_table_alloc() (with slab) or memblock_alloc() is used. This ensures page table can be fully freed when the corresponding page table entries are removed. Fixes: d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-08-05	LoongArch: vDSO: Remove -nostdlib complier flag	Wentao Guan
	Since $(LD) is directly used, hence -nostdlib is unneeded, MIPS has removed this, we should remove it too. bdbf2038fbf4 ("MIPS: VDSO: remove -nostdlib compiler flag"). In fact, other architectures also use $(LD) now. fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO") 691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO") 2ff906994b6c ("MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO") 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO") Cc: stable@vger.kernel.org Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: dts: Add eMMC/SDIO controller support to Loongson-2K2000	Binbin Zhou
	The Loongson-2K2000 integrates one eMMC controller and one SDIO controller. The module is supported now, enable it. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: dts: Add SDIO controller support to Loongson-2K1000	Binbin Zhou
	The Loongson-2K1000 integrates one SDIO controller for SD storage cards and SDIO cards. The module is supported now, enable it. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: dts: Add SDIO controller support to Loongson-2K0500	Binbin Zhou
	The Loongson-2K0500 integrates two SDIO controllers for SD storage cards and SDIO cards, supporting SD storage card boot. The module is supported now, enable it. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Set bpf_jit_bypass_spec_v1/v4()	Tiezhu Yang
	JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to skip analysis/patching for the respective vulnerability, it is safe to set both bpf_jit_bypass_spec_v1/v4(), because there is no speculation barrier instruction for LoongArch. Suggested-by: Luis Gerhorst <luis.gerhorst@fau.de> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Fix the tailcall hierarchy	Haoran Jiang
	In specific use cases combining tailcalls and BPF-to-BPF calls， MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt back-propagation from callee to caller. This patch fixes this tailcall issue caused by abusing the tailcall in bpf2bpf feature on LoongArch like the way of "bpf, x64: Fix tailcall hierarchy". Push tail_call_cnt_ptr and tail_call_cnt into the stack, tail_call_cnt_ptr is passed between tailcall and bpf2bpf, uses tail_call_cnt_ptr to increment tail_call_cnt. Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls") Reviewed-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Fix jump offset calculation in tailcall	Haoran Jiang
	The extra pass of bpf_int_jit_compile() skips JIT context initialization which essentially skips offset calculation leaving out_offset = -1, so the jmp_offset in emit_bpf_tail_call is calculated by "#define jmp_offset (out_offset - (cur_offset))" is a negative number, which is wrong. The final generated assembly are as follow. 54: bgeu $a2, $t1, -8 # 0x0000004c 58: addi.d $a6, $s5, -1 5c: bltz $a6, -16 # 0x0000004c 60: alsl.d $t2, $a2, $a1, 0x3 64: ld.d $t2, $t2, 264 68: beq $t2, $zero, -28 # 0x0000004c Before apply this patch, the follow test case will reveal soft lock issues. cd tools/testing/selftests/bpf/ ./test_progs --allow=tailcalls/tailcall_bpf2bpf_1 dmesg: watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [test_progs:25056] Cc: stable@vger.kernel.org Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Add struct ops support for trampoline	Tiezhu Yang
	Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper prologue and epilogue for this case. With this patch, all of the struct_ops related testcases (except struct_ops_multi_pages) passed on LoongArch. The testcase struct_ops_multi_pages failed is because the actual image_pages_cnt is 40 which is bigger than MAX_TRAMP_IMAGE_PAGES. Before: $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages ... WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds... After: $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages ... #15 bad_struct_ops:OK ... #399 struct_ops_autocreate:OK ... #400 struct_ops_kptr_return:OK ... #401 struct_ops_maybe_null:OK ... #402 struct_ops_module:OK ... #404 struct_ops_no_cfi:OK ... #405 struct_ops_private_stack:SKIP ... #406 struct_ops_refcounted:OK Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Add basic bpf trampoline support	Chenghao Duan
	BPF trampoline is the critical infrastructure of the BPF subsystem, acting as a mediator between kernel functions and BPF programs. Numerous important features, such as using BPF program for zero overhead kernel introspection, rely on this key component. The related tests have passed, including the following technical points: 1. fentry 2. fmod_ret 3. fexit The following related testcases passed on LoongArch: sudo ./test_progs -a fentry_test/fentry sudo ./test_progs -a fexit_test/fexit sudo ./test_progs -a fentry_fexit sudo ./test_progs -a modify_return sudo ./test_progs -a fexit_sleep sudo ./test_progs -a test_overhead sudo ./test_progs -a trampoline_count This issue was first reported by Geliang Tang in June 2024 while debugging MPTCP BPF selftests on a LoongArch machine (see commit eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca"). Geliang, Huacai, and Tiezhu then worked together to drive the implementation of this feature, encouraging broader collaboration among Chinese kernel engineers. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/ Reported-by: Geliang Tang <geliang@kernel.org> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Tested-by: Vincent Li <vincent.mc.li@gmail.com> Co-developed-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Add dynamic code modification support	Chenghao Duan
	This commit adds support for BPF dynamic code modification on the LoongArch architecture: 1. Add bpf_arch_text_copy() for instruction block copying. 2. Add bpf_arch_text_poke() for runtime instruction patching. 3. Add bpf_arch_text_invalidate() for code invalidation. On LoongArch, since symbol addresses in the direct mapping region can't be reached via relative jump instructions from the paged mapping region, we use the move_imm+jirl instruction pair as absolute jump instructions. These require 2-5 instructions, so we reserve 5 NOP instructions in the program as placeholders for function jumps. The larch_insn_text_copy() function is solely used for BPF. And the use of larch_insn_text_copy() requires PAGE_SIZE alignment. Currently, only the size of the BPF trampoline is page-aligned. Co-developed-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	LoongArch: BPF: Rename and refactor validate_code()	Chenghao Duan
	1. Rename the existing validate_code() to validate_ctx() 2. Factor out the code validation handling into a new helper validate_code() Then: * validate_code() is used to check the validity of code. * validate_ctx() is used to check both code validity and table entry correctness. The new validate_code() will be used in subsequent changes. Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Co-developed-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-08-05	vhost: initialize vq->nheads properly	Jason Wang
	Commit 7918bb2d19c9 ("vhost: basic in order support") introduces vq->nheads to store the number of batched used buffers per used elem but it forgets to initialize the vq->nheads to NULL in vhost_dev_init() this will cause kfree() that would try to free it without be allocated if SET_OWNER is not called. Reported-by: JAEHOON KIM <jhkim@linux.ibm.com> Reported-by: Breno Leitao <leitao@debian.org> Fixes: 45347e79b544 ("vhost: basic in order support") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20250729073916.80647-1-jasowang@redhat.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Tested-by: Breno Leitao <leitao@debian.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Jaehoon Kim <jhkim@linux.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-08-04	dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET	Geert Uytterhoeven
	When making ZL3073X invisible, it was overlooked that ZL3073X depends on NET, while ZL3073X_I2C and ZL3073X_SPI do not, causing: WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_I2C WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_SPI WARNING: unmet direct dependencies detected for ZL3073X Depends on [n]: NET [=n] Selected by [y]: - ZL3073X_I2C [=y] && I2C [=y] Selected by [y]: - ZL3073X_SPI [=y] && SPI [=y] Fix this by adding the missing dependencies to ZL3073X_I2C and ZL3073X_SPI. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508022110.nTqZ5Ylu-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202508022351.NHIxPF8j-lkp@intel.com/ Fixes: a4f0866e3dbbf3fe ("dpll: Make ZL3073X invisible") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20250802155302.3673457-1-geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing	Maher Azzouzi
	TCA_MQPRIO_TC_ENTRY_INDEX is validated using NLA_POLICY_MAX(NLA_U32, TC_QOPT_MAX_QUEUE), which allows the value TC_QOPT_MAX_QUEUE (16). This leads to a 4-byte out-of-bounds stack write in the fp[] array, which only has room for 16 elements (0–15). Fix this by changing the policy to allow only up to TC_QOPT_MAX_QUEUE - 1. Fixes: f62af20bed2d ("net/sched: mqprio: allow per-TC user input of FP adminStatus") Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Maher Azzouzi <maherazz04@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250802001857.2702497-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	Revert "net: mdio_bus: Use devm for getting reset GPIO"	Jakub Kicinski
	This reverts commit 3b98c9352511db627b606477fc7944b2fa53a165. Russell says: Using devm_() [here] is completely wrong, because this is called from mdiobus_register_device(). This is not the probe function for the device, and thus there is no code to trigger the release of the resource on unregistration. Moreover, when the mdiodev is eventually probed, if the driver fails or the driver is unbound, the GPIO will be released, but a reference will be left behind. Using devm with a struct device that is not currently being probed is fundamentally wrong - an abuse of devm. Reported-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/95449490-fa58-41d4-9493-c9213c1f2e7d@sirena.org.uk Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Fixes: 3b98c9352511 ("net: mdio_bus: Use devm for getting reset GPIO") Link: https://patch.msgid.link/20250801212742.2607149-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	selftests: net: packetdrill: xfail all problems on slow machines	Jakub Kicinski
	We keep seeing flakes on packetdrill on debug kernels, while non-debug kernels are stable, not a single flake in 200 runs. Time to give up, debug kernels appear to suffer from 10msec latency spikes and any timing-sensitive test is bound to flake. Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250801181638.2483531-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	net/packet: fix a race in packet_set_ring() and packet_notifier()	Quang Le
	When packet_set_ring() releases po->bind_lock, another thread can run packet_notifier() and process an NETDEV_UP event. This race and the fix are both similar to that of commit 15fe076edea7 ("net/packet: fix a race in packet_bind() and packet_notifier()"). There too the packet_notifier NETDEV_UP event managed to run while a po->bind_lock critical section had to be temporarily released. And the fix was similarly to temporarily set po->num to zero to keep the socket unhooked until the lock is retaken. The po->bind_lock in packet_set_ring and packet_notifier precede the introduction of git history. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Quang Le <quanglex97@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250801175423.2970334-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	benet: fix BUG when creating VFs	Michal Schmidt
	benet crashes as soon as SRIOV VFs are created: kernel BUG at mm/vmalloc.c:3457! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 4 UID: 0 PID: 7408 Comm: test.sh Kdump: loaded Not tainted 6.16.0+ #1 PREEMPT(voluntary) [...] RIP: 0010:vunmap+0x5f/0x70 [...] Call Trace: <TASK> __iommu_dma_free+0xe8/0x1c0 be_cmd_set_mac_list+0x3fe/0x640 [be2net] be_cmd_set_mac+0xaf/0x110 [be2net] be_vf_eth_addr_config+0x19f/0x330 [be2net] be_vf_setup+0x4f7/0x990 [be2net] be_pci_sriov_configure+0x3a1/0x470 [be2net] sriov_numvfs_store+0x20b/0x380 kernfs_fop_write_iter+0x354/0x530 vfs_write+0x9b9/0xf60 ksys_write+0xf3/0x1d0 do_syscall_64+0x8c/0x3d0 be_cmd_set_mac_list() calls dma_free_coherent() under a spin_lock_bh. Fix it by freeing only after the lock has been released. Fixes: 1a82d19ca2d6 ("be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250801101338.72502-1-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	net: airoha: npu: Add missing MODULE_FIRMWARE macros	Lorenzo Bianconi
	Introduce missing MODULE_FIRMWARE definitions for firmware autoload. Fixes: 23290c7bc190d ("net: airoha: Introduce Airoha NPU support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250801-airoha-npu-missing-module-firmware-v2-1-e860c824d515@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	net: devmem: fix DMA direction on unmapping	Jakub Kicinski
	Looks like we always unmap the DMA_BUF with DMA_FROM_DEVICE direction. While at it unexport __net_devmem_dmabuf_binding_free(), it's internal. Found by code inspection. Fixes: bd61848900bf ("net: devmem: Implement TX path") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250801011335.2267515-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	ipa: fix compile-testing with qcom-mdt=m	Arnd Bergmann
	There are multiple drivers that use the qualcomm mdt loader, but they have conflicting ideas of how to deal with that dependency when compile-testing for non-qualcomm targets: IPA only enables the MDT loader when the kernel config includes ARCH_QCOM, but the newly added ath12k support always enables it, which leads to a link failure with the combination of IPA=y and ATH12K=m: aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_firmware_load': ipa_main.c:(.text.unlikely+0x134): undefined reference to `qcom_mdt_load The ATH12K method seems more reliable here, so change IPA over to do the same thing. Fixes: 38a4066f593c ("net: ipa: support COMPILE_TEST") Fixes: c0dd3f4f7091 ("wifi: ath12k: enable ath12k AHB support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20250731080024.2054904-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04	eth: fbnic: unlink NAPIs from queues on error to open	Jakub Kicinski
	CI hit a UaF in fbnic in the AF_XDP portion of the queues.py test. The UaF is in the __sk_mark_napi_id_once() call in xsk_bind(), NAPI has been freed. Looks like the device failed to open earlier, and we lack clearing the NAPI pointer from the queue. Fixes: 557d02238e05 ("eth: fbnic: centralize the queue count and NAPI<>queue setting") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250728163129.117360-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>