path: root/include/linux
Age  Commit message  Author
2025-06-24  usb: typec: altmodes/displayport: do not index invalid pin_assignments  [RD Babiera]
A poorly implemented DisplayPort Alt Mode port partner can indicate that its pin assignment capabilities are greater than the maximum value, DP_PIN_ASSIGN_F. In this case, calls to pin_assignment_show will cause a BRK exception due to an out-of-bounds array access. Prevent the for loop in pin_assignment_show from accessing invalid values in pin_assignments by adding a DP_PIN_ASSIGN_MAX value in typec_dp.h and using i < DP_PIN_ASSIGN_MAX as the loop condition. Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable <stable@kernel.org> Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250618224943.3263103-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
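A minimal sketch of the bounded iteration this implies, assuming the usual BIT(i)-style capability mask; the helper and array names follow the driver's conventions but are illustrative here, not the verbatim patch:

  /*
   * Illustrative: iterate the pin assignment bits with an explicit upper
   * bound so a bogus capability mask from the partner can never index
   * past the pin_assignments[] table.
   */
  for (i = 0; i < DP_PIN_ASSIGN_MAX; i++) {
          if (!(BIT(i) & cap))
                  continue;
          len += sysfs_emit_at(buf, len, "%s ", pin_assignments[i]);
  }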
2025-06-24  tty: fix tty_port_tty_*hangup() kernel-doc  [Jiri Slaby (SUSE)]
The commit below added a new helper, but omitted to move (and add) the corresponding kernel-doc. Do it now. Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Fixes: 2b5eac0f8c6e ("tty: introduce and use tty_port_tty_vhangup() helper") Link: https://lore.kernel.org/all/b23d566c-09dc-7374-cc87-0ad4660e8b2e@linux.intel.com/ Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/20250624080641.509959-6-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-24  fs: Remove three arguments from block_write_end()  [Matthew Wilcox (Oracle)]
block_write_end() looks like it can be used as a ->write_end() implementation. However, it can't as it does not unlock nor put the folio. Since it does not use the 'file', 'mapping' nor 'fsdata' arguments, remove them. Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Link: https://lore.kernel.org/20250624132130.1590285-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
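A rough before/after of the prototype implied by the description; argument types are assumed from the folio-based buffer-head API and may not match the merged patch exactly:

  /* Before: unused arguments carried along for ->write_end() symmetry. */
  int block_write_end(struct file *file, struct address_space *mapping,
                      loff_t pos, unsigned len, unsigned copied,
                      struct folio *folio, void *fsdata);

  /* After: only what the helper actually uses. */
  int block_write_end(loff_t pos, unsigned len, unsigned copied,
                      struct folio *folio);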
2025-06-24  net: pse-pd: tps23881: Clarify setup_pi_matrix callback documentation  [Kory Maincent]
Improve the setup_pi_matrix callback documentation to clarify its purpose and usage. The enhanced description explains that PSE PI devicetree nodes are pre-parsed before this callback is invoked, and drivers should utilize pcdev->pi[x]->pairset[y].np to map PSE controller hardware ports to their corresponding Power Interfaces. This clarification helps driver implementers understand the callback's role in establishing the hardware-to-PI relationship mapping. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250620-poe_doc_improve-v1-2-96357bb95d52@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-24  net: phy: Add interface types for 50G and 100G  [Alexander Duyck]
Add support for 802.3cd based interface types 50GBASE-R and 100GBASE-P. This choice in naming is based on section 135 of the 802.3-2022 IEEE Standard. In addition, it adds support for what I am referring to as LAUI, which is based on annex 135C of the IEEE Standard and shares many similarities with the 25/50G consortium. The main difference between the two is that the IEEE spec refers to LAUI as the AUI before the RS(544/514) FEC, whereas the 25/50G consortium uses this lane and frequency combination after going through RS(528/514), Base-R, or no FEC at all. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/175028444205.625704.4191700324472974116.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-23  PCI/pwrctrl: Fix the kerneldoc tag for private fields  [Bartosz Golaszewski]
The correct tag for marking private fields in kerneldoc is "private:", not capitalized "Private:". Fix the pwrctrl struct to silence the following warnings:

  Warning: include/linux/pci-pwrctrl.h:45 struct member 'nb' not described in 'pci_pwrctrl'
  Warning: include/linux/pci-pwrctrl.h:45 struct member 'link' not described in 'pci_pwrctrl'
  Warning: include/linux/pci-pwrctrl.h:45 struct member 'work' not described in 'pci_pwrctrl'

Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code") Reported-by: Bjorn Helgaas <helgaas@kernel.org> Closes: https://lore.kernel.org/all/20250617233539.GA1177120@bhelgaas/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250618091129.44810-1-brgl@bgdev.pl
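For reference, the lowercase marker kernel-doc expects looks like this (a generic illustration with assumed member types, not the pci-pwrctrl.h header verbatim):

  /**
   * struct pci_pwrctrl - PCI device power control context.
   * @dev: Address of the power controlling device.
   */
  struct pci_pwrctrl {
          struct device *dev;

          /* private: internal use only */
          struct notifier_block nb;
          struct device_link *link;
          struct work_struct work;
  };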
2025-06-23  workqueue: Remove unused work_on_cpu_safe  [Dr. David Alan Gilbert]
The last use of the work_on_cpu_safe() macro was removed recently by commit 9cda46babdfe ("crypto: n2 - remove Niagara2 SPU driver") Remove it, and the work_on_cpu_safe_key() function it calls. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-23  replace collect_mounts()/drop_collected_mounts() with a safer variant  [Al Viro]
collect_mounts() has several problems: one can't iterate over the results directly, so it has to be done with a callback passed to iterate_mounts(); it has an oopsable race with d_invalidate(); and it creates temporary clones of mounts invisibly for sync umount (IOW, you can have a non-lazy umount succeed, leaving the filesystem not mounted anywhere and yet still busy). A saner approach is to give the caller an array of struct path that would pin every mount in a subtree, without cloning any mounts.

* collect_mounts()/drop_collected_mounts()/iterate_mounts() is gone.

* collect_paths(where, preallocated, size) gives either ERR_PTR(-E...) or a pointer to an array of struct path, one for each chunk of tree visible under 'where' (i.e. the first element is a copy of where, followed by (mount, root) for everything mounted under it - the same set collect_mounts() would give). Unlike collect_mounts(), the mounts are *not* cloned - we just get pinning references to the roots of the subtrees in the caller's namespace. The array is terminated by a {NULL, NULL} struct path. If it fits into the preallocated array (on-stack, normally), that's where it goes; otherwise it's allocated by kmalloc_array(). Passing 0 as size means that 'preallocated' is ignored (and expected to be NULL).

* drop_collected_paths(paths, preallocated) is given the array returned by an earlier call of collect_paths() and the preallocated array passed to that call. All mount/dentry references are dropped and the array is kfree'd if it's not equal to 'preallocated'.

* Instead of iterate_mounts(), users should just iterate over the array of struct path - nothing exotic is needed for that. Existing users (all in audit_tree.c) are converted.

[folded a fix for a braino reported by Venkat Rao Bagalkote <venkat88@linux.ibm.com>] Fixes: 80b5dce8c59b0 ("vfs: Add a function to lazily unmount all mounts from any dentry") Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
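A hedged usage sketch pieced together from the description above; visit() is a hypothetical per-mount helper, and the audit_tree.c conversion is the authoritative example:

  /*
   * Pin every mount visible under 'where' without cloning anything, walk
   * the resulting {mnt, dentry} pairs, then drop the references.
   */
  struct path prealloc[16], *paths, *p;

  paths = collect_paths(&where, prealloc, ARRAY_SIZE(prealloc));
  if (IS_ERR(paths))
          return PTR_ERR(paths);

  for (p = paths; p->mnt; p++)    /* array is {NULL, NULL} terminated */
          visit(p);

  drop_collected_paths(paths, prealloc);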
2025-06-23  drm/xe/nvm: add support for non-posted erase  [Reuven Abliyev]
The erase command is slow on discrete graphics storage and may overshoot the PCI completion timeout. BMG introduces the ability to have non-posted erase. Add driver support for non-posted erase with polling for erase completion. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Reuven Abliyev <reuven.abliyev@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Link: https://lore.kernel.org/r/20250617145159.3803852-9-alexander.usyskin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-06-23  mtd: add driver for intel graphics non-volatile memory device  [Alexander Usyskin]
Add auxiliary driver for intel discrete graphics non-volatile memory device. CC: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Co-developed-by: Tomas Winkler <tomasw@gmail.com> Signed-off-by: Tomas Winkler <tomasw@gmail.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Link: https://lore.kernel.org/r/20250617145159.3803852-2-alexander.usyskin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-06-23  sched/wait: Add a waitqueue helper for fully exclusive priority waiters  [Sean Christopherson]
Add a waitqueue helper to add a priority waiter that requires exclusive wakeups, i.e. that requires that it be the _only_ priority waiter. The API will be used by KVM to ensure that at most one of KVM's irqfds is bound to a single eventfd (across the entire kernel). Open code the helper instead of using __add_wait_queue() so that the common path doesn't need to "handle" impossible failures. Cc: K Prateek Nayak <kprateek.nayak@amd.com> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250522235223.3178519-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
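A sketch of the semantics described here, open-coding the list insertion as noted above; the helper name, locking details, and exact checks are assumptions rather than the merged code:

  /* Add an exclusive priority waiter, failing if a priority waiter already exists. */
  static int add_wait_queue_priority_exclusive(struct wait_queue_head *wq_head,
                                               struct wait_queue_entry *wq_entry)
  {
          unsigned long flags;
          int ret = 0;

          spin_lock_irqsave(&wq_head->lock, flags);
          if (!list_empty(&wq_head->head) &&
              (list_first_entry(&wq_head->head, struct wait_queue_entry,
                                entry)->flags & WQ_FLAG_PRIORITY)) {
                  ret = -EBUSY;   /* another priority waiter got there first */
          } else {
                  wq_entry->flags |= WQ_FLAG_EXCLUSIVE | WQ_FLAG_PRIORITY;
                  list_add(&wq_entry->entry, &wq_head->head);
          }
          spin_unlock_irqrestore(&wq_head->lock, flags);

          return ret;
  }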
2025-06-23  KVM: Use a local struct to do the initial vfs_poll() on an irqfd  [Sean Christopherson]
Use a function-local struct for the poll_table passed to vfs_poll(), as nothing in the vfs_poll() callchain grabs a long-term reference to the structure, i.e. its lifetime doesn't need to be tied to the irqfd. Using a local structure will also allow propagating failures out of the polling callback without further polluting kvm_kernel_irqfd. Opportunistically rename irqfd_ptable_queue_proc() to kvm_irqfd_register() to capture what it actually does. Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250522235223.3178519-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  iommu/amd: KVM: SVM: Allow KVM to control need for GA log interrupts  [Sean Christopherson]
Add plumbing to the AMD IOMMU driver to allow KVM to control whether or not an IRTE is configured to generate GA log interrupts. KVM only needs a notification if the target vCPU is blocking, so the vCPU can be awakened. If a vCPU is preempted or exits to userspace, KVM clears is_run, but will set the vCPU back to running when userspace does KVM_RUN and/or the vCPU task is scheduled back in, i.e. KVM doesn't need a notification. Unconditionally pass "true" in all KVM paths to isolate the IOMMU changes from the KVM changes insofar as possible. Opportunistically swap the ordering of parameters for amd_iommu_update_ga() so that they match amd_iommu_activate_guest_mode(). Note, as of this writing, the AMD IOMMU manual doesn't list GALogIntr as a non-cached field, but per AMD hardware architects, it's not cached and can be safely updated without an invalidation. Link: https://lore.kernel.org/all/b29b8c22-2fd4-4b5e-b755-9198874157c7@amd.com Cc: Vasant Hegde <vasant.hegde@amd.com> Cc: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20250611224604.313496-62-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  iommu/amd: KVM: SVM: Set pCPU info in IRTE when setting vCPU affinity  [Sean Christopherson]
Now that setting vCPU affinity is guarded with ir_list_lock, i.e. now that avic_physical_id_entry can be safely accessed, set the pCPU info straight-away when setting vCPU affinity. Putting the IRTE into posted mode, and then immediately updating the IRTE a second time if the target vCPU is running is wasteful and confusing. This also fixes a flaw where a posted IRQ that arrives between putting the IRTE into guest_mode and setting the correct destination could cause the IOMMU to ring the doorbell on the wrong pCPU. Link: https://lore.kernel.org/r/20250611224604.313496-44-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  iommu/amd: KVM: SVM: Infer IsRun from validity of pCPU destination  [Sean Christopherson]
Infer whether or not a vCPU should be marked running from the validity of the pCPU on which it is running. amd_iommu_update_ga() already skips the IRTE update if the pCPU is invalid, i.e. passing %true for is_run with an invalid pCPU would be a blatant and egregious KVM bug. Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-42-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  KVM: Fold kvm_arch_irqfd_route_changed() into kvm_arch_update_irqfd_routing()  [Sean Christopherson]
Fold kvm_arch_irqfd_route_changed() into kvm_arch_update_irqfd_routing(). Calling arch code to know whether or not to call arch code is absurd. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250611224604.313496-35-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  KVM: Don't WARN if updating IRQ bypass route fails  [Sean Christopherson]
Don't bother WARNing if updating an IRTE route fails now that vendor code provides much more precise WARNs. The generic WARN doesn't provide enough information to actually debug the problem, and has obviously done nothing to surface the myriad bugs in KVM x86's implementation. Drop all of the associated return code plumbing that existed just so that common KVM could WARN. Link: https://lore.kernel.org/r/20250611224604.313496-34-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  iommu: KVM: Split "struct vcpu_data" into separate AMD vs. Intel structs  [Sean Christopherson]
Split the vcpu_data structure that serves as a handoff from KVM to IOMMU drivers into vendor specific structures. Overloading a single structure makes the code hard to read and maintain, is *very* misleading as it suggests that mixing vendors is actually supported, and bastardizing Intel's posted interrupt descriptor address when AMD's IOMMU already has its own structure is quite unnecessary. Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-33-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  iommu/amd: KVM: SVM: Use pi_desc_addr to derive ga_root_ptr  [Sean Christopherson]
Use vcpu_data.pi_desc_addr instead of amd_iommu_pi_data.base to get the GA root pointer. KVM is the only source of amd_iommu_pi_data.base, and KVM's one and only path for writing amd_iommu_pi_data.base computes the exact same value for vcpu_data.pi_desc_addr and amd_iommu_pi_data.base, and fills amd_iommu_pi_data.base if and only if vcpu_data.pi_desc_addr is valid, i.e. amd_iommu_pi_data.base is fully redundant. Cc: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-23-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23  io_uring: add struct io_cold_def->sqe_copy() method  [Jens Axboe]
Will be called by the core of io_uring, if inline issue is not going to be tried for a request. Opcodes can define this handler to defer copying of SQE data that should remain stable. Only called if IO_URING_F_INLINE is set. If it isn't set, then there's a bug in the core handling of this, and -EFAULT will be returned instead to terminate the request. This will trigger a WARN_ON_ONCE(). Don't expect this to ever trigger, and down the line this can be removed. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-23  io_uring: add IO_URING_F_INLINE issue flag  [Jens Axboe]
Set when the execution of the request is done inline from the system call itself. Any deferred issue will never have this flag set. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-23  futex: Initialize futex_phash_new during fork().  [Sebastian Andrzej Siewior]
During a hash resize operation the new private hash is stored in mm_struct::futex_phash_new if the current hash can not be immediately replaced. The new hash must not be copied during fork() into the new task. Doing so will lead to a double-free of the memory by the two tasks. Initialize the mm_struct::futex_phash_new during fork(). Closes: https://lore.kernel.org/all/aFBQ8CBKmRzEqIfS@mozart.vkv.me/ Fixes: bd54df5ea7cad ("futex: Allow to resize the private local hash") Reported-by: Calvin Owens <calvin@wbinvd.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Calvin Owens <calvin@wbinvd.org> Link: https://lkml.kernel.org/r/20250623083408.jTiJiC6_@linutronix.de
2025-06-23  fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate  [Zhang Yi]
With the development of flash-based storage devices, we can quickly write zeros to SSDs using the write zeroes command if the devices do not actually write physical zeroes to the media. Therefore, we can use this command to quickly preallocate a real all-zero file with written extents. This approach should be beneficial for subsequent pure overwriting within this file, as it can save on block allocation and, consequently, significant metadata changes, which should greatly improve overwrite performance on certain filesystems. Therefore, introduce a new operation FALLOC_FL_WRITE_ZEROES to fallocate. This flag is used to convert a specified range of a file to zeros by issuing a zeroing operation. Blocks should be allocated for the regions that span holes in the file, and the entire range is converted to written extents. If the underlying device supports the actual offload write zeroes command, the zeroing-out operation can be accelerated. If it does not, we currently don't prevent the file system from writing actual zeros to the device. This provides users with a new method to quickly generate a zeroed file; users no longer need to write zero data to create a file with written extents. Users can determine whether a disk supports the unmap write zeroes feature by querying this sysfs interface: /sys/block/<disk>/queue/write_zeroes_unmap_max_hw_bytes Users can also enable or disable the unmap write zeroes operation through this sysfs interface: /sys/block/<disk>/queue/write_zeroes_unmap_max_bytes Finally, this flag cannot be specified in conjunction with FALLOC_FL_KEEP_SIZE, since allocating written extents beyond the file EOF is not permitted. In addition, filesystems that always require out-of-place writes should not support this flag, since they still need to allocate new blocks during subsequent overwrites. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/20250619111806.3546162-7-yi.zhang@huaweicloud.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
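A minimal userspace sketch of the new mode, assuming FALLOC_FL_WRITE_ZEROES ends up exposed via <linux/falloc.h> as the description suggests:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <linux/falloc.h>
  #include <stdio.h>

  /*
   * Preallocate a 1 GiB file as written, zeroed extents. Per the text
   * above, the flag must not be combined with FALLOC_FL_KEEP_SIZE.
   */
  int main(void)
  {
          int fd = open("zeroed.img", O_RDWR | O_CREAT, 0644);

          if (fd < 0 || fallocate(fd, FALLOC_FL_WRITE_ZEROES, 0, 1ULL << 30)) {
                  perror("fallocate(FALLOC_FL_WRITE_ZEROES)");
                  return 1;
          }
          return 0;
  }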
2025-06-23  block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits  [Zhang Yi]
Currently, disks primarily implement the write zeroes command (aka REQ_OP_WRITE_ZEROES) through two mechanisms: the first involves physically writing zeros to the disk media (e.g., HDDs), while the second performs an unmap operation on the logical blocks, effectively putting them into a deallocated state (e.g., SSDs). The first method is generally slow, while the second method is typically very fast. For example, on certain NVMe SSDs that support NVME_NS_DEAC, submitting REQ_OP_WRITE_ZEROES requests with the NVME_WZ_DEAC bit can accelerate the write zeros operation by placing disk blocks into a deallocated state, which opportunistically avoids writing zeroes to media while still guaranteeing that subsequent reads from the specified block range will return zeroed data. This is a best-effort optimization, not a mandatory requirement; some devices may partially fall back to writing physical zeroes due to factors such as misalignment or being asked to clear a block range smaller than the device's internal allocation unit. Therefore, the speed of this operation is not guaranteed. It is difficult to determine whether the storage device supports the unmap write zeroes operation. We cannot determine this by only querying bdev_limits(bdev)->max_write_zeroes_sectors. Therefore, first, add a new hardware queue limit parameter, max_hw_wzeroes_unmap_sectors, to indicate whether a device supports this unmap write zeroes operation. Then, add two new counterpart software queue limits, max_wzeroes_unmap_sectors and max_user_wzeroes_unmap_sectors, which allow users to disable this operation if the speed is very slow on some special devices. Finally, for the stacked device case, initialize these two parameters to UINT_MAX. This operation should be enabled by both the stacking driver and all underlying devices. Thanks to Martin K. Petersen for optimizing the documentation of the write_zeroes_unmap sysfs interface. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/20250619111806.3546162-2-yi.zhang@huaweicloud.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
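A hedged driver-side sketch: only the max_hw_wzeroes_unmap_sectors name comes from the description above, while the surrounding queue_limits plumbing is assumed boilerplate rather than part of this patch:

  /*
   * A device that can satisfy REQ_OP_WRITE_ZEROES by deallocating blocks
   * advertises the new hardware limit alongside the existing one.
   */
  struct queue_limits lim = {
          .max_write_zeroes_sectors     = UINT_MAX,
          .max_hw_wzeroes_unmap_sectors = UINT_MAX,
  };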
2025-06-23  fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass  [Shivank Garg]
Export anon_inode_make_secure_inode() to allow KVM guest_memfd to create anonymous inodes with the proper security context. This replaces the current pattern of calling alloc_anon_inode() followed by inode_init_security_anon() for creating the security context manually. This change also fixes a security regression in secretmem where the S_PRIVATE flag was not cleared after alloc_anon_inode(), causing LSM/SELinux checks to be bypassed for secretmem file descriptors. As guest_memfd currently resides in the KVM module, we need to export this symbol for use outside the core kernel. In the future, guest_memfd might be moved to core-mm, at which point the symbol would no longer have to be exported. When/if that happens is still unclear. Fixes: 2bfe15c52612 ("mm: create security context for memfd_secret inodes") Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Shivank Garg <shivankg@amd.com> Link: https://lore.kernel.org/20250620070328.803704-3-shivankg@amd.com Acked-by: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
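A hedged sketch of the replacement pattern; whether the superblock is passed explicitly in the exported prototype is an assumption based on the description, not taken from the patch:

  /* One call creates the anon inode and initializes its LSM context. */
  inode = anon_inode_make_secure_inode(sb, "secretmem", NULL);
  if (IS_ERR(inode))
          return PTR_ERR(inode);
  /* ...instead of alloc_anon_inode(sb) + inode_init_security_anon(). */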
2025-06-23  docs/vfs: update references to i_mutex to i_rwsem  [Junxuan Liao]
The VFS switched to i_rwsem ten years ago (9902af79c01a: parallel lookups actual switch to rwsem), but the VFS documentation and comments still have references to i_mutex. Signed-off-by: Junxuan Liao <ljx@cs.wisc.edu> Link: https://lore.kernel.org/72223729-5471-474a-af3c-f366691fba82@cs.wisc.edu Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-23  crypto: hisilicon - Use fine grained DMA mapping direction  [Zenghui Yu]
The following splat was triggered when booting a kernel built with arm64's defconfig + CRYPTO_SELFTESTS + DMA_API_DEBUG.

  ------------[ cut here ]------------
  DMA-API: hisi_sec2 0000:75:00.0: cacheline tracking EEXIST, overlapping mappings aren't supported
  WARNING: CPU: 24 PID: 1273 at kernel/dma/debug.c:596 add_dma_entry+0x248/0x308
  Call trace:
   add_dma_entry+0x248/0x308 (P)
   debug_dma_map_sg+0x208/0x3e4
   __dma_map_sg_attrs+0xbc/0x118
   dma_map_sg_attrs+0x10/0x24
   hisi_acc_sg_buf_map_to_hw_sgl+0x80/0x218 [hisi_qm]
   sec_cipher_map+0xc4/0x338 [hisi_sec2]
   sec_aead_sgl_map+0x18/0x24 [hisi_sec2]
   sec_process+0xb8/0x36c [hisi_sec2]
   sec_aead_crypto+0xe4/0x264 [hisi_sec2]
   sec_aead_encrypt+0x14/0x20 [hisi_sec2]
   crypto_aead_encrypt+0x24/0x38
   test_aead_vec_cfg+0x480/0x7e4
   test_aead_vec+0x84/0x1b8
   alg_test_aead+0xc0/0x498
   alg_test.part.0+0x518/0x524
   alg_test+0x20/0x64
   cryptomgr_test+0x24/0x44
   kthread+0x130/0x1fc
   ret_from_fork+0x10/0x20
  ---[ end trace 0000000000000000 ]---
  DMA-API: Mapped at:
   debug_dma_map_sg+0x234/0x3e4
   __dma_map_sg_attrs+0xbc/0x118
   dma_map_sg_attrs+0x10/0x24
   hisi_acc_sg_buf_map_to_hw_sgl+0x80/0x218 [hisi_qm]
   sec_cipher_map+0xc4/0x338 [hisi_sec2]

This occurs in selftests where the input and the output scatterlist point to the same underlying memory (e.g., when tested with the INPLACE_TWO_SGLISTS mode). The problem is that the hisi_sec2 driver maps these two different scatterlists using the DMA_BIDIRECTIONAL flag, which leads to overlapping write mappings that are not supported by the DMA layer. Fix it by using fine grained and correct DMA mapping directions. While at it, switch the DMA directions used by the hisi_zip driver too. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Longfang Liu <liulongfang@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
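Reduced to its essence, the fix pattern looks roughly like this (illustrative, not the hisi_sec2 code verbatim):

  /*
   * Map input and output scatterlists with accurate directions instead of
   * mapping both with DMA_BIDIRECTIONAL.
   */
  src_nents = dma_map_sg(dev, src_sgl, src_cnt, DMA_TO_DEVICE);
  dst_nents = dma_map_sg(dev, dst_sgl, dst_cnt, DMA_FROM_DEVICE);

  /* ... submit the request and wait for completion ... */

  dma_unmap_sg(dev, dst_sgl, dst_cnt, DMA_FROM_DEVICE);
  dma_unmap_sg(dev, src_sgl, src_cnt, DMA_TO_DEVICE);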
2025-06-23  Merge 6.16-rc3 into driver-core-next  [Greg Kroah-Hartman]
We need the driver-core fixes that are in 6.16-rc3 into here as well to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-22  Merge tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  [Linus Torvalds]
Pull x86 fixes from Borislav Petkov:

 - Make sure the array tracking which kernel text positions need to be alternatives-patched doesn't get mishandled by out-of-order modifications, leading to it overflowing and causing page faults when patching

 - Avoid an infinite loop when early code does a ranged TLB invalidation before the broadcast TLB invalidation count of how many pages it can flush has been read from CPUID

 - Fix a CONFIG_MODULES typo

 - Disable broadcast TLB invalidation when PTI is enabled to avoid an overflow of the bitmap tracking dynamic ASIDs which need to be flushed when the kernel switches between the user and kernel address space

 - Handle the case of a CPU going offline and thus reporting zeroes when reading top-level events in the resctrl code

* tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Fix int3 handling failure from broken text_poke array
  x86/mm: Fix early boot use of INVLPGB
  x86/its: Fix an ifdef typo in its_alloc()
  x86/mm: Disable INVLPGB when PTI is enabled
  x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem
2025-06-22  Merge tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  [Linus Torvalds]
Pull perf fixes from Borislav Petkov:

 - Avoid a crash on a heterogeneous machine where not all cores support the same hw event features

 - Avoid a deadlock when throttling events

 - Document the perf event states more

 - Make sure a number of perf paths switching off or rescheduling events call perf_cgroup_event_disable()

 - Make sure perf does task sampling before its userspace mapping is torn down, and not after

* tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix crash in icl_update_topdown_event()
  perf: Fix the throttle error of some clock events
  perf: Add comment to enum perf_event_state
  perf/core: Fix WARN in perf_cgroup_switch()
  perf: Fix dangling cgroup pointer in cpuctx
  perf: Fix cgroup state vs ERROR
  perf: Fix sample vs do_exit()
2025-06-22  power: supply: core: rename power_supply_get_by_phandle to power_supply_get_by_reference  [Sebastian Reichel]
(devm_)power_supply_get_by_phandle now internally use fwnode and are no longer DT specific. Thus drop the ifdef check for CONFIG_OF and rename to (devm_)power_supply_get_by_reference to avoid the DT terminology. Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/20250430-psy-core-convert-to-fwnode-v2-5-f9643b958677@collabora.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-06-22  power: supply: core: convert to fwnode  [Sebastian Reichel]
Replace any DT specific code with fwnode in the power-supply core. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/20250430-psy-core-convert-to-fwnode-v2-4-f9643b958677@collabora.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-06-22  power: supply: core: remove of_node from power_supply_config  [Sebastian Reichel]
All drivers have been migrated from .of_node to .fwnode, so let's kill the former. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Link: https://lore.kernel.org/r/20250430-psy-core-convert-to-fwnode-v2-2-f9643b958677@collabora.com Reviewed-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-06-21  net: pse-pd: Fix ethnl_pse_send_ntf() stub parameter type  [Kory Maincent]
The ethnl_pse_send_ntf() stub function has an incorrect parameter type when CONFIG_ETHTOOL_NETLINK is disabled. The function should take a net_device pointer instead of a phy_device pointer to match the actual implementation. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506200355.TqFiYUbN-lkp@intel.com/ Fixes: fc0e6db30941 ("net: pse-pd: Add support for reporting events") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250620091641.2098028-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-21  netmem: fix skb_frag_address_safe with unreadable skbs  [Mina Almasry]
skb_frag_address_safe() needs a check that the skb_frag_page exists, similar to the one in skb_frag_address(). Cc: ap420073@gmail.com Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250619175239.3039329-1-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
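A sketch of the guarded helper per the description, mirroring the page-existence check in skb_frag_address(); not necessarily the exact merged diff:

  /* Return the frag's address, or NULL if it has no page (e.g. unreadable
   * netmem frags) or no kernel mapping. */
  static inline void *skb_frag_address_safe(const skb_frag_t *frag)
  {
          void *ptr;

          if (!skb_frag_page(frag))
                  return NULL;

          ptr = page_address(skb_frag_page(frag));
          if (unlikely(!ptr))
                  return NULL;

          return ptr + skb_frag_off(frag);
  }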
2025-06-20  Merge tag 'mtd/fixes-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux  [Linus Torvalds]
Pull mtd fixes from Miquel Raynal: "The main fix that really needs to get in is the revert of the patch adding the new mtd_master class, because it entirely fails the partitioning if a specific Kconfig option is set. We need to think how to handle that differently, so let's revert it as we need to get back to the pen and paper situation again. Otherwise the definitions of some Winbond SPI NAND chips are receiving some fixes (geometry and maximum frequency, mostly). And finally a small memory leak also gets fixed"

* tag 'mtd/fixes-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: spinand: fix memory leak of ECC engine conf
  mtd: spinand: winbond: Prevent unsupported frequencies on dual/quad I/O variants
  mtd: spinand: winbond: Increase maximum frequency on an octal operation
  mtd: spinand: winbond: Fix W35N number of planes/LUN
  Revert "mtd: core: always create master device"
2025-06-20  sched_ext: Add support for cgroup bandwidth control interface  [Tejun Heo]
- Add CONFIG_GROUP_SCHED_BANDWIDTH which is selected by both CONFIG_CFS_BANDWIDTH and EXT_GROUP_SCHED.
- Put bandwidth control interface files for both cgroup v1 and v2 under CONFIG_GROUP_SCHED_BANDWIDTH.
- Update tg_bandwidth() to fetch configuration parameters from fair if CONFIG_CFS_BANDWIDTH, SCX otherwise.
- Update tg_set_bandwidth() to update the parameters for both fair and SCX.
- Add bandwidth control parameters to struct scx_cgroup_init_args.
- Add sched_ext_ops.cgroup_set_bandwidth() which is invoked on bandwidth control parameter updates.
- Update scx_qmap and the maximal selftest to test the new feature.
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-20  sched_ext, sched/core: Factor out struct scx_task_group  [Tejun Heo]
More sched_ext fields will be added to struct task_group. In preparation, factor out sched_ext fields into struct scx_task_group to reduce clutter in the common header. No functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-20  sched_ext: Merge branch 'for-6.16-fixes' into for-6.17  [Tejun Heo]
Pull sched_ext/for-6.16-fixes to receive: c50784e99f0e ("sched_ext: Make scx_group_set_weight() always update tg->scx.weight") 33796b91871a ("sched_ext, sched/core: Don't call scx_group_set_weight() prematurely from sched_create_group()") which are needed to implement CPU bandwidth control interface support. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-20  iommu/amd: KVM: SVM: Delete now-unused cached/previous GA tag fields  [Sean Christopherson]
Delete the amd_ir_data.prev_ga_tag field now that all usage is superfluous. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: SVM: Delete IRTE link from previous vCPU before setting new IRTE  [Sean Christopherson]
Delete the previous per-vCPU IRTE link prior to modifying the IRTE. If forcing the IRTE back to remapped mode fails, the IRQ is already broken; keeping stale metadata won't change that, and the IOMMU should be sufficiently paranoid to sanitize the IRTE when the IRQ is freed and reallocated. This will allow hoisting the vCPU tracking to common x86, which in turn will allow most of the IRTE update code to be deduplicated. Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: SVM: Track per-vCPU IRTEs using kvm_kernel_irqfd structure  [Sean Christopherson]
Track the IRTEs that are posting to an SVM vCPU via the associated irqfd structure and GSI routing instead of dynamically allocating a separate data structure. In addition to eliminating an atomic allocation, this will allow hoisting much of the IRTE update logic to common x86. Cc: Sairaj Kodilkar <sarunkod@amd.com> Link: https://lore.kernel.org/r/20250611224604.313496-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: Pass new routing entries and irqfd when updating IRTEs  [Sean Christopherson]
When updating IRTEs in response to a GSI routing or IRQ bypass change, pass the new/current routing information along with the associated irqfd. This will allow KVM x86 to harden, simplify, and deduplicate its code. Since adding/removing a bypass producer is now conveniently protected with irqfds.lock, i.e. can't run concurrently with kvm_irq_routing_update(), use the routing information cached in the irqfd instead of looking up the information in the current GSI routing tables. Opportunistically convert an existing printk() to pr_info() and put its string onto a single line (old code that strictly adhered to 80 chars). Link: https://lore.kernel.org/r/20250611224604.313496-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: x86: Add CONFIG_KVM_IOAPIC to allow disabling in-kernel I/O APIC  [Sean Christopherson]
Add a Kconfig to allow building KVM without support for emulating an I/O APIC, PIC, and PIT, which is desirable for deployments that effectively don't support a fully in-kernel IRQ chip, i.e. never expect any VMM to create an in-kernel I/O APIC. E.g. compiling out support eliminates a few thousand lines of guest-facing code and gives security folks warm fuzzies. As a bonus, wrapping relevant paths with CONFIG_KVM_IOAPIC #ifdefs makes it much easier for readers to understand which bits and pieces exist specifically for fully in-kernel IRQ chips. Opportunistically convert all two in-kernel uses of __KVM_HAVE_IOAPIC to CONFIG_KVM_IOAPIC, e.g. rather than add a second #ifdef to generate a stub for kvm_arch_post_irq_routing_update(). Acked-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20250611213557.294358-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: x86: Hardcode the PIT IRQ source ID to '2'  [Sean Christopherson]
Hardcode the PIT's source IRQ ID to '2' instead of "finding" that bit 2 is always the first available bit in irq_sources_bitmap. Bits 0 and 1 are set/reserved by kvm_arch_init_vm(), i.e. long before kvm_create_pit() can be invoked, and KVM allows at most one in-kernel PIT instance, i.e. it's impossible for the PIT to find a different free bit (there are no other users of kvm_request_irq_source_id()). Delete the now-defunct irq_sources_bitmap and all its associated code. Acked-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20250611213557.294358-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: x86: Move kvm_{request,free}_irq_source_id() to i8254.c (PIT)  [Sean Christopherson]
Move kvm_{request,free}_irq_source_id() to i8254.c, i.e. the dedicated PIT emulation file, in anticipation of removing them entirely in favor of hardcoding the PIT's "requested" source ID (the source ID can only ever be '2', and the request can never fail). No functional change intended. Acked-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20250611213557.294358-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  KVM: x86: Trigger I/O APIC route rescan in kvm_arch_irq_routing_update()  [Sean Christopherson]
Trigger the I/O APIC route rescan that's performed for a split IRQ chip after userspace updates IRQ routes in kvm_arch_irq_routing_update(), i.e. before dropping kvm->irq_lock. Calling kvm_make_all_cpus_request() under a mutex is perfectly safe, and the smp_wmb()+smp_mb__after_atomic() pair in __kvm_make_request()+kvm_check_request() ensures the new routing is visible to vCPUs prior to the request being visible to vCPUs. In all likelihood, commit b053b2aef25d ("KVM: x86: Add EOI exit bitmap inference") somewhat arbitrarily made the request outside of irq_lock to avoid holding irq_lock any longer than is strictly necessary. And then commit abdb080f7ac8 ("kvm/irqchip: kvm_arch_irq_routing_update renaming split") took the easy route of adding another arch hook instead of risking a functional change. Note, the call to synchronize_srcu_expedited() does NOT provide ordering guarantees with respect to vCPUs scanning the new routing; as above, the request infrastructure provides the necessary ordering. I.e. there's no need to wait for kvm_scan_ioapic_routes() to complete if it's actively running, because regardless of whether it grabs the old or new table, the vCPU will have another KVM_REQ_SCAN_IOAPIC pending, i.e. will rescan again and see the new mappings. Acked-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20250611213557.294358-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  irqbypass: Require producers to pass in Linux IRQ number during registration  [Sean Christopherson]
Pass in the Linux IRQ associated with an IRQ bypass producer instead of relying on the caller to set the field prior to registration, as there's no benefit to relying on callers to do the right thing. Take care to set producer->irq before __connect(), as KVM expects the IRQ to be valid as soon as a connection is possible. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20250516230734.2564775-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20  irqbypass: Use xarray to track producers and consumers  [Sean Christopherson]
Track IRQ bypass producers and consumers using an xarray to avoid the O(2n) insertion time associated with walking a list to check for duplicate entries, and to search for a partner. At low (tens or few hundreds) total producer/consumer counts, using a list is faster due to the need to allocate backing storage for the xarray. But as the count creeps into the thousands, the xarray wins easily, and can provide several orders of magnitude better latency at high counts. E.g. hundreds of nanoseconds vs. hundreds of milliseconds. Cc: Oliver Upton <oliver.upton@linux.dev> Cc: David Matlack <dmatlack@google.com> Cc: Like Xu <like.xu.linux@gmail.com> Cc: Binbin Wu <binbin.wu@linux.intel.com> Reported-by: Yong He <alexyonghe@tencent.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217379 Link: https://lore.kernel.org/all/20230801115646.33990-1-likexu@tencent.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250516230734.2564775-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
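A generic illustration of the data-structure swap described here (keyed lookups instead of list walks); the token-as-index scheme and connect() pairing step are assumptions, not the irqbypass code:

  static DEFINE_XARRAY(producers);
  static DEFINE_XARRAY(consumers);

  /* Register a producer and look for an already-registered partner. */
  ret = xa_insert(&producers, (unsigned long)token, prod, GFP_KERNEL);
  if (ret)
          return ret;     /* -EBUSY if this token is already registered */

  consumer = xa_load(&consumers, (unsigned long)token);
  if (consumer)
          connect(prod, consumer);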
2025-06-20  irqbypass: Explicitly track producer and consumer bindings  [Sean Christopherson]
Explicitly track IRQ bypass producer:consumer bindings. This will allow making removal an O(1) operation; searching through the list to find information that is trivially tracked (and useful for debug) is wasteful. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20250516230734.2564775-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>