summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2025-02-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Large set of fixes for vector handling, especially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours - Fix use of kernel VAs at EL2 when emulating timers with nVHE - Small set of pKVM improvements and cleanups x86: - Fix broken SNP support with KVM module built-in, ensuring the PSP module is initialized before KVM even when the module infrastructure cannot be used to order initcalls - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1 - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) x86/sev: Fix broken SNP support with KVM module built-in KVM: SVM: Ensure PSP module is initialized if KVM module is built-in crypto: ccp: Add external API interface for PSP module initialization KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic() KVM: arm64: timer: Drop warning on failed interrupt signalling KVM: arm64: Fix alignment of kvm_hyp_memcache allocations KVM: arm64: Convert timer offset VA when accessed in HYP code KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp() KVM: arm64: Eagerly switch ZCR_EL{1,2} KVM: arm64: Mark some header functions as inline KVM: arm64: Refactor exit handlers KVM: arm64: Refactor CPTR trap deactivation KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN KVM: arm64: Remove host FPSIMD saving for non-protected KVM KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop KVM: nSVM: Enter guest mode before initializing nested NPT MMU KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper ...
2025-02-16sched/topology: Introduce for_each_node_numadist() iteratorAndrea Righi
Introduce the new helper for_each_node_numadist() to iterate over node IDs in order of increasing NUMA distance from a given starting node. This iterator is somehow similar to for_each_numa_hop_mask(), but instead of providing a cpumask at each iteration, it provides a node ID. Example usage: nodemask_t unvisited = NODE_MASK_ALL; int node, start = cpu_to_node(smp_processor_id()); node = start; for_each_node_numadist(node, unvisited) pr_info("node (%d, %d) -> %d\n", start, node, node_distance(start, node)); On a system with equidistant nodes: $ numactl -H ... node distances: node 0 1 2 3 0: 10 20 20 20 1: 20 10 20 20 2: 20 20 10 20 3: 20 20 20 10 Output of the example above (on node 0): [ 7.367022] node (0, 0) -> 10 [ 7.367151] node (0, 1) -> 20 [ 7.367186] node (0, 2) -> 20 [ 7.367247] node (0, 3) -> 20 On a system with non-equidistant nodes (simulated using virtme-ng): $ numactl -H ... node distances: node 0 1 2 3 0: 10 51 31 41 1: 51 10 21 61 2: 31 21 10 11 3: 41 61 11 10 Output of the example above (on node 0): [ 8.953644] node (0, 0) -> 10 [ 8.953712] node (0, 2) -> 31 [ 8.953764] node (0, 3) -> 41 [ 8.953817] node (0, 1) -> 51 Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Acked-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16mm/numa: Introduce nearest_node_nodemask()Andrea Righi
Introduce the new helper nearest_node_nodemask() to find the closest node in a specified nodemask from a given starting node. Returns MAX_NUMNODES if no node is found. Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Acked-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16nodemask: numa: reorganize inclusion pathYury Norov
Nodemasks now pull linux/numa.h for MAX_NUMNODES and NUMA_NO_NODE macros. This series makes numa.h depending on nodemasks, so we hit a circular dependency. Nodemasks library is highly employed by NUMA code, and it would be logical to resolve the circular dependency by making NUMA headers dependent nodemask.h. Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16nodemask: add nodes_copy()Yury Norov
Nodemasks API misses the plain nodes_copy() which is required in this series. Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-16iio: backend: add API for oversamplingAntoniu Miclaus
Add backend support for setting oversampling ratio. Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com> Link: https://patch.msgid.link/20250214131955.31973-4-antoniu.miclaus@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-16iio: backend: add support for data size setAntoniu Miclaus
Add backend support for setting the data size used. This setting can be adjusted within the IP cores interfacing devices. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com> Link: https://patch.msgid.link/20250214131955.31973-3-antoniu.miclaus@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-16iio: backend: add API for interface getAntoniu Miclaus
Add backend support for obtaining the interface type used. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Antoniu Miclaus <antoniu.miclaus@analog.com> Link: https://patch.msgid.link/20250214131955.31973-2-antoniu.miclaus@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-16firmware: add Exynos ACPM protocol driverTudor Ambarus
Alive Clock and Power Manager (ACPM) Message Protocol is defined for the purpose of communication between the ACPM firmware and masters (AP, AOC, ...). ACPM firmware operates on the Active Power Management (APM) module that handles overall power activities. ACPM and masters regard each other as independent hardware component and communicate with each other using mailbox messages and shared memory. This protocol driver provides the interface for all the client drivers making use of the features offered by the APM. Add ACPM protocol support. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://lore.kernel.org/r/20250213-gs101-acpm-v9-2-8b0281b93c8b@linaro.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-02-15kernfs: Use RCU to access kernfs_node::name.Sebastian Andrzej Siewior
Using RCU lifetime rules to access kernfs_node::name can avoid the trouble with kernfs_rename_lock in kernfs_name() and kernfs_path_from_node() if the fs was created with KERNFS_ROOT_INVARIANT_PARENT. This is usefull as it allows to implement kernfs_path_from_node() only with RCU protection and avoiding kernfs_rename_lock. The lock is only required if the __parent node can be changed and the function requires an unchanged hierarchy while it iterates from the node to its parent. The change is needed to allow the lookup of the node's path (kernfs_path_from_node()) from context which runs always with disabled preemption and or interrutps even on PREEMPT_RT. The problem is that kernfs_rename_lock becomes a sleeping lock on PREEMPT_RT. I went through all ::name users and added the required access for the lookup with a few extensions: - rdtgroup_pseudo_lock_create() drops all locks and then uses the name later on. resctrl supports rename with different parents. Here I made a temporal copy of the name while it is used outside of the lock. - kernfs_rename_ns() accepts NULL as new_parent. This simplifies sysfs_move_dir_ns() where it can set NULL in order to reuse the current name. - kernfs_rename_ns() is only using kernfs_rename_lock if the parents are different. All users use either kernfs_rwsem (for stable path view) or just RCU for the lookup. The ::name uses always RCU free. Use RCU lifetime guarantees to access kernfs_node::name. Suggested-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/67251dc6.050a0220.529b6.015e.GAE@google.com/ Reported-by: Hillf Danton <hdanton@sina.com> Closes: https://lore.kernel.org/20241102001224.2789-1-hdanton@sina.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-7-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15kernfs: Use RCU to access kernfs_node::parent.Sebastian Andrzej Siewior
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent} pointer. This is a preparation to access kernfs_node::parent under RCU and ensure that the pointer remains stable under the RCU lifetime guarantees. For a complete path, as it is done in kernfs_path_from_node(), the kernfs_rename_lock is still required in order to obtain a stable parent relationship while computing the relevant node depth. This must not change while the nodes are inspected in order to build the path. If the kernfs user never moves the nodes (changes the parent) then the kernfs_rename_lock is not required and the RCU guarantees are sufficient. This "restriction" can be set with KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required. Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU access and use RCU accessor while accessing the node. Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can not change. Acked-by: Tejun Heo <tj@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-14net/mlx4_core: Avoid impossible mlx4_db_alloc() order valueKees Cook
GCC can see that the value range for "order" is capped, but this leads it to consider that it might be negative, leading to a false positive warning (with GCC 15 with -Warray-bounds -fdiagnostics-details): ../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=] 691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ~~~~~~~~~~~^~~ 'mlx4_alloc_db_from_pgdir': events 1-2 691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (2) out of array bounds here | (1) when the condition is evaluated to true In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53, from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42: ../include/linux/mlx4/device.h:664:33: note: while referencing 'bits' 664 | unsigned long *bits[2]; | ^~~~ Switch the argument to unsigned int, which removes the compiler needing to consider negative values. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: remove phylink_pcs .neg_mode booleanRussell King (Oracle)
As all PCS are using the neg_mode parameter rather than the legacy an_mode, remove the ability to use the legacy an_mode. We remove the tests in the phylink code, unconditionally passing the PCS neg_mode parameter to PCS methods, and remove setting the flag from drivers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tidPn-0040hd-2R@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: phy: remove helper phy_is_internalHeiner Kallweit
Helper phy_is_internal() is just used in two places phylib-internally. So let's remove it from the API. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/f3f35265-80a9-4ed7-ad78-ae22c21e288b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: phy: stop exporting phy_queue_state_machineHeiner Kallweit
phy_queue_state_machine() isn't used outside phy.c, so stop exporting it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/16986d3d-7baf-4b02-a641-e2916d491264@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: phy: stop exporting feature arrays which aren't used outside phylibHeiner Kallweit
Stop exporting feature arrays which aren't used outside phylib. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/01886672-4880-4ca8-b7b0-94d40f6e0ec5@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: phy: remove fixup-related definitions from phy.h which are not used ↵Heiner Kallweit
outside phylib Certain fixup-related definitions aren't used outside phy_device.c. So make them private and remove them from phy.h. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/ea6fde13-9183-4c7c-8434-6c0eb64fc72c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14Merge tag 'kvm-x86-fixes-6.14-rcN' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM fixes for 6.14 part 1 - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated by KVM to fix a NULL pointer dereference. - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's nested NPT MMU so that the MMU is properly tagged for L2, not L1. - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the guest's value may be stale if a VM-Exit is handled in the fastpath.
2025-02-14crypto: ccp: Add external API interface for PSP module initializationSean Christopherson
KVM is dependent on the PSP SEV driver and PSP SEV driver needs to be loaded before KVM module. In case of module loading any dependent modules are automatically loaded but in case of built-in modules there is no inherent mechanism available to specify dependencies between modules and ensure that any dependent modules are loaded implicitly. Add a new external API interface for PSP module initialization which allows PSP SEV driver to be loaded explicitly if KVM is built-in. Signed-off-by: Sean Christopherson <seanjc@google.com> Co-developed-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Message-ID: <15279ca0cad56a07cf12834ec544310f85ff5edc.1739226950.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-14Merge tag 'efi-fixes-for-v6.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "Take the newly introduced EFI_MEMORY_HOT_PLUGGABLE memory attribute into account when placing the kernel image in memory at boot. Otherwise, the presence of the kernel image could prevent such a memory region from being unplugged at runtime if it was 'cold plugged', i.e., already plugged in at boot time (and exposed via the EFI memory map). This should ensure that the new EFI_MEMORY_HOT_PLUGGABLE memory attribute is used consistently by Linux before it ever turns up in production, ensuring that we can make meaningful use of it without running the risk of regressing existing users" * tag 'efi-fixes-for-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Use BIT_ULL() constants for memory attributes efi: Avoid cold plugged memory for placing the kernel
2025-02-14net: xpcs: remove xpcs_config_eee() from global scopeRussell King (Oracle)
Make xpcs_config_eee() private to the XPCS driver, called only from the phylink pcs_disable_eee() and pcs_enable_eee() methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thRQT-003w7O-Ec@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: xpcs: add function to configure EEE clock multiplying factorRussell King (Oracle)
Add a function to separate out the EEE clock multiplying factor. This will be called by the stmmac driver to configure this value. It would have been better had the driver used the CLK API to retrieve this clock, get its rate and calculate the appropriate multiplier, but that door has closed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thRQ8-003w70-VT@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: phylink: add support for notifying PCS about EEERussell King (Oracle)
There are hooks in the stmmac driver into XPCS to control the EEE settings when LPI is configured at the MAC. This bypasses the layering. To allow this to be removed from the stmmac driver, add two new methods for PCS to inform them when the LPI/EEE enablement state changes at the MAC. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thRQ3-003w6u-RH@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14net: stmmac: refactor clock management in EQoS driverSwathi K S
Refactor clock management in EQoS driver for code reuse and to avoid redundancy. This way, only minimal changes are required when a new platform is added. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Swathi K S <swathi.ks@samsung.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250213041559.106111-1-swathi.ks@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14Merge tag 'block-6.14-20250214' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for request rejection for batch addition - Fix a few issues for bogus mac partition tables * tag 'block-6.14-20250214' of git://git.kernel.dk/linux: partitions: mac: fix handling of bogus partition table block: cleanup and fix batch completion adding conditions
2025-02-14Merge tag 'cgroup-for-6.14-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix a race window where a newly forked task could escape cgroup.kill - Remove incorrectly included steal time from cpu.stat::usage_usec - Minor update in selftest * tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Remove steal time from usage_usec selftests/cgroup: use bash in test_cpuset_v1_hp.sh cgroup: fix race between fork and cgroup.kill
2025-02-14iavf: add support for negotiating flexible RXDID formatJacob Keller
Enable support for VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC, to enable the VF driver the ability to determine what Rx descriptor formats are available. This requires sending an additional message during initialization and reset, the VIRTCHNL_OP_GET_SUPPORTED_RXDIDS. This operation requests the supported Rx descriptor IDs available from the PF. This is treated the same way that VLAN V2 capabilities are handled. Add a new set of extended capability flags, used to process send and receipt of the VIRTCHNL_OP_GET_SUPPORTED_RXDIDS message. This ensures we finish negotiating for the supported descriptor formats prior to beginning configuration of receive queues. This change stores the supported format bitmap into the iavf_adapter structure. Additionally, if VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC is enabled by the PF, we need to make sure that the Rx queue configuration specifies the format. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14virtchnl: add enumeration for the rxdid formatJacob Keller
Support for allowing VF to negotiate the descriptor format requires that the VF specify which descriptor format to use when requesting Rx queues. The VF is supposed to request the set of supported formats via the new VIRTCHNL_OP_GET_SUPPORTED_RXDIDS, and then set one of the supported formats in the rxdid field of the virtchnl_rxq_info structure. The virtchnl.h header does not provide an enumeration of the format values. The existing implementations in the PF directly use the values from the DDP package. Make the formats explicit by defining an enumeration of the RXDIDs. Provide an enumeration for the values as well as the bit positions as returned by the supported_rxdids data from the VIRTCHNL_OP_GET_SUPPORTED_RXDIDS. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14ice: support Rx timestamp on flex descriptorSimei Su
To support Rx timestamp offload, VIRTCHNL_OP_1588_PTP_CAPS is sent by the VF to request PTP capability and responded by the PF what capability is enabled for that VF. Hardware captures timestamps which contain only 32 bits of nominal nanoseconds, as opposed to the 64bit timestamps that the stack expects. To convert 32b to 64b, we need a current PHC time. VIRTCHNL_OP_1588_PTP_GET_TIME is sent by the VF and responded by the PF with the current PHC time. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14virtchnl: add support for enabling PTP on iAVFJacob Keller
Add support for allowing a VF to enable PTP feature - Rx timestamps The new capability is gated by VIRTCHNL_VF_CAP_PTP, which must be set by the VF to request access to the new operations. In addition, the VIRTCHNL_OP_1588_PTP_CAPS command is used to determine the specific capabilities available to the VF. This support includes the following additional capabilities: * Rx timestamps enabled in the Rx queues (when using flexible advanced descriptors) * Read access to PHC time over virtchnl using VIRTCHNL_OP_1588_PTP_GET_TIME Extra space is reserved in most structures to allow for future extension (like set clock, Tx timestamps). Additional opcode numbers are reserved and space in the virtchnl_ptp_caps structure is specifically set aside for this. Additionally, each structure has some space reserved for future extensions to allow some flexibility. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-14KVM: Allow lockless walk of SPTEs when handing aging mmu_notifier eventJames Houghton
It is possible to correctly do aging without taking the KVM MMU lock, or while taking it for read; add a Kconfig to let architectures do so. Architectures that select KVM_MMU_LOCKLESS_AGING are responsible for correctness. Suggested-by: Yu Zhao <yuzhao@google.com> Signed-off-by: James Houghton <jthoughton@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20250204004038.1680123-3-jthoughton@google.com [sean: massage shortlog+changelog, fix Kconfig goof and shorten name] Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-14x86/boot: Mark start_secondary() with __noendbrPeter Zijlstra
The handoff between the boot stubs and start_secondary() are before IBT is enabled and is definitely not subject to kCFI. As such, suppress all that for this function. Notably when the ENDBR poison would become fatal (ud1 instead of nop) this will trigger a tripple fault because we haven't set up the IDT to handle #UD yet. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250207122546.509520369@infradead.org
2025-02-14x86/cfi: Clean up linkagePeter Zijlstra
With the introduction of kCFI the addition of ENDBR to SYM_FUNC_START* no longer suffices to make the function indirectly callable. This now requires the use of SYM_TYPED_FUNC_START. As such, remove the implicit ENDBR from SYM_FUNC_START* and add some explicit annotations to fix things up again. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250207122546.409116003@infradead.org
2025-02-14Merge branch 'x86/mm'Peter Zijlstra
Depends on the simplifications from commit 1d7e707af446 ("Revert "x86/module: prepare module loading for ROX allocations of text"") Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2025-02-14Revert "kernel/debug: Mask KGDB NMI upon entry"Douglas Anderson
This reverts commit 5a14fead07bcf4e0acc877a8d9e1d1f40a441153. No architectures ever implemented `enable_nmi` since the later patches in the series adding it never landed. It's been a long time. Drop it. NOTE: this is not a clean revert due to changes in the file in the meantime. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250129082535.3.I2254953cd852f31f354456689d68b2d910de3fbe@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-14Revert "tty/serial: Add kgdb_nmi driver"Douglas Anderson
This reverts commit 0c57dfcc6c1d037243c2f8fbf62eab3633326ec0. The functionality was supoosed to be used by a later patch in the series that never landed [1]. Drop it. NOTE: part of functionality was already reverted by commit 39d0be87438a ("serial: kgdb_nmi: Remove unused knock code"). Also note that this revert is not a clean revert given code changes that have happened in the meantime. It's obvious that nobody is using this code since the two exposed functions (kgdb_register_nmi_console() and kgdb_unregister_nmi_console()) are both no-ops if "arch_kgdb_ops.enable_nmi" is not defined. No architectures define it. [1] https://lore.kernel.org/lkml/1348522080-32629-9-git-send-email-anton.vorontsov@linaro.org/ Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: URL [1] Link: https://lore.kernel.org/r/20250129082535.1.Ia095eac1ae357f87d23e7af2206741f5d40788f1@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-14platform/chrome: add PD_EVENT_INIT bit definitionJameson Thies
Update cros_ec_commands.h to include a definition for PD_EVENT_INIT. On platforms supporting UCSI, this host event type is sent when the PPM initializes. Signed-off-by: Jameson Thies <jthies@google.com> Reviewed-by: Benson Leung <bleung@chromium.org> Acked-by: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: Łukasz Bartosik <ukaszb@chromium.org> Link: https://lore.kernel.org/r/20250204024600.4138776-2-jthies@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-13bpf: fs/xattr: Add BPF kfuncs to set and remove xattrsSong Liu
Add the following kfuncs to set and remove xattrs from BPF programs: bpf_set_dentry_xattr bpf_remove_dentry_xattr bpf_set_dentry_xattr_locked bpf_remove_dentry_xattr_locked The _locked version of these kfuncs are called from hooks where dentry->d_inode is already locked. Instead of requiring the user to know which version of the kfuncs to use, the verifier will pick the proper kfunc based on the calling hook. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20250130213549.3353349-5-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-13drivers/hv: introduce vmbus_channel_set_cpu()Hamza Mahfooz
The core functionality in target_cpu_store() is also needed in a subsequent patch for automatically changing the CPU when taking a CPU offline. As such, factor out the body of target_cpu_store() into new function vmbus_channel_set_cpu() that can also be used elsewhere. No functional change is intended. Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Michael Kelley <mhklinux@outlook.com> Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250117203309.192072-2-hamzamahfooz@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250117203309.192072-2-hamzamahfooz@linux.microsoft.com>
2025-02-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13Merge tag 'net-6.14-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wireless and bluetooth. Kalle Valo steps down after serving as the WiFi driver maintainer for over a decade. Current release - fix to a fix: - vsock: orphan socket after transport release, avoid null-deref - Bluetooth: L2CAP: fix corrupted list in hci_chan_del Current release - regressions: - eth: - stmmac: correct Rx buffer layout when SPH is enabled - iavf: fix a locking bug in an error path - rxrpc: fix alteration of headers whilst zerocopy pending - s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH - Revert "netfilter: flowtable: teardown flow if cached mtu is stale" Current release - new code bugs: - rxrpc: fix ipv6 path MTU discovery, only ipv4 worked - pse-pd: fix deadlock in current limit functions Previous releases - regressions: - rtnetlink: fix netns refleak with rtnl_setlink() - wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364 firmware Previous releases - always broken: - add missing RCU protection of struct net throughout the stack - can: rockchip: bail out if skb cannot be allocated - eth: ti: am65-cpsw: base XDP support fixes Misc: - ethtool: tsconfig: update the format of hwtstamp flags, changes the uAPI but this uAPI was not in any release yet" * tag 'net-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) net: pse-pd: Fix deadlock in current limit functions rxrpc: Fix ipv6 path MTU discovery Reapply "net: skb: introduce and use a single page frag cache" s390/qeth: move netif_napi_add_tx() and napi_enable() from under BH mlxsw: Add return value check for mlxsw_sp_port_get_stats_raw() ipv6: mcast: add RCU protection to mld_newpack() team: better TEAM_OPTION_TYPE_STRING validation Bluetooth: L2CAP: Fix corrupted list in hci_chan_del Bluetooth: btintel_pcie: Fix a potential race condition Bluetooth: L2CAP: Fix slab-use-after-free Read in l2cap_send_cmd net: ethernet: ti: am65_cpsw: fix tx_cleanup for XDP case net: ethernet: ti: am65-cpsw: fix RX & TX statistics for XDP_TX case net: ethernet: ti: am65-cpsw: fix memleak in certain XDP cases vsock/test: Add test for SO_LINGER null ptr deref vsock: Orphan socket after transport release MAINTAINERS: Add sctp headers to the general netdev entry Revert "netfilter: flowtable: teardown flow if cached mtu is stale" iavf: Fix a locking bug in an error path rxrpc: Fix alteration of headers whilst zerocopy pending net: phylink: make configuring clock-stop dependent on MAC support ...
2025-02-13ext4: introduce linear search for dentriesTheodore Ts'o
This patch addresses an issue where some files in case-insensitive directories become inaccessible due to changes in how the kernel function, utf8_casefold(), generates case-folded strings from the commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points"). There are good reasons why this change should be made; it's actually quite stupid that Unicode seems to think that the characters ❤ and ❤️ should be casefolded. Unfortimately because of the backwards compatibility issue, this commit was reverted in 231825b2e1ff. This problem is addressed by instituting a brute-force linear fallback if a lookup fails on case-folded directory, which does result in a performance hit when looking up files affected by the changing how thekernel treats ignorable Uniode characters, or when attempting to look up non-existent file names. So this fallback can be disabled by setting an encoding flag if in the future, the system administrator or the manufacturer of a mobile handset or tablet can be sure that there was no opportunity for a kernel to insert file names with incompatible encodings. Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points") Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
2025-02-13Reapply "net: skb: introduce and use a single page frag cache"Jakub Kicinski
This reverts commit 011b0335903832facca86cd8ed05d7d8d94c9c76. Sabrina reports that the revert may trigger warnings due to intervening changes, especially the ability to rise MAX_SKB_FRAGS. Let's drop it and revisit once that part is also ironed out. Fixes: 011b03359038 ("Revert "net: skb: introduce and use a single page frag cache"") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/6bf54579233038bc0e76056c5ea459872ce362ab.1739375933.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13sctp: Remove commented out codeThorsten Blum
Remove commented out code. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20250211102057.587182-1-thorsten.blum@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13soundwire: amd: add support for ACP7.0 & ACP7.1 platformsVijendar Mukunda
Add SoundWire support for ACP7.0 and ACP7.1 platforms. Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://lore.kernel.org/r/20250207065841.4718-4-Vijendar.Mukunda@amd.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-02-13driver core: add a faux bus for use when a simple device/bus is neededGreg Kroah-Hartman
Many drivers abuse the platform driver/bus system as it provides a simple way to create and bind a device to a driver-specific set of probe/release functions. Instead of doing that, and wasting all of the memory associated with a platform device, here is a "faux" bus that can be used instead. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/2025021026-atlantic-gibberish-3f0c@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-13i2c: Unexport i2c_of_match_device()Andy Shevchenko
i2c_of_match_device() is not used anymore outside of I²C framework, unexport it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-02-13block: cleanup and fix batch completion adding conditionsJens Axboe
The conditions for whether or not a request is allowed adding to a completion batch are a bit hard to read, and they also have a few issues. One is that ioerror may indeed be a random value on passthrough, and it's being checked unconditionally of whether or not the given request is a passthrough request or not. Rewrite the conditions to be separate for easier reading, and only check ioerror for non-passthrough requests. This fixes an issue with bio unmapping on passthrough, where it fails getting added to a batch. This both leads to suboptimal performance, and may trigger a potential schedule-under-atomic condition for polled passthrough IO. Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()") Link: https://lore.kernel.org/r/20575f0a-656e-4bb3-9d82-dec6c7e3a35c@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-13netfs: Fix a number of read-retry hangsDavid Howells
Fix a number of hangs in the netfslib read-retry code, including: (1) netfs_reissue_read() doubles up the getting of references on subrequests, thereby leaking the subrequest and causing inode eviction to wait indefinitely. This can lead to the kernel reporting a hang in the filesystem's evict_inode(). Fix this by removing the get from netfs_reissue_read() and adding one to netfs_retry_read_subrequests() to deal with the one place that didn't double up. (2) The loop in netfs_retry_read_subrequests() that retries a sequence of failed subrequests doesn't record whether or not it retried the one that the "subreq" pointer points to when it leaves the loop. It may not if renegotiation/repreparation of the subrequests means that fewer subrequests are needed to span the cumulative range of the sequence. Because it doesn't record this, the piece of code that discards now-superfluous subrequests doesn't know whether it should discard the one "subreq" points to - and so it doesn't. Fix this by noting whether the last subreq it examines is superfluous and if it is, then getting rid of it and all subsequent subrequests. If that one one wasn't superfluous, then we would have tried to go round the previous loop again and so there can be no further unretried subrequests in the sequence. (3) netfs_retry_read_subrequests() gets yet an extra ref on any additional subrequests it has to get because it ran out of ones it could reuse to to renegotiation/repreparation shrinking the subrequests. Fix this by removing that extra ref. (4) In netfs_retry_reads(), it was using wait_on_bit() to wait for NETFS_SREQ_IN_PROGRESS to be cleared on all subrequests in the sequence - but netfs_read_subreq_terminated() is now using a wait queue on the request instead and so this wait will never finish. Fix this by waiting on the wait queue instead. To make this work, a new flag, NETFS_RREQ_RETRYING, is now set around the wait loop to tell the wake-up code to wake up the wait queue rather than requeuing the request's work item. Note that this flag replaces the NETFS_RREQ_NEED_RETRY flag which is no longer used. (5) Whilst not strictly anything to do with the hang, netfs_retry_read_subrequests() was also doubly incrementing the subreq_counter and re-setting the debug index, leaving a gap in the trace. This is also fixed. One of these hangs was observed with 9p and with cifs. Others were forced by manual code injection into fs/afs/file.c. Firstly, afs_prepare_read() was created to provide an changing pattern of maximum subrequest sizes: static int afs_prepare_read(struct netfs_io_subrequest *subreq) { struct netfs_io_request *rreq = subreq->rreq; if (!S_ISREG(subreq->rreq->inode->i_mode)) return 0; if (subreq->retry_count < 20) rreq->io_streams[0].sreq_max_len = umax(200, 2222 - subreq->retry_count * 40); else rreq->io_streams[0].sreq_max_len = 3333; return 0; } and pointed to by afs_req_ops. Then the following: struct netfs_io_subrequest *subreq = op->fetch.subreq; if (subreq->error == 0 && S_ISREG(subreq->rreq->inode->i_mode) && subreq->retry_count < 20) { subreq->transferred = subreq->already_done; __clear_bit(NETFS_SREQ_HIT_EOF, &subreq->flags); __set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); afs_fetch_data_notify(op); return; } was inserted into afs_fetch_data_success() at the beginning and struct netfs_io_subrequest given an extra field, "already_done" that was set to the value in "subreq->transferred" by netfs_reissue_read(). When reading a 4K file, the subrequests would get gradually smaller, a new subrequest would be allocated around the 3rd retry and then eventually be rendered superfluous when the 20th retry was hit and the limit on the first subrequest was eased. Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20250212222402.3618494-2-dhowells@redhat.com Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Steve French <stfrench@microsoft.com> cc: Ihor Solodrai <ihor.solodrai@pm.me> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13pmdomain: core: Introduce dev_pm_genpd_rpm_always_on()Ulf Hansson
For some usecases a consumer driver requires its device to remain power-on from the PM domain perspective during runtime. Using dev PM qos along with the genpd governors, doesn't work for this case as would potentially prevent the device from being runtime suspended too. To support these usecases, let's introduce dev_pm_genpd_rpm_always_on() to allow consumers drivers to dynamically control the behaviour in genpd for a device that is attached to it. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/1738736156-119203-4-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>