path: root/include
Age | Commit message | Author
2025-03-19  netconsole: allow selection of egress interface via MAC address  (Uday Shankar)
Currently, netconsole has two methods of configuration - module parameter and configfs. The former interface allows for netconsole activation earlier during boot (by specifying the module parameter on the kernel command line), so it is preferred for debugging issues which arise before userspace is up / before the configfs interface can be used. The module parameter syntax requires specifying the egress interface name. This requirement makes it hard to use for a couple of reasons:

 - The egress interface name can be hard or impossible to predict. For example, installing a new network card in a system can change the interface names assigned by the kernel.

 - When constructing the module parameter, one may have trouble determining the original (kernel-assigned) name of the interface (which is the name that should be given to netconsole) if some stable interface naming scheme is in effect. A human can usually look at kernel logs to determine the original name, but this is very painful if automation is constructing the parameter.

For these reasons, allow selection of the egress interface via MAC address when configuring netconsole using the module parameter. Update the netconsole documentation with an example of the new syntax. Selection of the egress interface by MAC address via configfs is far less interesting (since when that interface can be used, one should be able to easily convert between MAC address and interface name), so it is left unimplemented. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Breno Leitao <leitao@debian.org> Tested-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250312-netconsole-v6-2-3437933e79b8@purestorage.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
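For illustration only (the authoritative example lives in the updated Documentation/networking/netconsole.rst), the kernel command line might now accept a MAC address where the egress interface name used to go, along these lines; the exact addresses and ports here are made up:

    netconsole=6665@192.168.0.1/00:11:22:33:44:55,6666@192.168.0.2/AA:BB:CC:DD:EE:FF

The first MAC address (after the source IP) selects the egress device, replacing a name such as "eth0"; the trailing MAC address is the target's, as before.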
2025-03-19  net, treewide: define and use MAC_ADDR_STR_LEN  (Uday Shankar)
There are a few places in the tree which compute the length of the string representation of a MAC address as 3 * ETH_ALEN - 1. Define a constant for this and use it where relevant. No functionality changes are expected. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@verge.net.au> Link: https://patch.msgid.link/20250312-netconsole-v6-1-3437933e79b8@purestorage.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
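As a sketch of the constant and a typical use (the header that ends up hosting the define is an assumption here, not stated above):

    /* "aa:bb:cc:dd:ee:ff": ETH_ALEN two-digit groups plus ETH_ALEN - 1 colons */
    #define MAC_ADDR_STR_LEN	(3 * ETH_ALEN - 1)

    char buf[MAC_ADDR_STR_LEN + 1];		/* +1 for the trailing NUL */

    snprintf(buf, sizeof(buf), "%pM", dev->dev_addr);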
2025-03-19  net: reorder dev_addr_sem lock  (Stanislav Fomichev)
Lockdep complains about circular lock in 1 -> 2 -> 3 (see below). Change the lock ordering to be:

 - rtnl_lock
 - dev_addr_sem
 - netdev_ops (only for lower devices!)
 - team_lock (or other per-upper device lock)

1. rtnl_lock -> netdev_ops -> dev_addr_sem
   rtnl_setlink
     rtnl_lock
     do_setlink IFLA_ADDRESS on lower
       netdev_ops
         dev_addr_sem

2. rtnl_lock -> team_lock -> netdev_ops
   rtnl_newlink
     rtnl_lock
     do_setlink IFLA_MASTER on lower
       do_set_master
         team_add_slave
           team_lock
             team_port_add
               dev_set_mtu
                 netdev_ops

3. rtnl_lock -> dev_addr_sem -> team_lock
   rtnl_newlink
     rtnl_lock
     do_setlink IFLA_ADDRESS on upper
       dev_addr_sem
         netif_set_mac_address
           team_set_mac_address
             team_lock

4. rtnl_lock -> netdev_ops -> dev_addr_sem
   rtnl_lock
     dev_ifsioc
       dev_set_mac_address_user
   __tun_chr_ioctl
     rtnl_lock
     dev_set_mac_address_user
   tap_ioctl
     rtnl_lock
     dev_set_mac_address_user
   dev_set_mac_address_user
     netdev_lock_ops
       netif_set_mac_address_user
         dev_addr_sem

v2:
 - move lock reorder to happen after kmalloc (Kuniyuki)

Cc: Kohei Enju <enjuk@amazon.com> Fixes: df43d8bf1031 ("net: replace dev_addr_sem with netdev instance lock") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250312190513.1252045-3-sdf@fomichev.me Tested-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19  Revert "net: replace dev_addr_sem with netdev instance lock"  (Stanislav Fomichev)
This reverts commit df43d8bf10316a7c3b1e47e3cc0057a54df4a5b8. Cc: Kohei Enju <enjuk@amazon.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Fixes: df43d8bf1031 ("net: replace dev_addr_sem with netdev instance lock") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250312190513.1252045-2-sdf@fomichev.me Tested-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19  net: stmmac: allow platforms to use PHY tx clock stop capability  (Russell King (Oracle))
Allow platform glue to instruct stmmac to make use of the PHY transmit clock stop capability when deciding whether to allow the transmit clock from the DWMAC core to be stopped. Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tsITp-005vG9-Px@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19  io_uring: rename the data cmd cache  (Pavel Begunkov)
Pick a more descriptive name for the cmd async data cache. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20250319061251.21452-2-sidong.yang@furiosa.ai Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-19  bpf: Maintain FIFO property for rqspinlock unlock  (Kumar Kartikeya Dwivedi)
Since out-of-order unlocks are unsupported for rqspinlock, and irqsave variants enforce strict FIFO ordering anyway, make the same change for normal non-irqsave variants, such that FIFO ordering is enforced. Two new verifier state fields (active_lock_id, active_lock_ptr) are used to denote the top of the stack, and prev_id and prev_ptr are ascertained whenever popping the topmost entry through an unlock. Take special care to make these fields part of the state comparison in refsafe. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-25-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19  bpf: Implement verifier support for rqspinlock  (Kumar Kartikeya Dwivedi)
Introduce verifier-side support for rqspinlock kfuncs. The first step is allowing bpf_res_spin_lock type to be defined in map values and allocated objects, so BTF-side is updated with a new BPF_RES_SPIN_LOCK field to recognize and validate. Any object cannot have both bpf_spin_lock and bpf_res_spin_lock, only one of them (and at most one of them per-object, like before) must be present. The bpf_res_spin_lock can also be used to protect objects that require lock protection for their kfuncs, like BPF rbtree and linked list. The verifier plumbing to simulate success and failure cases when calling the kfuncs is done by pushing a new verifier state to the verifier state stack which will verify the failure case upon calling the kfunc. The path where success is indicated creates all lock reference state and IRQ state (if necessary for irqsave variants). In the case of failure, the state clears the registers r0-r5, sets the return value, and skips kfunc processing, proceeding to the next instruction. When marking the return value for success case, the value is marked as 0, and for the failure case as [-MAX_ERRNO, -1]. Then, in the program, whenever user checks the return value as 'if (ret)' or 'if (ret < 0)' the verifier never traverses such branches for success cases, and would be aware that the lock is not held in such cases. We push the kfunc state in check_kfunc_call whenever rqspinlock kfuncs are invoked. We introduce a kfunc_class state to avoid mixing lock irqrestore kfuncs with IRQ state created by bpf_local_irq_save. With all this infrastructure, these kfuncs become usable in programs while satisfying all safety properties required by the kernel. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-24-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
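A minimal BPF-side sketch of the return-value contract described above (the struct/field names are made up for illustration, and the kfunc signatures are assumed to be the int-returning form the verifier models):

    struct node_data {
            struct bpf_res_spin_lock lock;
            __u64 value;
    };

    /* e.g. operating on a map value that embeds the lock */
    static int bump_locked(struct node_data *node)
    {
            int ret;

            ret = bpf_res_spin_lock(&node->lock);
            if (ret)                /* verifier knows the lock is NOT held here */
                    return ret;     /* timeout/deadlock style error in [-MAX_ERRNO, -1] */
            node->value++;          /* success branch: lock reference state is live */
            bpf_res_spin_unlock(&node->lock);
            return 0;
    }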
2025-03-19  bpf: Introduce rqspinlock kfuncs  (Kumar Kartikeya Dwivedi)
Introduce four new kfuncs, bpf_res_spin_lock, and bpf_res_spin_unlock, and their irqsave/irqrestore variants, which wrap the rqspinlock APIs. bpf_res_spin_lock returns a conditional result, depending on whether the lock was acquired (NULL is returned when lock acquisition succeeds, non-NULL upon failure). The memory pointed to by the returned pointer upon failure can be dereferenced after the NULL check to obtain the error code. Instead of using the old bpf_spin_lock type, introduce a new type with the same layout, and the same alignment, but a different name to avoid type confusion. Preemption is disabled upon successful lock acquisition, however IRQs are not. Special kfuncs can be introduced later to allow disabling IRQs when taking a spin lock. Resilient locks are safe against AA deadlocks, hence not disabling IRQs currently does not allow violation of kernel safety. __irq_flag annotation is used to accept IRQ flags for the IRQ-variants, with the same semantics as existing bpf_local_irq_{save, restore}. These kfuncs will require additional verifier-side support in subsequent commits, to allow programs to hold multiple locks at the same time. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-23-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19  rqspinlock: Add entry to Makefile, MAINTAINERS  (Kumar Kartikeya Dwivedi)
Ensure that the rqspinlock code is only built when the BPF subsystem is compiled in. Depending on queued spinlock support, we may or may not end up building the queued spinlock slowpath, and instead fallback to the test-and-set implementation. Also add entries to MAINTAINERS file. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-18-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19  rqspinlock: Add macros for rqspinlock usage  (Kumar Kartikeya Dwivedi)
Introduce helper macros that wrap around the rqspinlock slow path and provide an interface analogous to the raw_spin_lock API. Note that in case of error conditions, preemption and IRQ disabling is automatically unrolled before returning the error back to the caller. Ensure that in absence of CONFIG_QUEUED_SPINLOCKS support, we fallback to the test-and-set implementation. Add some comments describing the subtle memory ordering logic during unlock, and why it's safe. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-17-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
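A usage sketch of the raw_spin_lock-like interface these macros provide (the names follow the series' apparent convention but should be checked against the header; this is not a verbatim excerpt):

    rqspinlock_t lock;      /* initialized like a qspinlock */
    unsigned long flags;
    int ret;

    ret = raw_res_spin_lock_irqsave(&lock, flags);
    if (ret)
            return ret;     /* timeout/deadlock: IRQ and preemption state already unrolled */
    /* ... critical section ... */
    raw_res_spin_unlock_irqrestore(&lock, flags);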
2025-03-19  rqspinlock: Add basic support for CONFIG_PARAVIRT  (Kumar Kartikeya Dwivedi)
We ripped out PV and virtualization related bits from rqspinlock in an earlier commit, however, a fair lock performs poorly within a virtual machine when the lock holder is preempted. As such, retain the virt_spin_lock fallback to test and set lock, but with timeout and deadlock detection. We can do this by simply depending on the resilient_tas_spin_lock implementation from the previous patch. We don't integrate support for CONFIG_PARAVIRT_SPINLOCKS yet, as that requires more involved algorithmic changes and introduces more complexity. It can be done when the need arises in the future. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-15-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19  rqspinlock: Add a test-and-set fallback  (Kumar Kartikeya Dwivedi)
Include a test-and-set fallback when queued spinlock support is not available. Introduce a rqspinlock type to act as a fallback when qspinlock support is absent. Include ifdef guards to ensure the slow path in this file is only compiled when CONFIG_QUEUED_SPINLOCKS=y. Subsequent patches will add further logic to ensure fallback to the test-and-set implementation when queued spinlock support is unavailable on an architecture. Unlike other waiting loops in rqspinlock code, the one for test-and-set has no theoretical upper bound under contention, therefore we need a longer timeout than usual. Bump it up to a second in this case. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-14-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
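Conceptually the fallback is an acquire-cmpxchg loop bounded by the (longer, one second) timeout. A rough sketch, assuming the lock word is an atomic_t as in struct qspinlock and using a placeholder for the real timeout macro:

    static int resilient_tas_sketch(rqspinlock_t *lock)
    {
            int val;

            for (;;) {
                    val = 0;
                    if (atomic_try_cmpxchg_acquire(&lock->val, &val, 1))
                            return 0;                       /* acquired */
                    if (tas_timeout_expired())              /* placeholder for the RES_* timeout check */
                            return -ETIMEDOUT;
                    cpu_relax();
            }
    }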
2025-03-19  rqspinlock: Add deadlock detection and recovery  (Kumar Kartikeya Dwivedi)
While the timeout logic provides guarantees for the waiter's forward progress, the time until a stalling waiter unblocks can still be long. The default timeout of 1/4 sec can be excessively long for some use cases. Additionally, custom timeouts may exacerbate recovery time. Introduce logic to detect common cases of deadlocks and perform quicker recovery. This is done by dividing the time from entry into the locking slow path until the timeout into intervals of 1 ms. Then, after each interval elapses, deadlock detection is performed, while also polling the lock word to ensure we can quickly break out of the detection logic and proceed with lock acquisition. A 'held_locks' table is maintained per-CPU where the entry at the bottom denotes a lock being waited for or already taken. Entries coming before it denote locks that are already held. The current CPU's table can thus be looked at to detect AA deadlocks. The tables from other CPUs can be looked at to discover ABBA situations. Finally, when a matching entry for the lock being taken on the current CPU is found on some other CPU, a deadlock situation is detected. This function can take a long time, therefore the lock word is constantly polled in each loop iteration to ensure we can preempt detection and proceed with lock acquisition, using the is_lock_released check. We set 'spin' member of rqspinlock_timeout struct to 0 to trigger deadlock checks immediately to perform faster recovery. Note: Extending lock word size by 4 bytes to record owner CPU can allow faster detection for ABBA. It is typically the owner which participates in a ABBA situation. However, to keep compatibility with existing lock words in the kernel (struct qspinlock), and given deadlocks are a rare event triggered by bugs, we choose to favor compatibility over faster detection. The release_held_lock_entry function requires an smp_wmb, while the release store on unlock will provide the necessary ordering for us. Add comments to document the subtleties of why this is correct. It is possible for stores to be reordered still, but in the context of the deadlock detection algorithm, a release barrier is sufficient and needn't be stronger for unlock's case. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-13-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
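The per-CPU table can be pictured roughly as below (sizes and field names are assumptions for illustration); AA detection is then a scan of the current CPU's own entries, while ABBA detection scans the other CPUs' tables:

    #define RES_NR_HELD 31                  /* illustrative size */

    struct rqspinlock_held {
            int cnt;                        /* bottom entry = lock being waited for/taken */
            void *locks[RES_NR_HELD];       /* earlier entries = locks already held */
    };

    static DEFINE_PER_CPU(struct rqspinlock_held, held_locks_sketch);

    static bool check_aa_sketch(void *lock)
    {
            struct rqspinlock_held *h = this_cpu_ptr(&held_locks_sketch);
            int i;

            for (i = 0; i < h->cnt - 1; i++)        /* skip the bottom (current) entry */
                    if (h->locks[i] == lock)
                            return true;            /* we already hold it: AA deadlock */
            return false;
    }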
2025-03-19  rqspinlock: Protect pending bit owners from stalls  (Kumar Kartikeya Dwivedi)
The pending bit is used to avoid queueing in case the lock is uncontended, and has demonstrated benefits for the 2 contender scenario, esp. on x86. In case the pending bit is acquired and we wait for the locked bit to disappear, we may get stuck due to the lock owner not making progress. Hence, this waiting loop must be protected with a timeout check. To perform a graceful recovery once we decide to abort our lock acquisition attempt in this case, we must unset the pending bit since we own it. All waiters undoing their changes and exiting gracefully allows the lock word to be restored to the unlocked state once all participants (owner, waiters) have been recovered, and the lock remains usable. Hence, set the pending bit back to zero before returning to the caller. Introduce a lockevent (rqspinlock_lock_timeout) to capture timeout event statistics. Reviewed-by: Barret Rhoden <brho@google.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-10-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19  rqspinlock: Add support for timeouts  (Kumar Kartikeya Dwivedi)
Introduce policy macro RES_CHECK_TIMEOUT which can be used to detect when the timeout has expired for the slow path to return an error. It depends on being passed two variables initialized to 0: ts, ret. The 'ts' parameter is of type rqspinlock_timeout. This macro resolves to the (ret) expression so that it can be used in statements like smp_cond_load_acquire to break the waiting loop condition. The 'spin' member is used to amortize the cost of checking time by dispatching to the implementation every 64k iterations. The 'timeout_end' member is used to keep track of the timestamp that denotes the end of the waiting period. The 'ret' parameter denotes the status of the timeout, and can be checked in the slow path to detect timeouts after waiting loops. The 'duration' member is used to store the timeout duration for each waiting loop. The default timeout value defined in the header (RES_DEF_TIMEOUT) is 0.25 seconds. This macro will be used as a condition for waiting loops in the slow path. Since each waiting loop applies a fresh timeout using the same rqspinlock_timeout, we add a new RES_RESET_TIMEOUT as well to ensure the values can be easily reinitialized to the default state. Reviewed-by: Barret Rhoden <brho@google.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-8-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
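Its rough shape, simplified to show the amortized check and the resolve-to-status behaviour (argument list and the deadline helper below are assumptions, not the header's exact macro):

    #define RES_CHECK_TIMEOUT_SKETCH(ts, ret)                                   \
            ({                                                                  \
                    if (!((ts).spin++ & 0xffff))    /* read the clock every 64k spins */ \
                            (ret) = check_deadline_sketch((ts).timeout_end);    \
                    (ret);                          /* macro resolves to the timeout status */ \
            })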
2025-03-19  rqspinlock: Add rqspinlock.h header  (Kumar Kartikeya Dwivedi)
This header contains the public declarations usable in the rest of the kernel for rqspinlock. Let's also type alias qspinlock to rqspinlock_t to ensure consistent use of the new lock type. We want to remove dependence on the qspinlock type in later patches as we need to provide a test-and-set fallback, hence begin abstracting away from now onwards. Reviewed-by: Barret Rhoden <brho@google.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-6-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
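At this point in the series the heart of the header is essentially a type alias plus the slow-path declaration; a sketch (the exact prototype may carry extra parameters):

    #include <asm/qspinlock.h>

    typedef struct qspinlock rqspinlock_t;

    /* returns 0 on success, a negative error on timeout/deadlock */
    extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);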
2025-03-19  Merge tag 'ata-6.14-final' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux  (Linus Torvalds)
Pull ata fix from Niklas Cassel:

 - Fix a regression on ATI AHCI controllers, where certain Samsung drives fail to be detected on a warm boot when LPM is enabled. LPM on ATI AHCI works fine with other drives. Likewise, the Samsung drives work fine with LPM on other AHCI controllers. Thus, just like the weirdo ATA_QUIRK_NO_NCQ_ON_ATI quirk, add a new ATA_QUIRK_NO_LPM_ON_ATI quirk to disable LPM only on ATI AHCI controllers.

* tag 'ata-6.14-final' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDs
2025-03-19  pidfs: ensure that PIDFD_INFO_EXIT is available  (Christian Brauner)
When we currently create a pidfd we check that the task hasn't been reaped right before we create the pidfd. But it is of course possible that by the time we return the pidfd to userspace the task has already been reaped since we don't check again after having created a dentry for it. This was fine until now because that race was meaningless. But now that we provide PIDFD_INFO_EXIT it is a problem because it is possible that the kernel returns a reaped pidfd and it depends on the race whether PIDFD_INFO_EXIT information is available. This depends on whether the task gets reaped before or after a dentry has been attached to struct pid. Make this consistent and only return pidfds for reaped tasks if PIDFD_INFO_EXIT information is available. This is done by performing another check whether the task has been reaped right after we attached a dentry to struct pid. Since pidfs_exit() is called before struct pid's task linkage is removed the case where the task got reaped but a dentry was already attached to struct pid and exit information was recorded and published can be handled correctly. In that case we do return a pidfd for a reaped task like we would've before. Link: https://lore.kernel.org/r/20250316-kabel-fehden-66bdb6a83436@brauner Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-19  Merge tag 'kvm-x86-mmu-6.15' of https://github.com/kvm-x86/linux into HEAD  (Paolo Bonzini)
KVM x86/mmu changes for 6.15 Add support for "fast" aging of SPTEs in both the TDP MMU and Shadow MMU, where "fast" means "without holding mmu_lock". Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs, e.g. due to holding mmu_lock for an extended duration while a vCPU is faulting in memory. For the TDP MMU, protect aging via RCU; the page tables are RCU-protected and KVM doesn't need to access any metadata to age SPTEs. For the Shadow MMU, use bit 1 of rmap pointers (bit 0 is used to terminate a list of rmaps) to implement a per-rmap single-bit spinlock. When aging a gfn, acquire the rmap's spinlock with read-only permissions, which allows hardening and optimizing the locking and aging, e.g. locking an rmap for write requires mmu_lock to also be held. The lock is NOT a true R/W spinlock, i.e. multiple concurrent readers aren't supported. To avoid forcing all SPTE updates to use atomic operations (clearing the Accessed bit out of mmu_lock makes it inherently volatile), rework and rename spte_has_volatile_bits() to spte_needs_atomic_update() and deliberately exclude the Accessed bit. KVM (and mm/) already tolerates false positives/negatives for Accessed information, and all testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information.
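Conceptually, the per-rmap lock described above is a one-bit spinlock carved out of the rmap pointer word. A hedged sketch of the idea (not KVM's actual helpers or exact bit layout):

    #define RMAP_LOCKED_BIT         BIT(1)  /* bit 0 terminates the rmap list */

    static unsigned long rmap_lock_sketch(unsigned long *rmap_val)
    {
            unsigned long old;

            do {
                    old = READ_ONCE(*rmap_val) & ~RMAP_LOCKED_BIT;  /* expect the unlocked value */
            } while (!try_cmpxchg_acquire(rmap_val, &old, old | RMAP_LOCKED_BIT));

            return old;     /* caller ages SPTEs using this snapshot, then clears the bit */
    }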
2025-03-19  ASoC: ops: Remove snd_soc_put_volsw_range()  (Charles Keepax)
With the addition of the soc_mixer_ctl_to_reg() helper it is now very clear that the only difference between snd_soc_put_volsw() and snd_soc_put_volsw_range() is that the former supports double controls with both values in the same register. As such we can combine both functions. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250318171459.3203730-11-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-19  ASoC: ops: Remove snd_soc_get_volsw_range()  (Charles Keepax)
With the addition of the soc_mixer_reg_to_ctl() helper it is now very clear that the only difference between snd_soc_get_volsw() and snd_soc_get_volsw_range() is that the former supports double controls with both values in the same register. As such we can combine both functions. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250318171459.3203730-10-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-19  ASoC: ops: Remove snd_soc_info_volsw_range()  (Charles Keepax)
The only difference between snd_soc_info_volsw() and snd_soc_info_volsw_range() is that the latter will not force a 2 value control to be of type integer if the name ends in "Volume". The kernel currently contains no users of snd_soc_info_volsw_range() that would return a boolean control with this code, so the risk is quite low and it seems appropriate that it should contain volume control detection. So remove snd_soc_info_volsw_range() and point its users at snd_soc_info_volsw(). Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250318171459.3203730-9-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-19  x86/cpu: Add cpu_type to struct x86_cpu_id  (Pawan Gupta)
In addition to matching vendor/family/model/feature, for hybrid variants it is required to also match cpu-type. For example, some CPU vulnerabilities like RFDS only affect a specific cpu-type. To be able to also match CPUs based on their type, add a new field "type" to struct x86_cpu_id which is used by the CPU-matching tables. Introduce X86_CPU_TYPE_ANY for the cases that don't care about the cpu-type. [ bp: Massage commit message. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-3-e8514dcaaff2@linux.intel.com
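A sketch of how a match table could use the new field; X86_CPU_TYPE_ANY comes from the patch itself, while the model number and the idea of a specific Atom-type constant below are only illustrative stand-ins:

    static const struct x86_cpu_id cpu_type_match_sketch[] = {
            /* match any variant of this model, hybrid or not */
            { .vendor = X86_VENDOR_INTEL, .family = 6, .model = 0xBE,
              .type = X86_CPU_TYPE_ANY },
            /* an RFDS-style table could instead set .type to the constant
             * the series defines for the Atom (E-core) cpu-type */
            {}
    };

    const struct x86_cpu_id *m = x86_match_cpu(cpu_type_match_sketch);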
2025-03-19  Merge tag 'v6.14-rc7' into x86/core, to pick up fixes  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-19  ASoC: tas2781: Support dsp firmware Alpha and Beta series  (Shenghao Ding)
For calibration, the basic firmware version does not contain any calibration addresses; it depends on the calibration tool to convey the addresses to the driver. Since the Alpha and Beta firmware series, all calibration addresses are saved in the firmware itself. Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250313093238.1184-1-shenghao-ding@ti.com
2025-03-19  Merge branch 'for-linus' into for-next  (Takashi Iwai)
Back-merge of the 6.14 devel branch for further developments of TAS codecs. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-18  i2c: Introduce i2c_10bit_addr_*_from_msg() helpers  (Andy Shevchenko)
There are already a lot of drivers that have been using i2c_8bit_addr_from_msg() for 7-bit addresses, so it's time to have similar helpers for 10-bit addresses. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250213141045.2716943-2-andriy.shevchenko@linux.intel.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
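A plausible shape for such helpers, mirroring i2c_8bit_addr_from_msg() and the standard 10-bit address encoding (11110 A9 A8 R/W for the first byte, the low eight address bits for the second); the exact names and masks should be taken from the actual patch:

    static inline u8 i2c_10bit_addr_hi_from_msg(const struct i2c_msg *msg)
    {
            return 0xf0 | ((msg->addr >> 7) & 0x06) | (msg->flags & I2C_M_RD ? 1 : 0);
    }

    static inline u8 i2c_10bit_addr_lo_from_msg(const struct i2c_msg *msg)
    {
            return msg->addr & 0xff;
    }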
2025-03-18  btrfs: defrag: extend ioctl to accept compression levels  (Daniel Vacek)
The zstd and zlib compression types support setting compression levels. Enhance the defrag interface to specify the levels as well. For zstd the negative (realtime) levels are also accepted. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Daniel Vacek <neelx@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-18  locking: Move MCS struct definition to public header  (Kumar Kartikeya Dwivedi)
Move the definition of the struct mcs_spinlock from the private mcs_spinlock.h header in kernel/locking to the mcs_spinlock.h asm-generic header, since we will need to reference it from the qspinlock.h header in subsequent commits. Reviewed-by: Barret Rhoden <brho@google.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
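For reference, the structure being moved is the small MCS queue node as it exists in kernel/locking/mcs_spinlock.h:

    struct mcs_spinlock {
            struct mcs_spinlock *next;
            int locked;     /* 1 if lock acquired */
            int count;      /* nesting count, see qspinlock.c */
    };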
2025-03-18  bpf: Make perf_event_read_output accessible in all program types.  (Emil Tsalapatis)
The perf_event_read_event_output helper is currently only available to tracing programs, but is useful for other BPF programs like sched_ext schedulers. When the helper is available, provide its bpf_func_proto directly from the bpf base_proto. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250318030753.10949-1-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-18  iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU  (Nicolin Chen)
Aside from the IOPF framework, iommufd provides an additional pathway to report hardware events, via the vEVENTQ of vIOMMU infrastructure. Define an iommu_vevent_arm_smmuv3 uAPI structure, and report stage-1 events in the threaded IRQ handler. Also, add another four event record types that can be forwarded to a VM. Link: https://patch.msgid.link/r/5cf6719682fdfdabffdb08374cdf31ad2466d75a.1741719725.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18  iommufd/viommu: Add iommufd_viommu_report_event helper  (Nicolin Chen)
Similar to iommu_report_device_fault, this allows IOMMU drivers to report vIOMMU events from threaded IRQ handlers to user space hypervisors. Link: https://patch.msgid.link/r/44be825042c8255e75d0151b338ffd8ba0e4920b.1741719725.git.nicolinc@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18  iommufd/viommu: Add iommufd_viommu_get_vdev_id helper  (Nicolin Chen)
This is a reverse search v.s. iommufd_viommu_find_dev, as drivers may want to convert a struct device pointer (physical) to its virtual device ID for an event injection to the user space VM. Again, this avoids exposing more core structures to the drivers, than the iommufd_viommu alone. Link: https://patch.msgid.link/r/18b8e8bc1b8104d43b205d21602c036fd0804e56.1741719725.git.nicolinc@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18  iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC  (Nicolin Chen)
Introduce a new IOMMUFD_OBJ_VEVENTQ object for vIOMMU Event Queue that provides user space (VMM) another FD to read the vIOMMU Events. Allow a vIOMMU object to allocate vEVENTQs, with a condition that each vIOMMU can only have one single vEVENTQ per type. Add iommufd_veventq_alloc() with iommufd_veventq_ops for the new ioctl. Link: https://patch.msgid.link/r/21acf0751dd5c93846935ee06f93b9c65eff5e04.1741719725.git.nicolinc@nvidia.com Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-03-18  virtchnl: make proto and filter action count unsigned  (Jan Glaza)
The count field in virtchnl_proto_hdrs and virtchnl_filter_action_set should never be negative while still being valid. Changing it from int to u32 ensures proper handling of values in virtchnl messages in drivers and prevents unintended behavior. In its current signed form, a negative count does not trigger an error in the ice driver but instead results in it being treated as 0. This can lead to unexpected outcomes when processing messages. By using u32, any invalid values will correctly trigger -EINVAL, making error detection more robust. Fixes: 1f7ea1cd6a374 ("ice: Enable FDIR Configure for AVF") Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jan Glaza <jan.glaza@intel.com> Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
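The change amounts to the following kind of field adjustment (surrounding members elided; shown only to illustrate the int -> u32 switch, not the full structure definition):

    struct virtchnl_proto_hdrs {
            u8 tunnel_level;
            /* was: int count; a negative value was silently treated as 0 */
            u32 count;      /* out-of-range values now fail validation with -EINVAL */
            /* ... proto_hdr array ... */
    };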
2025-03-18  mtd: nand: Fix a kdoc comment  (Miquel Raynal)
The max_bad_eraseblocks_per_lun member of nand_device obviously describes the *maximum* number of bad eraseblocks per LUN. Fix this obvious typo in the kdoc comment. Fixes: 377e517b5fa5 ("mtd: nand: Add max_bad_eraseblocks_per_lun info to memorg") Cc: <stable+noautosel@kernel.org> # fix kdoc comment Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2025-03-18  mtd: spinand: Improve spinand_info macros style  (Miquel Raynal)
Let's assume all these macros should not have a trailing comma, this way the caller can use a more formal and usual C writing style, as reflected in the Macronix driver. Acked-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2025-03-18  fs: dedup handling of struct filename init and refcounts bumps  (Mateusz Guzik)
No functional changes. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250313142744.1323281-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-18  tcp: cache RTAX_QUICKACK metric in a hot cache line  (Eric Dumazet)
tcp_in_quickack_mode() is called from the input path for small packets. It calls __sk_dst_get() which reads sk->sk_dst_cache which has been put in the sock_read_tx group (for good reasons). Then dst_metric(dst, RTAX_QUICKACK) also needs extra cache line misses. Cache RTAX_QUICKACK in icsk->icsk_ack.dst_quick_ack to no longer pull these cache lines for the cases where a delayed ACK is scheduled. After this patch the TCP receive path no longer accesses the sock_read_tx group. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250312083907.1931644-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18  spi: Merge up fixes  (Mark Brown)
They are a dependency for applying some changes to the MAINTAINERS file.
2025-03-18  inet: frags: change inet_frag_kill() to defer refcount updates  (Eric Dumazet)
In the following patch, we no longer assume inet_frag_kill() callers own a reference. Consuming two refcounts from inet_frag_kill() would lead to a UAF. Propagate the pointer to the refs that will be consumed later by the final inet_frag_putn() call. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250312082250.1803501-4-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18  inet: frags: add inet_frag_putn() helper  (Eric Dumazet)
inet_frag_putn() can release multiple references in one step. Use it in inet_frags_free_cb(). Replace inet_frag_put(X) with inet_frag_putn(X, 1) Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250312082250.1803501-2-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
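A sketch of what such a helper boils down to, assuming the refcount_t-based fragment refcounting (the real definition lives in include/net/inet_frag.h and may differ in detail):

    static inline void inet_frag_putn(struct inet_frag_queue *fq, int refs)
    {
            /* drop 'refs' references in one atomic step */
            if (refs && refcount_sub_and_test(refs, &fq->refcnt))
                    inet_frag_destroy(fq);
    }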
2025-03-18  net: skbuff: Remove unused skb_add_data()  (Yue Haibing)
Since commit a4ea4c477619 ("rxrpc: Don't use a ring buffer for call Tx queue") this function is not used anymore. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250312063450.183652-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18  Merge tag 'batadv-next-pullrequest-20250313' of git://git.open-mesh.org/linux-merge  (Paolo Abeni)
Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:
 - bump version strings, by Simon Wunderlich
 - drop batadv_priv_debug_log struct, by Sven Eckelmann
 - adopt netdev_hold() / netdev_put(), by Eric Dumazet
 - add support for jumbo frames, by Sven Eckelmann
 - use consistent name for mesh interface, by Sven Eckelmann
 - cleanup B.A.T.M.A.N. IV OGM aggregation handling, by Sven Eckelmann (4 patches)
 - add missing newlines for log macros, by Sven Eckelmann

* tag 'batadv-next-pullrequest-20250313' of git://git.open-mesh.org/linux-merge:
  batman-adv: add missing newlines for log macros
  batman-adv: Limit aggregation size to outgoing MTU
  batman-adv: Use actual packet count for aggregated packets
  batman-adv: Switch to bitmap helper for aggregation handling
  batman-adv: Limit number of aggregated packets directly
  batman-adv: Use consistent name for mesh interface
  batman-adv: Add support for jumbo frames
  batman-adv: adopt netdev_hold() / netdev_put()
  batman-adv: Drop batadv_priv_debug_log struct
  batman-adv: Start new development cycle
====================

Link: https://patch.msgid.link/20250313164519.72808-1-sw@simonwunderlich.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18  udp_tunnel: use static call for GRO hooks when possible  (Paolo Abeni)
It's quite common to have a single UDP tunnel type active in the whole system. In such a case we can replace the indirect call for the UDP tunnel GRO callback with a static call. Add the related accounting in the control path and switch to static call when possible. To keep the code simple use a static array for the registered tunnel types, and size such array based on the kernel config. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/6fd1f9c7651151493ecab174e7b8386a1534170d.1741718157.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18  udp_tunnel: create a fastpath GRO lookup.  (Paolo Abeni)
Most UDP tunnels bind a socket to a local port, with ANY address, no peer and no interface index specified. Additionally it's quite common to have a single tunnel device per namespace. Track in each namespace the UDP tunnel socket respecting the above. When only a single one is present, store a reference in the netns. When such a reference is not NULL, UDP tunnel GRO lookup just needs to match the incoming packet destination port vs the socket local port. The tunnel socket never sets the reuse[port] flag[s]. When bound to no address and interface, no other socket can exist in the same netns matching the specified local port. Matching packets with non-local destination addresses will be aggregated, and eventually segmented as needed - no behavior changes intended. Note that the UDP tunnel socket reference is stored into struct netns_ipv4 for both IPv4 and IPv6 tunnels. That is intentional to keep all the fastpath-related netns fields in the same struct and allow cacheline-based optimization. Currently both the IPv4 and IPv6 socket pointer share the same cacheline as the `udp_table` field. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/4d5c319c4471161829f50cb8436841de81a5edae.1741718157.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
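Conceptually the fast path reduces to a single pointer load plus a port compare; a sketch, where the netns field and helper names are assumptions rather than the patch's actual identifiers:

    static struct sock *udp_tunnel_gro_sk_sketch(struct net *net, __be16 dport)
    {
            struct sock *sk = READ_ONCE(net->ipv4.udp_tunnel_gro_sk);  /* assumed field */

            if (sk && inet_sk(sk)->inet_num == ntohs(dport))
                    return sk;      /* the single wildcard tunnel socket owns this port */
            return NULL;            /* fall back to the regular GRO socket lookup */
    }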
2025-03-18  net: mana: Support holes in device list reply msg  (Haiyang Zhang)
According to GDMA protocol, holes (zeros) are allowed at the beginning or middle of the gdma_list_devices_resp message. The existing code cannot properly handle this, and may miss some devices in the list. To fix, scan the entire list until the num_of_devs are found, or until the end of the list. Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Shradha Gupta <shradhagupta@microsoft.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741723974-1534-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
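The fix boils down to scanning the whole array and skipping zeroed slots instead of stopping at the first one; a rough sketch (array bound, field names and the registration helper are illustrative, not the exact GDMA definitions):

    u32 i, found = 0;

    for (i = 0; i < GDMA_LIST_MAX && found < resp->num_of_devs; i++) {
            struct gdma_dev_id dev_id = resp->devs[i];

            if (dev_id.type == GDMA_DEVICE_NONE)    /* hole: skip it, don't stop */
                    continue;

            mana_register_device_sketch(dev_id);    /* placeholder for the real handling */
            found++;
    }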
2025-03-18  RDMA/mlx5: Support optional-counters binding for QPs  (Patrisious Haddad)
Add support for optional-counters binding to a QP: when a bind operation is requested, the driver determines, based on the optional-counter binding state, whether to also add optional counters to this QP binding. The optional-counter binding is done by simply adding a steering rule for the specific optional-counter condition with an additional match on that QP number. Note that optional-counter per-QP rules are handled at an earlier prio than per-device counters. Per-device counter correctness is maintained by the core, which is responsible for summing active counters when checking a device counter and for adding them to the history count when they are deallocated. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/2cad1b891a6641ae61fe8d92f867e1059121813a.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-18  RDMA/core: Pass port to counter bind/unbind operations  (Patrisious Haddad)
This will be useful for the next patches in the series since port number is needed for optional counters binding and unbinding. Note that this change is needed since when the operation is done qp->port isn't necessarily initialized yet and can't be used. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/b6f6797844acbd517358e8d2a270ea9b3e6ecba1.1741875070.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>