linux.git - Linus' kernel tree

Age	Commit message	Author
2025-03-13	net/mlx5: Fix incorrect IRQ pool usage when releasing IRQs	Shay Drory
	mlx5_irq_pool_get() is a getter for completion IRQ pool only. However, after the cited commit, mlx5_irq_pool_get() is called during ctrl IRQ release flow to retrieve the pool, resulting in the use of an incorrect IRQ pool. Hence, use the newly introduced mlx5_irq_get_pool() getter to retrieve the correct IRQ pool based on the IRQ itself. While at it, rename mlx5_irq_pool_get() to mlx5_irq_table_get_comp_irq_pool() which accurately reflects its purpose and improves code readability. Fixes: 0477d5168bbb ("net/mlx5: Expose SFs IRQs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1741644104-97767-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net/mlx5: HWS, Rightsize bwc matcher priority	Vlad Dogaru
	The bwc layer was clamping the matcher priority from 32 bits to 16 bits. This didn't show up until a matcher was resized, since the initial native matcher was created using the correct 32 bit value. The fix also reorders fields to avoid some padding. Fixes: 2111bb970c78 ("net/mlx5: HWS, added backward-compatible API handling") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741644104-97767-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net/mlx5: DR, use the right action structs for STEv3	Yevgeny Kliteynik
	Some actions in ConnectX-8 (STEv3) have a different structure, and they are handled separately in ste_ctx_v3. This separate handling was missing two actions: INSERT_HDR and REMOVE_HDR, which broke SWS for Linux Bridge. This patch resolves the issue by introducing dedicated callbacks for the insert and remove header functions, with version-specific implementations for each STE variant. Fixes: 4d617b57574f ("net/mlx5: DR, add support for ConnectX-8 steering") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1741644104-97767-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux	Paolo Abeni
	Tariq Toukan says: ==================== mlx5-next updates 2025-03-10 The following pull-request contains common mlx5 updates for your net-next tree. Please pull and let me know of any problem. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add IFC bits for PPCNT recovery counters group net/mlx5: fs, add RDMA TRANSPORT steering domain support net/mlx5: Query ADV_RDMA capabilities net/mlx5: Limit non-privileged commands net/mlx5: Allow the throttle mechanism to be more dynamic net/mlx5: Add RDMA_CTRL HW capabilities ==================== Link: https://patch.msgid.link/1741608293-41436-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	dt-bindings: net: Define interrupt constraints for DWMAC vendor bindings	Lad Prabhakar
	The `snps,dwmac.yaml` binding currently sets `maxItems: 3` for the `interrupts` and `interrupt-names` properties, but vendor bindings selecting `snps,dwmac.yaml` do not impose these limits. Define constraints for `interrupts` and `interrupt-names` properties in various DWMAC vendor bindings to ensure proper validation and consistency. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Link: https://patch.msgid.link/20250309003301.1152228-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	Merge branch 'net-stmmac-dwmac-rk-validate-grf-and-peripheral-grf-during-probe'	Paolo Abeni
	Jonas Karlman says: ==================== net: stmmac: dwmac-rk: Validate GRF and peripheral GRF during probe All Rockchip GMAC variants typically write to GRF regs to control e.g. interface mode, speed and MAC rx/tx delay. Newer SoCs such as RK3576 and RK3588 use a mix of GRF and peripheral GRF regs. These syscon regmaps are located with help of a rockchip,grf and rockchip,php-grf phandle. However, validating the rockchip,grf and rockchip,php-grf syscon regmap is deferred until e.g. interface mode or speed is configured. This series validates the GRF and peripheral GRF syscon regmap at probe time to help simplify the SoC specific operations. This should not introduce any backward compatibility issues as all GMAC nodes have been added together with a rockchip,grf phandle (and rockchip,php-grf where required) in their initial commit. ==================== Link: https://patch.msgid.link/20250308213720.2517944-1-jonas@kwiboo.se Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net: stmmac: dwmac-rk: Remove unneeded GRF and peripheral GRF checks	Jonas Karlman
	Now that GRF, and peripheral GRF where needed, is validated at probe time there is no longer any need to check and log an error in each SoC specific operation. Remove unneeded IS_ERR() checks and early bail out from each SoC specific operation. Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250308213720.2517944-4-jonas@kwiboo.se Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net: stmmac: dwmac-rk: Validate GRF and peripheral GRF during probe	Jonas Karlman
	All Rockchip GMAC variants typically write to GRF regs to control e.g. interface mode, speed and MAC rx/tx delay. Newer SoCs such as RK3576 and RK3588 use a mix of GRF and peripheral GRF regs. These syscon regmaps are located with help of a rockchip,grf and rockchip,php-grf phandle. However, validating the rockchip,grf and rockchip,php-grf syscon regmap is deferred until e.g. interface mode or speed is configured, inside the individual SoC specific operations. Change to validate the rockchip,grf and rockchip,php-grf syscon regmap at probe time to simplify all SoC specific operations. This should not introduce any backward compatibility issues as all GMAC nodes have been added together with a rockchip,grf phandle (and rockchip,php-grf where required) in their initial commit. Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250308213720.2517944-3-jonas@kwiboo.se Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	dt-bindings: net: rockchip-dwmac: Require rockchip,grf and rockchip,php-grf	Jonas Karlman
	All Rockchip GMAC variants typically write to GRF regs to control e.g. interface mode, speed and MAC rx/tx delay. Newer SoCs such as RK3562, RK3576 and RK3588 use a mix of GRF and peripheral GRF regs. Prior to the commit b331b8ef86f0 ("dt-bindings: net: convert rockchip-dwmac to json-schema") the property rockchip,grf was listed under "Required properties". During the conversion this was lost and rockchip,grf has since then incorrectly been treated as optional and not as required. Similarly, when rockchip,php-grf was added to the schema in the commit a2b77831427c ("dt-bindings: net: rockchip-dwmac: add rk3588 gmac compatible") it also incorrectly has been treated as optional for all GMAC variants, when it should have been required for RK3588, and later also for RK3576. Update this binding to require rockchip,grf and rockchip,php-grf to properly reflect that GRF (and peripheral GRF for RK3576/RK3588) is required to control part of GMAC. This should not introduce any breakage as all Rockchip GMAC nodes have been added together with a rockchip,grf phandle (and rockchip,php-grf where required) in their initial commit. Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250308213720.2517944-2-jonas@kwiboo.se Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	Revert "openvswitch: switch to per-action label counting in conntrack"	Xin Long
	Currently, ovs_ct_set_labels() is only called for confirmed conntrack entries (ct) within ovs_ct_commit(). However, if the conntrack entry does not have the labels_ext extension, attempting to allocate it in ovs_ct_get_conn_labels() for a confirmed entry triggers a warning in nf_ct_ext_add(): WARN_ON(nf_ct_is_confirmed(ct)); This happens when the conntrack entry is created externally before OVS increments net->ct.labels_used. The issue has become more likely since commit fcb1aa5163b1 ("openvswitch: switch to per-action label counting in conntrack"), which changed to use per-action label counting and increment net->ct.labels_used when a flow with ct action is added. Since there’s no straightforward way to fully resolve this issue at the moment, this reverts the commit to avoid breaking existing use cases. Fixes: fcb1aa5163b1 ("openvswitch: switch to per-action label counting in conntrack") Reported-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/1bdeb2f3a812bca016a225d3de714427b2cd4772.1741457143.git.lucien.xin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net: openvswitch: remove misbehaving actions length check	Ilya Maximets
	The actions length check is unreliable and produces different results depending on the initial length of the provided netlink attribute and the composition of the actual actions inside of it. For example, a user can add 4088 empty clone() actions without triggering -EMSGSIZE, but an attempt to add 4089 such actions will fail with the -EMSGSIZE verdict. However, if another 16 KB of other actions are appended to the previous 4089 clone() actions, the check passes and the flow is successfully installed into the openvswitch datapath. The reason for such weird behavior is the way memory is allocated. When ovs_flow_cmd_new() is invoked, it calls ovs_nla_copy_actions(), which in turn calls nla_alloc_flow_actions() with either the actual length of the user-provided actions or the MAX_ACTIONS_BUFSIZE. The function adds the size of the sw_flow_actions structure and then the actually allocated memory is rounded up to the closest power of two. So, if the user-provided actions are larger than MAX_ACTIONS_BUFSIZE, then MAX_ACTIONS_BUFSIZE + sizeof(*sfa) rounded up is 32K + 24 -> 64K. Later, while copying individual actions, we look at ksize(), which is 64K, so this way the MAX_ACTIONS_BUFSIZE check is not actually triggered and the user can easily allocate almost 64 KB of actions. However, when the initial size is less than MAX_ACTIONS_BUFSIZE, but the actions contain ones that require size increase while copying (such as clone() or sample()), then the limit check will be performed during the reserve_sfa_size() and the user will not be allowed to create actions that yield more than 32 KB internally. This is one part of the problem. The other part is that it's not actually possible for the userspace application to know beforehand if the particular set of actions will be rejected or not. Certain actions require more space in the internal representation, e.g. 
an empty clone() takes 4 bytes in the action list passed in by the user, but it takes 12 bytes in the internal representation due to an extra nested attribute, and some actions require less space in the internal representations, e.g. set(tunnel(..)) normally takes 64+ bytes in the action list provided by the user, but only needs to store a single pointer in the internal implementation, since all the data is stored in the tunnel_info structure instead. And the action size limit is applied to the internal representation, not to the action list passed by the user. So, it's not possible for the userspace application to predict if a certain combination of actions will be rejected or not, because it is not possible for it to calculate how much space these actions will take in the internal representation without knowing kernel internals. All that is causing random failures in ovs-vswitchd in userspace and inability to handle certain traffic patterns as a result. For example, it is reported that adding a bit more than 1100 VMs in an OpenStack setup breaks the network due to OVS not being able to handle ARP traffic anymore in some cases (it tries to install a proper datapath flow, but the kernel rejects it with -EMSGSIZE, even though the action list isn't actually that large.) Kernel behavior must be consistent and predictable in order for the userspace application to use it in a reasonable way. ovs-vswitchd has a mechanism to re-direct parts of the traffic and partially handle it in userspace if the required action list is oversized, but that doesn't work properly if we can't actually tell if the action list is oversized or not. The solution for this is to check the size of the user-provided actions instead of the internal representation. This commit just removes the check from the internal part because there is already an implicit size check imposed by the netlink protocol. The attribute can't be larger than 64 KB. 
Realistically, we could reduce the limit to 32 KB, but we would risk breaking some existing setups that rely on the fact that it's possible to create nearly 64 KB action lists today. The vast majority of flows in real setups are below 100-ish bytes. So removal of the limit will not change real memory consumption on the system. The absolutely worst case scenario is if someone adds a flow with 64 KB of empty clone() actions. That will yield 192 KB in the internal representation, consuming a 256 KB block of memory. However, that list of actions is not meaningful and also a no-op. Real world very large action lists (that can occur in rare cases of BUM traffic handling) are unlikely to contain a large number of clones and will likely have a lot of tunnel attributes making the internal representation comparable in size to the original action list. So, it should be fine to just remove the limit. Commit in the 'Fixes' tag is the first one that introduced the difference between internal representation and the user-provided action lists, but there were many more afterwards that led to the situation we have today. Fixes: 7d5437c709de ("openvswitch: Add tunneling interface.") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20250308004609.2881861-1-i.maximets@ovn.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
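The size figures quoted in the message above can be checked with a short user-space sketch. This is not kernel code: `roundup_pow_of_two` and `allocated_size` are plain Python helpers written for illustration, the 24-byte header comes from the "32K + 24" figure in the message, and the 4-byte and 12-byte sizes for an empty clone() are the ones the message quotes. It also assumes, as the message describes, that the allocation is rounded up to the next power of two.

```python
# Illustrative arithmetic only; all constants are taken from the commit message.
MAX_ACTIONS_BUFSIZE = 32 * 1024   # limit referred to in the message
SFA_HEADER = 24                   # sizeof(*sfa), per "32K + 24" in the message
CLONE_USER = 4                    # empty clone() as passed in by userspace
CLONE_INTERNAL = 12               # empty clone() in the internal representation


def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two >= n (hypothetical helper, valid for n >= 1)."""
    return 1 << (n - 1).bit_length()


def allocated_size(payload: int) -> int:
    """Block size after adding the header and rounding up, as the message describes."""
    return roundup_pow_of_two(payload + SFA_HEADER)


# Initial allocation when the user-provided actions exceed the limit.
initial_block = allocated_size(MAX_ACTIONS_BUFSIZE)

# Worst case from the message: a 64 KB attribute filled with empty clone() actions.
clones = (64 * 1024) // CLONE_USER
internal_bytes = clones * CLONE_INTERNAL
worst_block = allocated_size(internal_bytes)

print(initial_block // 1024, internal_bytes // 1024, worst_block // 1024)
```

Under these assumptions the three values come out as 64, 192 and 256 (in KB), which agrees with the "32K + 24 -> 64K" and "192 KB ... 256 KB block" figures in the message.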
2025-03-13	Merge branch 'gre-fix-regressions-in-ipv6-link-local-address-generation'	Paolo Abeni
	Guillaume Nault says: ==================== gre: Fix regressions in IPv6 link-local address generation. IPv6 link-local address generation has some special cases for GRE devices. This has led to several regressions in the past, and some of them are still not fixed. This series fixes the remaining problems, like the ipv6.conf.<dev>.addr_gen_mode sysctl being ignored and the router discovery process not being started (see details in patch 1). To avoid any further regressions, patch 2 adds selftests covering IPv4 and IPv6 gre/gretap devices with all combinations of currently supported addr_gen_mode values. ==================== Link: https://patch.msgid.link/cover.1741375285.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	selftests: Add IPv6 link-local address generation tests for GRE devices.	Guillaume Nault
	GRE devices have their special code for IPv6 link-local address generation that has been the source of several regressions in the past. Add a selftest to check that all gre, ip6gre, gretap and ip6gretap devices get an IPv6 link-local address in accordance with the net.ipv6.conf.<dev>.addr_gen_mode sysctl. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/2d6772af8e1da9016b2180ec3f8d9ee99f470c77.1741375285.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	gre: Fix IPv6 link-local address generation.	Guillaume Nault
	Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE devices in most cases and fall back to using add_v4_addrs() only in case the GRE configuration is incompatible with addrconf_addr_gen(). GRE used to use addrconf_addr_gen() until commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") restricted this use to gretap and ip6gretap devices, and created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones. The original problem came when commit 9af28511be10 ("addrconf: refuse isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its addr parameter was 0. The commit says that this would create an invalid address, however, I couldn't find any RFC saying that the generated interface identifier would be wrong. Anyway, since gre over IPv4 devices pass their local tunnel address to __ipv6_isatap_ifid(), that commit broke their IPv6 link-local address generation when the local address was unspecified. Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") tried to fix that case by defining add_v4_addrs() and calling it to generate the IPv6 link-local address instead of using addrconf_addr_gen() (apart from gretap and ip6gretap devices, which would still use the regular addrconf_addr_gen(), since they have a MAC address). That broke several use cases because add_v4_addrs() isn't properly integrated into the rest of IPv6 Neighbor Discovery code. Several of these shortcomings have been fixed over time, but add_v4_addrs() remains broken in several aspects. In particular, it doesn't send any Router Solicitations, so the SLAAC process doesn't start until the interface receives a Router Advertisement. Also, add_v4_addrs() mostly ignores the address generation mode of the interface (/proc/sys/net/ipv6/conf/<dev>/addr_gen_mode), thus breaking the IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases. 
Fix the situation by using add_v4_addrs() only in the specific scenario where the normal method would fail. That is, for interfaces that have all of the following characteristics: * run over IPv4, * transport IP packets directly, not Ethernet (that is, not gretap interfaces), * tunnel endpoint is INADDR_ANY (that is, 0), * device address generation mode is EUI64. In all other cases, revert back to the regular addrconf_addr_gen(). Also, remove the special case for ip6gre interfaces in add_v4_addrs(), since ip6gre devices now always use addrconf_addr_gen() instead. Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/559c32ce5c9976b269e6337ac9abb6a96abe5096.1741375285.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
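The four conditions listed above amount to a single predicate. The sketch below restates them in user-space Python purely as an illustration; it is not the kernel implementation, and the function name, parameter names and the `AddrGenMode` enum are hypothetical stand-ins for the device properties the message refers to.

```python
from enum import Enum


class AddrGenMode(Enum):
    """Hypothetical stand-in for the addr_gen_mode values named in the message."""
    EUI64 = "eui64"
    STABLE_PRIVACY = "stable_privacy"
    RANDOM = "random"


INADDR_ANY = 0  # unspecified IPv4 address, as stated in the message


def uses_add_v4_addrs(over_ipv4: bool, carries_ethernet: bool,
                      local_addr: int, mode: AddrGenMode) -> bool:
    """True only when every condition from the message holds at once.

    In all other cases the message says the regular addrconf_addr_gen()
    path is used instead.
    """
    return (over_ipv4
            and not carries_ethernet          # plain gre, not gretap
            and local_addr == INADDR_ANY      # tunnel endpoint unspecified
            and mode is AddrGenMode.EUI64)


# A plain gre device over IPv4 with no local address and EUI64 mode.
print(uses_add_v4_addrs(True, False, INADDR_ANY, AddrGenMode.EUI64))
# The same device with a configured local address (0xC0000201 is 192.0.2.1).
print(uses_add_v4_addrs(True, False, 0xC0000201, AddrGenMode.EUI64))
```

Changing any one of the four inputs away from the first example flips the result to the regular path, which is the narrowing the commit describes.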
2025-03-13	net: hsr: Add KUnit test for PRP	Jaakko Karrenpalo
	Add unit tests for the PRP duplicate detection. Signed-off-by: Jaakko Karrenpalo <jkarrenpalo@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307161700.1045-2-jkarrenpalo@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net: hsr: Fix PRP duplicate detection	Jaakko Karrenpalo
	Add PRP specific function for handling duplicate packets. This is needed because of potential L2 802.1p prioritization done by network switches. The L2 prioritization can re-order the PRP packets from a node causing the existing implementation to discard the frame(s) that have been received 'late' because the sequence number is before the previous received packet. This can happen if the node is sending multiple frames back-to-back with different priority. Signed-off-by: Jaakko Karrenpalo <jkarrenpalo@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307161700.1045-1-jkarrenpalo@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
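The failure mode described above can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the kernel code and not the algorithm the patch introduces: `last_seq_accept` models a detector that only remembers the last accepted sequence number, and `seen_set_accept` is a deliberately simple contrast that remembers every sequence number it has seen. Both function names are hypothetical, and sequence-number wraparound is ignored.

```python
def last_seq_accept(arrivals):
    """Accept a frame only if its sequence number is after the last accepted one."""
    accepted = []
    last = None
    for seq in arrivals:
        if last is None or seq > last:
            accepted.append(seq)
            last = seq
        # Otherwise the frame is treated as a duplicate and discarded,
        # even if it is merely a late, never-seen frame.
    return accepted


def seen_set_accept(arrivals):
    """Accept a frame the first time its sequence number is seen (contrast case)."""
    seen = set()
    accepted = []
    for seq in arrivals:
        if seq not in seen:
            seen.add(seq)
            accepted.append(seq)
    return accepted


# Frame 2 was sent with a lower 802.1p priority and overtaken by frame 3;
# the copies from the second LAN arrive afterwards.
arrivals = [1, 3, 2, 1, 2, 3]
print(last_seq_accept(arrivals))
print(seen_set_accept(arrivals))
```

In this model the first detector never delivers frame 2, while the second delivers each frame exactly once; the real fix has to achieve the latter within bounded memory and with wrapping sequence numbers, which this sketch does not attempt.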
2025-03-13	netfilter: nft_exthdr: fix offset with ipv4_find_option()	Alexey Kashavkin
	There is an incorrect calculation in the offset variable which causes the nft_skb_copy_to_reg() function to always return -EFAULT. Adding the start variable is redundant. In the __ip_options_compile() function the correct offset is specified when finding the function. There is no need to add the size of the iphdr structure to the offset. Fixes: dbb5281a1f84 ("netfilter: nf_tables: add support for matching IPv4 options") Signed-off-by: Alexey Kashavkin <akashavkin@gmail.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-03-13	net: cn23xx: fix typos	Janik Haag
	This patch fixes a few typos, spelling mistakes, and a bit of grammar, improving the readability of the comments. Signed-off-by: Janik Haag <janik@aq0.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307145648.1679912-2-janik@aq0.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	net: hns3: use string choices helper	Jian Shen
	Use string choices helper for better readability. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250307113733.819448-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13	reset: mchp: sparx5: Fix for lan966x	Horatiu Vultur
	With the blamed commit, lan966x no longer seems to boot when the internal CPU is used. The reason seems to be the usage of devm_of_iomap; replacing it with devm_ioremap seems to fix the issue, as we use the same region also for other devices. Fixes: 0426a920d6269c ("reset: mchp: sparx5: Map cpu-syscon locally in case of LAN966x") Reviewed-by: Herve Codina <herve.codina@bootlin.com> Tested-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://lore.kernel.org/r/20250227105502.25125-1-horatiu.vultur@microchip.com Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2025-03-13	gpio: cdev: use raw notifier for line state events	Bartosz Golaszewski
	We use a notifier to implement the mechanism of informing the user-space about changes in GPIO line status. We register with the notifier when the GPIO character device file is opened and unregister when the last reference to the associated file descriptor is dropped. Since commit fcc8b637c542 ("gpiolib: switch the line state notifier to atomic") we use the atomic notifier variant. Atomic notifiers call synchronize_rcu() in atomic_notifier_chain_unregister(), which caused a significant performance regression in some circumstances, observed by user-space when calling close() on the GPIO device file descriptor. Replace the atomic notifier with the raw variant and provide synchronization with a read-write spinlock. Fixes: fcc8b637c542 ("gpiolib: switch the line state notifier to atomic") Reported-by: David Jander <david@protonic.nl> Closes: https://lore.kernel.org/all/20250311110034.53959031@erd003.prtnl/ Tested-by: David Jander <david@protonic.nl> Tested-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20250311-gpiolib-line-state-raw-notifier-v2-1-138374581e1e@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-13	gpiolib: don't check the retval of get_direction() when registering a chip	Bartosz Golaszewski
	During chip registration we should neither check the return value of gc->get_direction() nor hold the SRCU lock when calling it. The former is because pin controllers may have pins set to alternate functions and return errors from their get_direction() callbacks. That's alright - we should default to the safe INPUT state and not bail-out. The latter is not needed because we haven't registered the chip yet so there's nothing to protect against dynamic removal. In fact: we currently hit a lockdep splat. Revert to calling the gc->get_direction() callback directly and not checking its value. Fixes: 9d846b1aebbe ("gpiolib: check the return value of gpio_chip::get_direction()") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Closes: https://lore.kernel.org/all/81f890fc-6688-42f0-9756-567efc8bb97a@samsung.com/ Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20250226-retval-fixes-v2-1-c8dc57182441@linaro.org Tested-by: Gene C <arch@sapience.com> Link: https://lore.kernel.org/r/20250311175631.83779-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-13	Merge tag 'asoc-fix-v6.14-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus	Takashi Iwai
	ASoC: Fixes for v6.14 The bulk of this is driver specific fixes, mostly unremarkable. There's also one core fix from Charles, fixing up confusion around the limiting of maximum control values.
2025-03-13	bcachefs: fix tiny leak in bch2_dev_add()	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-12	Merge tag 'sched_ext-for-6.14-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext	Linus Torvalds
	Pull sched_ext fix from Tejun Heo: "BPF schedulers could trigger a crash by passing in an invalid CPU to the scx_bpf_select_cpu_dfl() helper. Fix it by verifying input validity" * tag 'sched_ext-for-6.14-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()
2025-03-12	Merge tag 'spi-fix-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi	Linus Torvalds
	Pull spi fixes from Mark Brown: "A couple of driver specific fixes, an error handling fix for the Atmel QuadSPI driver and a fix for a nasty synchronisation issue in the data path for the Microchip driver which affects larger transfers. There's also a MAINTAINERS update for the Samsung driver" * tag 'spi-fix-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: microchip-core: prevent RX overflows when transmit size > FIFO size MAINTAINERS: add tambarus as R for Samsung SPI spi: atmel-quadspi: remove references to runtime PM on error path
2025-03-12	netdevsim: 'support' multi-buf XDP	Jakub Kicinski
	Don't error out on large MTU if XDP is multi-buf. The ping test now tests ping with XDP and high MTU. netdevsim doesn't actually run the prog (yet?) so it doesn't matter if the prog was multi-buf. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/20250311092820.542148-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	Merge branch 'net-remove-rtnl_lock-from-the-callers-of-queue-apis'	Jakub Kicinski
	Stanislav Fomichev says: ==================== net: remove rtnl_lock from the callers of queue APIs All drivers that use queue management APIs already depend on the netdev lock. Ultimately, we want to have most of the paths that work with specific netdev to be rtnl_lock-free (ethtool mostly in particular). Queue API currently has a much smaller API surface, so start by removing rtnl_lock from it: - add mutex to each dmabuf binding (to replace rtnl_lock) - move netdev lock management to the callers of netdev_rx_queue_restart and drop rtnl_lock ==================== Link: https://patch.msgid.link/20250311144026.4154277-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net: drop rtnl_lock for queue_mgmt operations	Stanislav Fomichev
	All drivers that use queue API are already converted to use netdev instance lock. Move netdev instance lock management to the netlink layer and drop rtnl_lock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-4-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net: add granular lock for the netdev netlink socket	Stanislav Fomichev
	As we move away from rtnl_lock for queue ops, introduce per-netdev_nl_sock lock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net: create netdev_nl_sock to wrap bindings list	Stanislav Fomichev
	No functional changes. Next patches will add more granular locking to netdev_nl_sock. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250311144026.4154277-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net/mlx5: Avoid unnecessary use of comma operator	Simon Horman
	Although it does not seem to have any untoward side-effects, the use of ';' to separate two assignments seems more appropriate than ','. Flagged by clang-19 -Wcomma. No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250307-mlx5-comma-v1-1-934deb6927bb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	selftests: net: bump GRO timeout for gro/setup_veth	Jakub Kicinski
	Commit 51bef03e1a71 ("selftests/net: deflake GRO tests") recently switched to NAPI suspension, and lowered the timeout from 1ms to 100us. This started causing flakes in netdev-run CI. Let's bump it to 200us. In a quick test of a debug kernel I see failures with 100us, with 200us in 5 runs I see 2 completely clean runs and 3 with a single retry (GRO test will retry up to 5 times). Reviewed-by: Kevin Krakauer <krakauer@google.com> Link: https://patch.msgid.link/20250310110821.385621-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	eth: bnxt: add missing netdev lock management to bnxt_dl_reload_up	Stanislav Fomichev
	bnxt_dl_reload_up is completely missing instance lock management which can result in `devlink dev reload` leaving with instance lock held. Add the missing calls. Also add netdev_assert_locked to make it clear that the up() method is running with the instance lock grabbed. v2: - add net/netdev_lock.h include to bnxt_devlink.c for netdev_assert_locked Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	eth: bnxt: request unconditional ops lock	Stanislav Fomichev
	netdev_lock_ops conditionally grabs instance lock when queue_mgmt_ops is defined. However queue_mgmt_ops support is signaled via FW so we can sometimes boot without queue_mgmt_ops being set. This will result in bnxt running without instance lock which the driver now heavily depends on. Set request_ops_lock to true unconditionally to always request netdev instance lock. Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	eth: bnxt: switch to netif_close	Stanislav Fomichev
	All (error) paths that call dev_close are already holding instance lock, so switch to netif_close to avoid the deadlock. v2: - add missing EXPORT_MODULE for netif_close Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250309215851.2003708-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT	Stanislav Fomichev
	There are a couple of places from which we can arrive at ndo_setup_tc with TC_SETUP_BLOCK/TC_SETUP_FT: - netlink - netlink notifier - netdev notifier Locking netdev too deep in this call chain seems to be problematic (especially assuming some/all of the call_netdevice_notifiers NETDEV_UNREGISTER might soon be running with the instance lock). Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT framework already takes care of most of the locking. Document the assumptions. ndo_setup_tc TC_SETUP_BLOCK nft_block_offload_cmd nft_chain_offload_cmd nft_flow_block_chain nft_flow_offload_chain nft_flow_rule_offload_abort nft_flow_rule_offload_commit nft_flow_rule_offload_commit nf_tables_commit nfnetlink_rcv_batch nfnetlink_rcv_skb_batch nfnetlink_rcv nft_offload_netdev_event NETDEV_UNREGISTER notifier ndo_setup_tc TC_SETUP_FT nf_flow_table_offload_cmd nf_flow_table_offload_setup nft_unregister_flowtable_hook nft_register_flowtable_net_hooks nft_flowtable_update nf_tables_newflowtable nfnetlink_rcv_batch (.call NFNL_CB_BATCH) nft_flowtable_update nf_tables_newflowtable nft_flowtable_event nf_tables_flowtable_event NETDEV_UNREGISTER notifier __nft_unregister_flowtable_net_hooks nft_unregister_flowtable_net_hooks nf_tables_commit nfnetlink_rcv_batch (.call NFNL_CB_BATCH) __nf_tables_abort nf_tables_abort nfnetlink_rcv_batch __nft_release_hook __nft_release_hooks nf_tables_pre_exit_net -> module unload nft_rcv_nl_event netlink_register_notifier (oh boy) nft_register_flowtable_net_hooks nft_flowtable_update nf_tables_newflowtable nf_tables_newflowtable Fixes: c4f0f30b424e ("net: hold netdev instance lock during nft ndo_setup_tc") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250308044726.1193222-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	Merge branch 'net_sched-prevent-creation-of-classes-with-tc_h_root'	Jakub Kicinski
	Cong Wang says: ==================== net_sched: Prevent creation of classes with TC_H_ROOT This patchset contains a bug fix and its TDC test case. ==================== Link: https://patch.msgid.link/20250306232355.93864-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	selftests/tc-testing: Add a test case for DRR class with TC_H_ROOT	Cong Wang
	Integrate the reproducer from Mingi to TDC. All test results: 1..4 ok 1 0385 - Create DRR with default setting ok 2 2375 - Delete DRR with handle ok 3 3092 - Show DRR class ok 4 4009 - Reject creation of DRR class with classid TC_H_ROOT Cc: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250306232355.93864-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12	net_sched: Prevent creation of classes with TC_H_ROOT	Cong Wang
	The function qdisc_tree_reduce_backlog() uses TC_H_ROOT as a termination condition when traversing up the qdisc tree to update parent backlog counters. However, if a class is created with classid TC_H_ROOT, the traversal terminates prematurely at this class instead of reaching the actual root qdisc, causing parent statistics to be incorrectly maintained. In case of DRR, this could lead to a crash as reported by Mingi Cho. Prevent the creation of any Qdisc class with classid TC_H_ROOT (0xFFFFFFFF) across all qdisc types, as suggested by Jamal. Reported-by: Mingi Cho <mincho@theori.io> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 066a3b5b2346 ("[NET_SCHED] sch_api: fix qdisc_tree_decrease_qlen() loop") Link: https://patch.msgid.link/20250306232355.93864-2-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
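The premature termination described in the commit above can be illustrated with a toy userspace model. This is a sketch under assumptions, not the kernel code: `struct toy_qdisc`, `toy_reduce_backlog()` and `toy_check_classid()` are hypothetical, and only the value of `TC_H_ROOT` (0xFFFFFFFF) comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TC_H_ROOT 0xFFFFFFFFU /* value quoted in the commit message */

/* Toy qdisc: 'parent' is the handle it hangs under, 'up' the owning qdisc. */
struct toy_qdisc {
	uint32_t parent;
	int backlog;
	struct toy_qdisc *up;
};

/* Walk towards the root, subtracting n from every ancestor's backlog.
 * Stops when the parent handle equals TC_H_ROOT.
 * Returns how many ancestors were updated. */
static int toy_reduce_backlog(struct toy_qdisc *q, int n)
{
	int updated = 0;

	assert(q != NULL);
	while (q->parent != TC_H_ROOT && q->up != NULL) {
		q = q->up;
		q->backlog -= n;
		updated++;
	}
	return updated;
}

/* Hypothetical validation in the spirit of the fix: refuse the reserved id. */
static int toy_check_classid(uint32_t classid)
{
	return classid == TC_H_ROOT ? -EINVAL : 0;
}
```

A child attached under a class whose classid is `TC_H_ROOT` looks like the root to the walk, so its ancestors are never updated. Rejecting that classid at creation time keeps the termination condition unambiguous.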
2025-03-12	drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags	Natalie Vock
	PRT BOs may not have any backing store, so bo->tbo.resource will be NULL. Check for that before dereferencing. Fixes: 0cce5f285d9a ("drm/amdkfd: Check correct memory types for is_system variable") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Natalie Vock <natalie.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3e3fcd29b505cebed659311337ea03b7698767fc) Cc: stable@vger.kernel.org # 6.12.x
2025-03-12	drm/amd/amdkfd: Evict all queues even HWS remove queue failed	Yifan Zha
	[Why] If a reset is detected and kfd needs to evict working queues, the HWS remove-queue operation fails. The remaining queues are then not evicted and stay active. After the reset completes, kfd uses HWS to terminate the remaining active queues, but HWS has been reset, so removing the queues fails again. [How] Keep removing all queues even if HWS returns a failure. This does not affect cpsch, as it checks reset_domain->sem. v2: If any queue failed, evict queue returns error. v3: Declare err inside the if-block. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 42c854b8fb0cce512534aa2b7141948e80c6ebb0) Cc: stable@vger.kernel.org
2025-03-12	RDMA/hns: Fix wrong value of max_sge_rd	Junxian Huang
	There is no difference between the sge of READ and non-READ operations in hns RoCE. Set max_sge_rd to the same value as max_send_sge. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-8-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-12	RDMA/hns: Fix missing xa_destroy()	Junxian Huang
	Add xa_destroy() for xarray in driver. Fixes: 5c1f167af112 ("RDMA/hns: Init SRQ table for hip08") Fixes: 27e19f451089 ("RDMA/hns: Convert cq_table to XArray") Fixes: 736b5a70db98 ("RDMA/hns: Convert qp_table_tree to XArray") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-7-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-12	RDMA/hns: Fix a missing rollback in error path of hns_roce_create_qp_common()	Junxian Huang
	When ib_copy_to_udata() fails in hns_roce_create_qp_common(), hns_roce_qp_remove() should be called in the error path to clean up resources in hns_roce_qp_store(). Fixes: 0f00571f9433 ("RDMA/hns: Use new SQ doorbell register for HIP09") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
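The rollback pattern described in the commit above can be sketched with the usual goto-based unwinding. All names here (`toy_qp_table`, `toy_qp_store()`, `toy_qp_remove()`, `toy_copy_to_user()`) are hypothetical stand-ins, not the hns driver functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_qp_table {
	int stored; /* number of QPs currently registered */
};

static void toy_qp_store(struct toy_qp_table *t)  { t->stored++; }
static void toy_qp_remove(struct toy_qp_table *t) { t->stored--; }

/* Stand-in for a copy to userspace that may fail. */
static int toy_copy_to_user(bool fail)
{
	return fail ? -EFAULT : 0;
}

static int toy_create_qp(struct toy_qp_table *t, bool copy_fails)
{
	int ret;

	assert(t != NULL);
	toy_qp_store(t);

	ret = toy_copy_to_user(copy_fails);
	if (ret)
		goto err_store; /* undo the store before reporting the error */

	return 0;

err_store:
	toy_qp_remove(t);
	return ret;
}
```

Without the `err_store` step, a failed copy would leave a stale entry in the table.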
2025-03-12	RDMA/hns: Fix invalid sq params not being blocked	Junxian Huang
	SQ params from userspace are checked by set_user_sq_size(). But when the check fails, the function doesn't return but instead keeps running and overwrites 'ret'. As a result, the invalid params are not actually blocked. Add a return right after the failed check. Besides, although the check result of kernel sq params will not be overwritten, to keep coding style unified, move default_congest_type() before set_kernel_sq_size(). Fixes: 6ec429d5887a ("RDMA/hns: Support userspace configuring congestion control algorithm with QP granularity") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
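The overwritten-'ret' problem in the commit above is a common pattern and can be shown in a few lines. This is a minimal sketch with hypothetical helpers (`toy_check_sq()`, `toy_later_step()`); it does not reproduce the driver's actual functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical parameter check: reject out-of-range work request counts. */
static int toy_check_sq(int wr, int max_wr)
{
	return (wr <= 0 || wr > max_wr) ? -EINVAL : 0;
}

/* Hypothetical later setup step that happens to succeed. */
static int toy_later_step(void)
{
	return 0;
}

/* Buggy shape: the check result is lost when 'ret' is reassigned. */
static int toy_setup_buggy(int wr, int max_wr)
{
	int ret;

	assert(max_wr > 0);
	ret = toy_check_sq(wr, max_wr);
	ret = toy_later_step(); /* overwrites the failure */
	return ret;
}

/* Fixed shape: return right after the failed check. */
static int toy_setup_fixed(int wr, int max_wr)
{
	int ret;

	assert(max_wr > 0);
	ret = toy_check_sq(wr, max_wr);
	if (ret)
		return ret;
	return toy_later_step();
}
```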
2025-03-12	RDMA/hns: Fix unmatched condition in error path of alloc_user_qp_db()	Junxian Huang
	Currently the condition for unmapping sdb in the error path is not exactly the same as the condition for mapping it in alloc_user_qp_db(). This may cause an unmapped db to be unmapped in some cases, such as when the QP is XRC TGT. Unify the two conditions. Fixes: 90ae0b57e4a5 ("RDMA/hns: Combine enable flags of qp") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-12	RDMA/hns: Fix soft lockup during bt pages loop	Junxian Huang
	Driver runs a for-loop when allocating bt pages and mapping them with buffer pages. When a large buffer (e.g. MR over 100GB) is being allocated, it may require a considerable loop count. This will lead to soft lockup: watchdog: BUG: soft lockup - CPU#27 stuck for 22s! ... Call trace: hem_list_alloc_mid_bt+0x124/0x394 [hns_roce_hw_v2] hns_roce_hem_list_request+0xf8/0x160 [hns_roce_hw_v2] hns_roce_mtr_create+0x2e4/0x360 [hns_roce_hw_v2] alloc_mr_pbl+0xd4/0x17c [hns_roce_hw_v2] hns_roce_reg_user_mr+0xf8/0x190 [hns_roce_hw_v2] ib_uverbs_reg_mr+0x118/0x290 watchdog: BUG: soft lockup - CPU#35 stuck for 23s! ... Call trace: hns_roce_hem_list_find_mtt+0x7c/0xb0 [hns_roce_hw_v2] mtr_map_bufs+0xc4/0x204 [hns_roce_hw_v2] hns_roce_mtr_create+0x31c/0x3c4 [hns_roce_hw_v2] alloc_mr_pbl+0xb0/0x160 [hns_roce_hw_v2] hns_roce_reg_user_mr+0x108/0x1c0 [hns_roce_hw_v2] ib_uverbs_reg_mr+0x120/0x2bc Add a cond_resched() to fix soft lockup during these loops. In order not to affect the allocation performance of normal-size buffer, set the loop count of a 100GB MR as the threshold to call cond_resched(). Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250311084857.3803665-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
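The idea of yielding only after a threshold number of iterations, as in the commit above, can be sketched in userspace. This is an assumption-laden illustration: `sched_yield()` stands in for the kernel's `cond_resched()`, `toy_map_pages()` is hypothetical, and the periodic yield shown here may differ from how the actual patch applies its threshold.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Process npages items, yielding the CPU once every 'threshold' iterations.
 * Returns how many times the loop yielded. */
static size_t toy_map_pages(size_t npages, size_t threshold)
{
	size_t since_yield = 0;
	size_t yields = 0;
	size_t i;

	assert(threshold > 0);
	for (i = 0; i < npages; i++) {
		/* per-page allocation or mapping work would go here */

		if (++since_yield >= threshold) {
			sched_yield(); /* userspace stand-in for cond_resched() */
			since_yield = 0;
			yields++;
		}
	}
	return yields;
}
```

Short loops never reach the threshold, so normal-size allocations pay no extra cost in this model.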
2025-03-12	fsnotify: add pre-content hooks on mmap()	Amir Goldstein
	Pre-content hooks in page faults introduce a potential deadlock of the HSM handler in userspace with filesystem freezing. The requirement with pre-content events is that for every accessed file range an event covering at least this range will be generated at least once before the file data is accessed. In preparation for disabling pre-content event hooks on page faults, add pre-content hooks at mmap() variants for the entire mmaped range, so HSM can fill content when user requests to map a portion of the file. Note that exec() variant also calls vm_mmap_pgoff() internally to map code sections, so pre-content hooks are also generated in this case. Link: https://lore.kernel.org/linux-fsdevel/7ehxrhbvehlrjwvrduoxsao5k3x4aw275patsb3krkwuq573yv@o2hskrfawbnc/ Suggested-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250312073852.2123409-2-amir73il@gmail.com
2025-03-12	USB: serial: ftdi_sio: add support for Altera USB Blaster 3	Boon Khai Ng
	The Altera USB Blaster 3, available as both a cable and an on-board solution, is primarily used for programming and debugging FPGAs. It interfaces with host software such as Quartus Programmer, System Console, SignalTap, and Nios Debugger. The device utilizes either an FT2232 or FT4232 chip. Enable support for various configurations of the on-board USB Blaster 3 by including the appropriate VID/PID pairs, allowing it to function as a serial device via ftdi_sio. Note that this check-in does not include support for the cable solution, as it does not support UART functionality. The supported configurations are determined by the hardware design and include: 1) PID 0x6022, FT2232, 1 JTAG port (Port A) + Port B as UART 2) PID 0x6025, FT4232, 1 JTAG port (Port A) + Port C as UART 3) PID 0x6026, FT4232, 1 JTAG port (Port A) + Port C, D as UART 4) PID 0x6029, FT4232, 1 JTAG port (Port B) + Port C as UART 5) PID 0x602a, FT4232, 1 JTAG port (Port B) + Port C, D as UART 6) PID 0x602c, FT4232, 1 JTAG port (Port A) + Port B as UART 7) PID 0x602d, FT4232, 1 JTAG port (Port A) + Port B, C as UART 8) PID 0x602e, FT4232, 1 JTAG port (Port A) + Port B, C, D as UART These configurations allow for flexibility in how the USB Blaster 3 is used, depending on the specific needs of the hardware design. Signed-off-by: Boon Khai Ng <boon.khai.ng@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
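The eight configurations listed in the commit above can be written as a lookup table. This is only a sketch: the PIDs and port roles are taken from the commit message, while `struct toy_blaster_cfg`, the `PORT_*` bit masks and `toy_port_is_uart()` are hypothetical and do not reflect how ftdi_sio encodes this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per FTDI port (illustrative encoding). */
enum { PORT_A = 1 << 0, PORT_B = 1 << 1, PORT_C = 1 << 2, PORT_D = 1 << 3 };

struct toy_blaster_cfg {
	uint16_t pid;  /* USB product ID from the commit message */
	unsigned jtag; /* port used for JTAG */
	unsigned uart; /* mask of ports usable as UART */
};

static const struct toy_blaster_cfg toy_cfgs[] = {
	{ 0x6022, PORT_A, PORT_B },                   /* FT2232 */
	{ 0x6025, PORT_A, PORT_C },                   /* FT4232 */
	{ 0x6026, PORT_A, PORT_C | PORT_D },          /* FT4232 */
	{ 0x6029, PORT_B, PORT_C },                   /* FT4232 */
	{ 0x602a, PORT_B, PORT_C | PORT_D },          /* FT4232 */
	{ 0x602c, PORT_A, PORT_B },                   /* FT4232 */
	{ 0x602d, PORT_A, PORT_B | PORT_C },          /* FT4232 */
	{ 0x602e, PORT_A, PORT_B | PORT_C | PORT_D }, /* FT4232 */
};

/* Returns true if 'port' (a single PORT_* bit) is a UART for this PID. */
static bool toy_port_is_uart(uint16_t pid, unsigned port)
{
	size_t i;

	assert(port == PORT_A || port == PORT_B ||
	       port == PORT_C || port == PORT_D);
	for (i = 0; i < sizeof(toy_cfgs) / sizeof(toy_cfgs[0]); i++) {
		if (toy_cfgs[i].pid == pid)
			return (toy_cfgs[i].uart & port) != 0;
	}
	return false; /* unknown PID */
}
```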