linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-03-18	net: phylink: Use phy_caps to get an interface's capabilities and modes	Maxime Chevallier
	Phylink has internal code to get the MAC capabilities of a given PHY interface (what are the supported speed and duplex). Extract that into phy_caps, but use the link_capa for conversion. Add an internal phylink helper for the link caps -> mac caps conversion, and use this in phylink_caps_to_linkmodes(). Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-14-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phylink: Convert capabilities to linkmodes using phy_caps	Maxime Chevallier
	phylink_caps_to_linkmodes() is used to derive a list of linkmodes that can be conceivably exposed using a given set of speeds and duplex through phylink's MAC capabilities. This list can be derived from the link_caps array in phy_caps, provided we convert the MAC capabilities into a LINK_CAPA bitmask first. Introduce an internal phylink helper phylink_caps_to_link_caps() to convert from MAC capabilities into phy_caps, then phy_caps_linkmodes() to do the link_caps -> linkmodes conversion. This avoids having to update phylink for every new linkmode. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-13-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phylink: Add a mapping between MAC_CAPS and LINK_CAPS	Maxime Chevallier
	phylink allows MAC drivers to report the capabilities in terms of speed, duplex and pause support. This is done through a dedicated set of enum values in the form of the MAC_ capabilities. They are very close to what the LINK_CAPA_xxx can express, with the difference that LINK_CAPA don't have any information about Pause/Asym Pause support. To prepare converting phylink to using the phy_caps, add the mapping between MAC capabilities and phy_caps. While doing so, we move the phylink_caps_params array up a bit to simplify future commits. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-12-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: drop phy_settings and the associated lookup helpers	Maxime Chevallier
	The phy_settings array is no longer relevant as it has now been replaced by the link_caps array and associated phy_caps helpers. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-11-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phylink: Use phy_caps_lookup for fixed-link configuration	Maxime Chevallier
	When phylink creates a fixed-link configuration, it finds a matching linkmode to set as the advertised, lp_advertising and supported modes based on the speed and duplex of the fixed link. Use the newly introduced phy_caps_lookup to get these modes instead of phy_lookup_settings(). This has the side effect that the matched settings and configured linkmodes may now contain several linkmodes (the intersection of supported linkmodes from the phylink settings and the linkmodes that match speed/duplex) instead of the one from phy_lookup_settings(). Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-10-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_device: Use link_capabilities lookup for PHY aneg config	Maxime Chevallier
	When configuring PHY advertising with autoneg disabled, we lookd for an exact linkmode to advertise and configure for the requested Speed and Duplex, specially at or over 1G. Using phy_caps_lookup allows us to build a list of the supported linkmodes at that speed that we can advertise instead of the first mode that matches. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-9-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_caps: Allow looking-up link caps based on speed and duplex	Maxime Chevallier
	As the link_caps array is efficient for <speed,duplex> lookups, implement a function for speed/duplex lookups that matches a given mask. This replicates to some extent the phy_lookup_settings() behaviour, matching full link_capabilities instead of a single linkmode. phy.c's phy_santize_settings() and phylink's phylink_ethtool_ksettings_set() performs such lookup using the phy_settings table, but are only interested in the actual speed/duplex that were matched, rathet than the individual linkmode. Similar to phy_lookup_settings(), the newly introduced phy_caps_lookup() will run through the link_caps[] array by descending speed/duplex order. If the link_capabilities for a given <speed/duplex> tuple intersects the passed linkmodes, we consider that a match. Similar to phy_lookup_settings(), we also allow passing an 'exact' boolean, allowing non-exact match. Here, we MUST always match the linkmodes mask, but we allow matching on lower speed settings. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-8-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_caps: Implement link_capabilities lookup by linkmode	Maxime Chevallier
	In several occasions, phylib needs to lookup a set of matching speed and duplex against a given linkmode set. Instead of relying on the phy_settings array and thus iterate over the whole linkmodes list, use the link_capabilities array to lookup these matches, as we aren't interested in the actual link setting that matches but rather the speed and duplex for that setting. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-7-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_caps: Introduce phy_caps_valid	Maxime Chevallier
	With the link_capabilities array, it's trivial to validate a given mask againts a <speed, duplex> tuple. Create a helper for that purpose, and use it to replace a phy_settings lookup in phy_check_valid(); Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-6-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_caps: Move __set_linkmode_max_speed to phy_caps	Maxime Chevallier
	Convert the __set_linkmode_max_speed to use the link_capabilities array. This makes it easy to clamp the linkmodes to a given max speed. Introduce a new helper phy_caps_linkmode_max_speed to replace the previous one that used phy_settings. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-5-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: phy_caps: Move phy_speeds to phy_caps	Maxime Chevallier
	Use the newly introduced link_capabilities array to derive the list of possible speeds when given a combination of linkmodes. As link_capabilities is indexed by speed, we don't have to iterate the whole phy_settings array. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-4-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: phy: Use an internal, searchable storage for the linkmodes	Maxime Chevallier
	The canonical definition for all the link modes is in linux/ethtool.h, which is complemented by the link_mode_params array stored in net/ethtool/common.h . That array contains all the metadata about each of these modes, including the Speed and Duplex information. Phylib and phylink needs that information as well for internal management of the link, which was done by duplicating that information in locally-stored arrays and lookup functions. This makes it easy for developpers adding new modes to forget modifying phylib and phylink accordingly. However, the link_mode_params array in net/ethtool/common.c is fairly inefficient to search through, as it isn't sorted in any manner. Phylib and phylink perform a lot of lookup operations, mostly to filter modes by speed and/or duplex. We therefore introduce the link_caps private array in phy_caps.c, that indexes linkmodes in a more efficient manner. Each element associated a tuple <speed, duplex> to a bitfield of all the linkmodes runs at these speed/duplex. We end-up with an array that's fairly short, easily addressable and that it optimised for the typical use-cases of phylib/phylink. That array is initialized at the same time as phylib. As the link_mode_params array is part of the net stack, which phylink depends on, it should always be accessible from phylib. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-3-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	net: ethtool: Export the link_mode_params definitions	Maxime Chevallier
	link_mode_params contains a lookup table of all 802.3 link modes that are currently supported with structured data about each mode's speed, duplex, number of lanes and mediums. As a preparation for a port representation, export that table for the rest of the net stack to use. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-2-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18	efivarfs: fix NULL dereference on resume	James Bottomley
	LSMs often inspect the path.mnt of files in the security hooks, and this causes a NULL deref in efivarfs_pm_notify() because the path is constructed with a NULL path.mnt. Fix by obtaining from vfs_kern_mount() instead, and being very careful to ensure that deactivate_super() (potentially triggered by a racing userspace umount) is not called directly from the notifier, because it would deadlock when efivarfs_kill_sb() tried to unregister the notifier chain. [ Al notes: Umm... That's probably safe, but not as a long-term solution - it's too intimately dependent upon fs/super.c internals. The reasons why you can't run into ->s_umount deadlock here are non-trivial... ] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Link: https://lore.kernel.org/r/e54e6a2f-1178-4980-b771-4d9bafc2aa47@tnxip.de Link: https://lore.kernel.org/r/3e998bf87638a442cbc6864cdcd3d8d9e08ce3e3.camel@HansenPartnership.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-03-17	Merge tag 'mm-hotfixes-stable-2025-03-17-20-09' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes. 7 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 13 are for MM and the other two are for squashfs and procfs. All are singletons. Please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/page_alloc: fix memory accept before watermarks gets initialized mm: decline to manipulate the refcount on a slab page memcg: drain obj stock on cpu hotplug teardown mm/huge_memory: drop beyond-EOF folios with the right number of refs selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT mm: memcontrol: fix swap counter leak from offline cgroup mm/vma: do not register private-anon mappings with khugepaged during mmap squashfs: fix invalid pointer dereference in squashfs_cache_delete mm/migrate: fix shmem xarray update during migration mm/hugetlb: fix surplus pages in dissolve_free_huge_page() mm/damon/core: initialize damos->walk_completed in damon_new_scheme() mm/damon: respect core layer filters' allowance decision on ops layer filemap: move prefaulting out of hot write path proc: fix UAF in proc_get_inode()
2025-03-17	MAINTAINERS: Remove myself	Eric W. Biederman
	Unfortunately I no longer have time to meaningfully take part in the linux kernel development. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-17	Merge tag 'soc-fixes-6.14-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "The majority of these last fixes are for devicetree files. These address two important regressions for the Qualcomm SMMU and the Raspberry Pi 4 USB controller, as well as a larger number of patches fixing minor mistakes in board specific files for Rockchips, i.MX, starfive and broadcom. The non-DT changes are - A fix for an old boot regression on Renesas shmobile chips - Another boot time regression for for the Qualcomm PDR SoC driver, among a few other Qualcomm firmware driver fixes for efivars and tzmem - Minor Kconfig fixes for davinci and OMAP1 - Minor code fixes for sparx5 reset controllers, OMAP memory controller, i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC driver - One more update to the Asahi maintainers, adding Neal Gompa as a reviewer" * tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits) ARM: davinci: da850: fix selecting ARCH_DAVINCI_DA8XX soc: hisilicon: kunpeng_hccs: Fix incorrect string assembly memory: omap-gpmc: drop no compatible check reset: mchp: sparx5: Fix for lan966x ARM: shmobile: smp: Enforce shmobile_smp_* alignment MAINTAINERS: Add myself (Neal Gompa) as a reviewer for ARM Apple support MAINTAINERS: Add apple-spi driver & binding files arm64: dts: rockchip: slow down emmc freq for rock 5 itx ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC3200 ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC5300 ARM: dts: bcm2711: Don't mark timer regs unconfigured ARM: OMAP1: select CONFIG_GENERIC_IRQ_CHIP arm64: dts: rockchip: Add missing PCIe supplies to RockPro64 board dtsi arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsi arm64: dts: rockchip: Remove undocumented sdmmc property from lubancat-1 arm64: dts: rockchip: fix pinmux of UART5 for PX30 Ringneck on Haikou arm64: dts: rockchip: fix pinmux of UART0 for PX30 Ringneck on Haikou arm64: dts: rockchip: fix u2phy1_host status for NanoPi R4S arm64: dts: bcm2712: PL011 UARTs are actually r1p5 ARM: dts: bcm2711: PL011 UARTs are actually r1p5 ...
2025-03-17	Merge tag 'probes-fixes-v6.14-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Clean up tprobe correctly when module unload Tracepoint probes do not set TRACEPOINT_STUB on the 'tpoint' pointer when unloading a module, thus they show as a normal 'fprobe' instead of 'tprobe' and never come back - Fix leakage of tprobe module refcount When a tprobe's target module is loaded, it gets the module's refcount in the module notifier but forgot to put it after registering the probe on it. Fix it by getting the refcount only when registering tprobe. * tag 'probes-fixes-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: tprobe-events: Fix leakage of module refcount tracing: tprobe-events: Fix to clean up tprobe correctly when module unload
2025-03-17	Merge branch ↵	Paolo Abeni
	'net-stmmac-avoid-unnecessary-work-in-stmmac_release-stmmac_dvr_remove' Russell King says: ==================== net: stmmac: avoid unnecessary work in stmmac_release()/stmmac_dvr_remove() This small series is a subset of a RFC I sent earlier. These two patches remove code that is unnecessary and/or wrong in these paths. Details in each commit. ==================== Link: https://patch.msgid.link/Z87bpDd7QYYVU0ML@shell.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net: stmmac: remove unnecessary stmmac_mac_set() in stmmac_release()	Russell King (Oracle)
	stmmac_release() calls phylink_stop() and then goes on to call stmmac_mac_set(, false). However, phylink_stop() will call stmmac_mac_link_down() before returning, which will do this work. Remove this unnecessary call. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Furong Xu <0x1207@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1trcI6-005rn8-GV@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net: stmmac: remove redundant racy tear-down in stmmac_dvr_remove()	Russell King (Oracle)
	While the network device is registered, it is published to userspace, and thus userspace can change its state. This means calling functions such as stmmac_stop_all_dma() and stmmac_mac_set() are racy. Moreover, unregister_netdev() will unpublish the network device, and then if appropriate call the .ndo_stop() method, which is stmmac_release(). This will first call phylink_stop() which will synchronously take the link down, resulting in stmmac_mac_link_down() and stmmac_mac_set(, false) being called. stmmac_release() will also call stmmac_stop_all_dma(). Consequently, neither of these two functions need to called prior to unregister_netdev() as that will safely call paths that will result in this work being done if necessary. Remove these redundant racy calls. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Furong Xu <0x1207@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1trcI1-005rn2-CZ@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net: phylink: expand on .pcs_config() method documentation	Russell King (Oracle)
	Expand on the requirements of the .pcs_config() method documentation, specifically mentioning that it should cause minimal disruption to an established link, and that it should return a positive non-zero value when requiring the .pcs_an_restart() method to be called. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1trb24-005oVq-Is@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	cdc_ether\|r8152: ThinkPad Hybrid USB-C/A Dock quirk	Philipp Hahn
	Lenovo ThinkPad Hybrid USB-C with USB-A Dock (17ef:a359) is affected by the same problem as the Lenovo Powered USB-C Travel Hub (17ef:721e): Both are based on the Realtek RTL8153B chip used to use the cdc_ether driver. However, using this driver, with the system suspended the device constantly sends pause-frames as soon as the receive buffer fills up. This causes issues with other devices, where some Ethernet switches stop forwarding packets altogether. Using the Realtek driver (r8152) fixes this issue. Pause frames are no longer sent while the host system is suspended. Cc: Leon Schuermann <leon@is.currently.online> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Oliver Neukum <oliver@neukum.org> (maintainer:USB CDC ETHERNET DRIVER) Cc: netdev@vger.kernel.org (open list:NETWORKING DRIVERS) Link: https://git.kernel.org/netdev/net/c/cb82a54904a9 Link: https://git.kernel.org/netdev/net/c/2284bbd0cf39 Link: https://www.lenovo.com/de/de/p/accessories-and-software/docking/docking-usb-docks/40af0135eu Signed-off-by: Philipp Hahn <phahn-oss@avm.de> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/484336aad52d14ccf061b535bc19ef6396ef5120.1741601523.git.p.hahn@avm.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	stmmac: intel: Fix warning message for return value in ↵	Choong Yong Liang
	intel_tsn_lane_is_available() Fix the warning "warn: missing error code? 'ret'" in the intel_tsn_lane_is_available() function. The function now returns 0 to indicate that a TSN lane was found and returns -EINVAL when it is not found. Fixes: a42f6b3f1cc1 ("net: stmmac: configure SerDes according to the interface mode") Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250310050835.808870-1-yong.liang.choong@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	Merge branch 'net-phy-clean-up-phy-package-mmd-access-functions'	Paolo Abeni
	Heiner Kallweit says: ==================== net: phy: clean up PHY package MMD access functions Move declarations of the functions with users to phylib.h, and remove unused functions. ==================== Link: https://patch.msgid.link/b624fcb7-b493-461a-a0b5-9ca7e9d767bc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net: phy: remove unused functions phy_package_[read\|write]_mmd	Heiner Kallweit
	These functions have never had a user, so remove them. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/5792e2cd-6f0a-4f7d-a5ef-b932f94d82f3@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net: phy: move PHY package MMD access function declarations from phy.h to ↵	Heiner Kallweit
	phylib.h These functions are used by PHY drivers only, therefore move their declaration to phylib.h. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/406c8a20-b62e-4ee3-b174-b566724a0876@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	Merge branch 'mlx5-support-hws-flow-meter-sampler-actions-in-fs-core'	Paolo Abeni
	Tariq Toukan says: ==================== mlx5: Support HWS flow meter/sampler actions in FS core This series by Moshe adds support for flow meter and flow sampler HW Steering actions in FS core level. As these actions can be shared by multiple rules, these patches use refcounts to manage the HWS actions sharing in FS core level. ==================== Link: https://patch.msgid.link/1741543663-22123-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net/mlx5: fs, add support for dest flow sampler HWS action	Moshe Shemesh
	Add support for HW Steering action of flow sampler destination. For each flow sampler created cache the hws action by sampler id as a key. Hold refcount for each rule using the cached action. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1741543663-22123-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net/mlx5: fs, add support for flow meters HWS action	Moshe Shemesh
	Add support for HW Steering action of flow meter range. Flow meters range can use one HWS action for the whole range. Thus, share a cached HWS action among rules that use same flow meter object range. Hold refcount for each rule using the cached action. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1741543663-22123-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	net/mlx5: fs, add API for sharing HWS action by refcount	Moshe Shemesh
	Counters HWS actions are shared using refcount, to create action on demand by flow steering rule and destroy only when no rules are using the action. The method is extensible to other HWS action types, such as flow meter and sampler actions, in the downstream patches. Add an API to facilitate the reuse of get/put logic for HWS actions shared by refcount. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1741543663-22123-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-17	efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume	Ard Biesheuvel
	syzbot warns about a potential deadlock, but this is a false positive resulting from a missing lockdep annotation: iterate_dir() locks the parent whereas the inode_lock() it warns about locks the child, which is guaranteed to be a different lock. So use inode_lock_nested() instead with the appropriate lock class. Reported-by: syzbot+019072ad24ab1d948228@syzkaller.appspotmail.com Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-03-17	Merge branch 'tcp-accecn'	David S. Miller
	Chia-Yu Chang says: ==================== AccECN protocol preparation patch series Please find the v7 v7 (03-Mar-2025) - Move 2 new patches added in v6 to the next AccECN patch series v6 (27-Dec-2024) - Avoid removing removing the potential CA_ACK_WIN_UPDATE in ack_ev_flags of patch #1 (Eric Dumazet <edumazet@google.com>) - Add reviewed-by tag in patches #2, #3, #4, #5, #6, #7, #8, #12, #14 - Foloiwng 2 new pathces are added after patch #9 (Patch that adds SKB_GSO_TCP_ACCECN) * New patch #10 to replace exisiting SKB_GSO_TCP_ECN with SKB_GSO_TCP_ACCECN in the driver to avoid CWR flag corruption * New patch #11 adds AccECN for virtio by adding new negotiation flag (VIRTIO_NET_F_HOST/GUEST_ACCECN) in feature handshake and translating Accurate ECN GSO flag between virtio_net_hdr (VIRTIO_NET_HDR_GSO_ACCECN) and skb header (SKB_GSO_TCP_ACCECN) - Add detailed changelog and comments in #13 (Eric Dumazet <edumazet@google.com>) - Move patch #14 to the next AccECN patch series (Eric Dumazet <edumazet@google.com>) v5 (5-Nov-2024) - Add helper function "tcp_flags_ntohs" to preserve last 2 bytes of TCP flags of patch #4 (Paolo Abeni <pabeni@redhat.com>) - Fix reverse X-max tree order of patches #4, #11 (Paolo Abeni <pabeni@redhat.com>) - Rename variable "delta" as "timestamp_delta" of patch #2 fo clariety - Remove patch #14 in this series (Paolo Abeni <pabeni@redhat.com>, Joel Granados <joel.granados@kernel.org>) v4 (21-Oct-2024) - Fix line length warning of patches #2, #4, #8, #10, #11, #14 - Fix spaces preferred around '\|' (ctx:VxV) warning of patch #7 - Add missing CC'ed of patches #4, #12, #14 v3 (19-Oct-2024) - Fix build error in v2 v2 (18-Oct-2024) - Fix warning caused by NETIF_F_GSO_ACCECN_BIT in patch #9 (Jakub Kicinski <kuba@kernel.org>) The full patch series can be found in https://github.com/L4STeam/linux-net-next/commits/upstream_l4steam/ The Accurate ECN draft can be found in https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-accurate-ecn-28 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: Pass flags to __tcp_send_ack	Ilpo Järvinen
	Accurate ECN needs to send custom flags to handle IP-ECN field reflection during handshake. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: add new TCP_TW_ACK_OOW state and allow ECN bits in TOS	Ilpo Järvinen
	ECN bits in TOS are always cleared when sending in ACKs in TW. Clearing them is problematic for TCP flows that used Accurate ECN because ECN bits decide which service queue the packet is placed into (L4S vs Classic). Effectively, TW ACKs are always downgraded from L4S to Classic queue which might impact, e.g., delay the ACK will experience on the path compared with the other packets of the flow. Change the TW ACK sending code to differentiate: - In tcp_v4_send_reset(), commit ba9e04a7ddf4f ("ip: fix tos reflection in ack and reset packets") cleans ECN bits for TW reset and this is not affected. - In tcp_v4_timewait_ack(), ECN bits for all TW ACKs are cleaned. But now only ECN bits of ACKs for oow data or paws_reject are cleaned, and ECN bits of other ACKs will not be cleaned. - In tcp_v4_reqsk_send_ack(), commit 66b13d99d96a1 ("ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT") did not clean ECN bits of ACKs for oow data or paws_reject. But now the ECN bits rae cleaned for these ACKs. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: AccECN support to tcp_add_backlog	Ilpo Järvinen
	AE flag needs to be preserved for AccECN. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	gro: prevent ACE field corruption & better AccECN handling	Ilpo Järvinen
	There are important differences in how the CWR field behaves in RFC3168 and AccECN. With AccECN, CWR flag is part of the ACE counter and its changes are important so adjust the flags changed mask accordingly. Also, if CWR is there, set the Accurate ECN GSO flag to avoid corrupting CWR flag somewhere. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	gso: AccECN support	Ilpo Järvinen
	Handling the CWR flag differs between RFC 3168 ECN and AccECN. With RFC 3168 ECN aware TSO (NETIF_F_TSO_ECN) CWR flag is cleared starting from 2nd segment which is incompatible how AccECN handles the CWR flag. Such super-segments are indicated by SKB_GSO_TCP_ECN. With AccECN, CWR flag (or more accurately, the ACE field that also includes ECE & AE flags) changes only when new packet(s) with CE mark arrives so the flag should not be changed within a super-skb. The new skb/feature flags are necessary to prevent such TSO engines corrupting AccECN ACE counters by clearing the CWR flag (if the CWR handling feature cannot be turned off). If NIC is completely unaware of RFC3168 ECN (doesn't support NETIF_F_TSO_ECN) or its TSO engine can be set to not touch CWR flag despite supporting also NETIF_F_TSO_ECN, TSO could be safely used with AccECN on such NIC. This should be evaluated per NIC basis (not done in this patch series for any NICs). For the cases, where TSO cannot keep its hands off the CWR flag, a GSO fallback is provided by this patch. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: helpers for ECN mode handling	Ilpo Järvinen
	Create helpers for TCP ECN modes. No functional changes. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: rework {__,}tcp_ecn_check_ce() -> tcp_data_ecn_check()	Ilpo Järvinen
	Rename tcp_ecn_check_ce to tcp_data_ecn_check as it is called only for data segments, not for ACKs (with AccECN, also ACKs may get ECN bits). The extra "layer" in tcp_ecn_check_ce() function just checks for ECN being enabled, that can be moved into tcp_ecn_field_check rather than having the __ variant. No functional changes. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: extend TCP flags to allow AE bit/ACE field	Ilpo Järvinen
	With AccECN, there's one additional TCP flag to be used (AE) and ACE field that overloads the definition of AE, CWR, and ECE flags. As tcp_flags was previously only 1 byte, the byte-order stuff needs to be added to it's handling. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: use BIT() macro in include/net/tcp.h	Chia-Yu Chang
	Use BIT() macro for TCP flags field and TCP congestion control flags that will be used by the congestion control algorithm. No functional changes. Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Ilpo Järvinen <ij@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: create FLAG_TS_PROGRESS	Ilpo Järvinen
	Whenever timestamp advances, it declares progress which can be used by the other parts of the stack to decide that the ACK is the most recent one seen so far. AccECN will use this flag when deciding whether to use the ACK to update AccECN state or not. Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	tcp: reorganize tcp_in_ack_event() and tcp_count_delivered()	Ilpo Järvinen
	- Move tcp_count_delivered() earlier and split tcp_count_delivered_ce() out of it - Move tcp_in_ack_event() later - While at it, remove the inline from tcp_in_ack_event() and let the compiler to decide Accurate ECN's heuristics does not know if there is going to be ACE field based CE counter increase or not until after rtx queue has been processed. Only then the number of ACKed bytes/pkts is available. As CE or not affects presence of FLAG_ECE, that information for tcp_in_ack_event is not yet available in the old location of the call to tcp_in_ack_event(). Signed-off-by: Ilpo Järvinen <ij@kernel.org> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17	hwmon: (nct6775-core) Fix out of bounds access for NCT679{8,9}	Tasos Sahanidis
	pwm_num is set to 7 for these chips, but NCT6776_REG_PWM_MODE and NCT6776_PWM_MODE_MASK only contain 6 values. Fix this by adding another 0 to the end of each array. Signed-off-by: Tasos Sahanidis <tasos@tasossah.com> Link: https://lore.kernel.org/r/20250312030832.106475-1-tasos@tasossah.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-03-17	MAINTAINERS: correct list and scope of LTC4286 HARDWARE MONITOR	Wolfram Sang
	This entry has a wrong list, i2c instead of hwmon. Also, it states to maintain Kconfig and Makefile which is not suitable for a single driver. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Fixes: 7707cf82e138 ("dt-bindings: hwmon: Add lltc ltc4286 driver bindings") Link: https://lore.kernel.org/r/20250317091459.41462-2-wsa+renesas@sang-engineering.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-03-17	mmc: sdhci-brcmstb: add cqhci suspend/resume to PM ops	Kamal Dasu
	cqhci timeouts observed on brcmstb platforms during suspend: ... [ 164.832853] mmc0: cqhci: timeout for tag 18 ... Adding cqhci_suspend()/resume() calls to disable cqe in sdhci_brcmstb_suspend()/resume() respectively to fix CQE timeouts seen on PM suspend. Fixes: d46ba2d17f90 ("mmc: sdhci-brcmstb: Add support for Command Queuing (CQE)") Cc: stable@vger.kernel.org Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20250311165946.28190-1-kamal.dasu@broadcom.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-03-16	mm/page_alloc: fix memory accept before watermarks gets initialized	Kirill A. Shutemov
	Watermarks are initialized during the postcore initcall. Until then, all watermarks are set to zero. This causes cond_accept_memory() to incorrectly skip memory acceptance because a watermark of 0 is always met. This can lead to a premature OOM on boot. To ensure progress, accept one MAX_ORDER page if the watermark is zero. Link: https://lkml.kernel.org/r/20250310082855.2587122-1-kirill.shutemov@linux.intel.com Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: Farrah Chen <farrah.chen@intel.com> Reported-by: Farrah Chen <farrah.chen@intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: <stable@vger.kernel.org> [6.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16	mm: decline to manipulate the refcount on a slab page	Matthew Wilcox (Oracle)
	Slab pages now have a refcount of 0, so nobody should be trying to manipulate the refcount on them. Doing so has little effect; the object could be freed and reallocated to a different purpose, although the slab itself would not be until the refcount was put making it behave rather like TYPESAFE_BY_RCU. Unfortunately, __iov_iter_get_pages_alloc() does take a refcount. Fix that to not change the refcount, and make put_page() silently not change the refcount. get_page() warns so that we can fix any other callers that need to be changed. Long-term, networking needs to stop taking a refcount on the pages that it uses and rely on the caller to hold whatever references are necessary to make the memory stable. In the medium term, more page types are going to hav a zero refcount, so we'll want to move get_page() and put_page() out of line. Link: https://lkml.kernel.org/r/20250310143544.1216127-1-willy@infradead.org Fixes: 9aec2fb0fd5e (slab: allocate frozen pages) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Hannes Reinecke <hare@suse.de> Closes: https://lore.kernel.org/all/08c29e4b-2f71-4b6d-8046-27e407214d8c@suse.com/ Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16	memcg: drain obj stock on cpu hotplug teardown	Shakeel Butt
	Currently on cpu hotplug teardown, only memcg stock is drained but we need to drain the obj stock as well otherwise we will miss the stats accumulated on the target cpu as well as the nr_bytes cached. The stats include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In addition we are leaking reference to struct obj_cgroup object. Link: https://lkml.kernel.org/r/20250310230934.2913113-1-shakeel.butt@linux.dev Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>