summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-09net: dwmac-rk: Use helper rgmii_clockJan Petrous (OSS)
Utilize a new helper function rgmii_clock(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-8-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: dwmac-intel-plat: Use helper rgmii_clockJan Petrous (OSS)
Utilize a new helper function rgmii_clock(). When in, remove dead code in kmb_eth_fix_mac_speed(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-7-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: dwmac-imx: Use helper rgmii_clockJan Petrous (OSS)
Utilize a new helper function rgmii_clock(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-6-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: dwmac-dwc-qos-eth: Use helper rgmii_clockJan Petrous (OSS)
Utilize a new helper function rgmii_clock(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-5-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: phy: Add helper for mapping RGMII link speed to clock rateJan Petrous (OSS)
The RGMII interface supports three data rates: 10/100 Mbps and 1 Gbps. These speeds correspond to clock frequencies of 2.5/25 MHz and 125 MHz, respectively. Many Ethernet drivers, including glues in stmmac, follow a similar pattern of converting RGMII speed to clock frequency. To simplify code, define the helper rgmii_clock(speed) to convert connection speed to clock frequency. Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-4-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: stmmac: Fix clock rate variables sizeJan Petrous (OSS)
The clock API clk_get_rate() returns unsigned long value. Expand affected members of stmmac platform data and convert the stmmac_clk_csr_set() and dwmac4_core_init() methods to defining the unsigned long clk_rate local variables. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-3-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: stmmac: Extend CSR calc supportJan Petrous (OSS)
Add support for CSR clock range up to 800 MHz. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-2-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: stmmac: Fix CSR divider commentJan Petrous (OSS)
The comment in declaration of STMMAC_CSR_250_300M incorrectly describes the constant as '/* MDC = clk_scr_i/122 */' but the DWC Ether QOS Handbook version 5.20a says it is CSR clock/124. Signed-off-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20241205-upstream_s32cc_gmac-v8-1-ec1d180df815@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: renesas: rswitch: remove speed from gwca structureNikita Yushchenko
This field is set but never used. GWCA is rswitch CPU interface module which connects rswitch to the host over AXI bus. Speed of the switch ports is not anyhow related to GWCA operation. Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20241206192140.1714-2-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: renesas: rswitch: do not deinit disabled portsNikita Yushchenko
In rswitch_ether_port_init_all(), only enabled ports are initialized. Then, rswitch_ether_port_deinit_all() shall also only deinitialize enabled ports. Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20241206192140.1714-1-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09Merge branch 'qca_spi-fix-spi-specific-issues'Jakub Kicinski
Stefan Wahren says: ==================== qca_spi: Fix SPI specific issues This small series address two annoying SPI specific issues of the qca_spi driver. ==================== Link: https://patch.msgid.link/20241206184643.123399-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09qca_spi: Make driver probing reliableStefan Wahren
The module parameter qcaspi_pluggable controls if QCA7000 signature should be checked at driver probe (current default) or not. Unfortunately this could fail in case the chip is temporary in reset, which isn't under total control by the Linux host. So disable this check per default in order to avoid unexpected probe failures. Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241206184643.123399-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09qca_spi: Fix clock speed for multiple QCA7000Stefan Wahren
Storing the maximum clock speed in module parameter qcaspi_clkspeed has the unintended side effect that the first probed instance defines the value for all other instances. Fix this issue by storing it in max_speed_hz of the relevant SPI device. This fix keeps the priority of the speed parameter (module parameter, device tree property, driver default). Btw this uses the opportunity to get the rid of the unused member clkspeed. Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241206184643.123399-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-10power: supply: cros_charge-control: hide start threshold on v2 cmdThomas Weißschuh
ECs implementing the v2 command will not stop charging when the end threshold is reached. Instead they will begin discharging until the start threshold is reached, leading to permanent charge and discharge cycles. This defeats the point of the charge control mechanism. Avoid the issue by hiding the start threshold on v2 systems. Instead on those systems program the EC with start == end which forces the EC to reach and stay at that level. v1 does not support thresholds and v3 works correctly, at least judging from the code. Reported-by: Thomas Koch <linrunner@gmx.net> Fixes: c6ed48ef5259 ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-3-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-10power: supply: cros_charge-control: allow start_threshold == end_thresholdThomas Weißschuh
Allow setting the start and stop thresholds to the same value. There is no reason to disallow it. Suggested-by: Thomas Koch <linrunner@gmx.net> Fixes: c6ed48ef5259 ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-2-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-10power: supply: cros_charge-control: add mutex for driver dataThomas Weißschuh
Concurrent accesses through sysfs may lead to inconsistent state in the priv data. Introduce a mutex to avoid this. Fixes: c6ed48ef5259 ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-1-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-10power: supply: gpio-charger: Fix set charge current limitsDimitri Fedrau
Fix set charge current limits for devices which allow to set the lowest charge current limit to be greater zero. If requested charge current limit is below lowest limit, the index equals current_limit_map_size which leads to accessing memory beyond allocated memory. Fixes: be2919d8355e ("power: supply: gpio-charger: add charge-current-limit feature") Cc: stable@vger.kernel.org Signed-off-by: Dimitri Fedrau <dimitri.fedrau@liebherr.com> Link: https://lore.kernel.org/r/20241209-fix-charge-current-limit-v1-1-760d9b8f2af3@liebherr.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-09octeon_ep: add ndo ops for VFs in PF driverShinas Rasheed
These APIs are needed to support applications that use netlink to get VF information from a PF driver. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Link: https://patch.msgid.link/20241206064135.2331790-1-srasheed@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09cxgb4: use port number to set mac addrAnumula Murali Mohan Reddy
t4_set_vf_mac_acl() uses pf to set mac addr, but t4vf_get_vf_mac_acl() uses port number to get mac addr, this leads to error when an attempt to set MAC address on VF's of PF2 and PF3. This patch fixes the issue by using port number to set mac address. Fixes: e0cdac65ba26 ("cxgb4vf: configure ports accessible by the VF") Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241206062014.49414-1-anumula@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-10rust: kbuild: set `bindgen`'s Rust target versionMiguel Ojeda
Each `bindgen` release may upgrade the list of Rust targets. For instance, currently, in their master branch [1], the latest ones are: Nightly => { vectorcall_abi: #124485, ptr_metadata: #81513, layout_for_ptr: #69835, }, Stable_1_77(77) => { offset_of: #106655 }, Stable_1_73(73) => { thiscall_abi: #42202 }, Stable_1_71(71) => { c_unwind_abi: #106075 }, Stable_1_68(68) => { abi_efiapi: #105795 }, By default, the highest stable release in their list is used, and users are expected to set one if they need to support older Rust versions (e.g. see [2]). Thus, over time, new Rust features are used by default, and at some point, it is likely that `bindgen` will emit Rust code that requires a Rust version higher than our minimum (or perhaps enabling an unstable feature). Currently, there is no problem because the maximum they have, as seen above, is Rust 1.77.0, and our current minimum is Rust 1.78.0. Therefore, set a Rust target explicitly now to prevent going forward in time too much and thus getting potential build failures at some point. Since we also support a minimum `bindgen` version, and since `bindgen` does not support passing unknown Rust target versions, we need to use the list of our minimum `bindgen` version, rather than the latest. So, since `bindgen` 0.65.1 had this list [3], we need to use Rust 1.68.0: /// Rust stable 1.64 /// * `core_ffi_c` ([Tracking issue](https://github.com/rust-lang/rust/issues/94501)) => Stable_1_64 => 1.64; /// Rust stable 1.68 /// * `abi_efiapi` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/65815)) => Stable_1_68 => 1.68; /// Nightly rust /// * `thiscall` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/42202)) /// * `vectorcall` calling convention (no tracking issue) /// * `c_unwind` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/74990)) => Nightly => nightly; ... /// Latest stable release of Rust pub const LATEST_STABLE_RUST: RustTarget = RustTarget::Stable_1_68; Thus add the `--rust-target 1.68` parameter. Add a comment as well explaining this. An alternative would be to use the currently running (i.e. actual) `rustc` and `bindgen` versions to pick a "better" Rust target version. However, that would introduce more moving parts depending on the user setup and is also more complex to implement. Starting with `bindgen` 0.71.0 [4], we will be able to set any future Rust version instead, i.e. we will be able to set here our minimum supported Rust version. Christian implemented it [5] after seeing this patch. Thanks! Cc: Christian Poveda <git@pvdrz.com> Cc: Emilio Cobos Álvarez <emilio@crisal.io> Cc: stable@vger.kernel.org # needed for 6.12.y; unneeded for 6.6.y; do not apply to 6.1.y Fixes: c844fa64a2d4 ("rust: start supporting several `bindgen` versions") Link: https://github.com/rust-lang/rust-bindgen/blob/21c60f473f4e824d4aa9b2b508056320d474b110/bindgen/features.rs#L97-L105 [1] Link: https://github.com/rust-lang/rust-bindgen/issues/2960 [2] Link: https://github.com/rust-lang/rust-bindgen/blob/7d243056d335fdc4537f7bca73c06d01aae24ddc/bindgen/features.rs#L131-L150 [3] Link: https://github.com/rust-lang/rust-bindgen/blob/main/CHANGELOG.md#0710-2024-12-06 [4] Link: https://github.com/rust-lang/rust-bindgen/pull/2993 [5] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20241123180323.255997-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-12-09iommu/tegra241-cmdqv: do not use smp_processor_id in preemptible contextLuis Claudio R. Goncalves
During boot some of the calls to tegra241_cmdqv_get_cmdq() will happen in preemptible context. As this function calls smp_processor_id(), if CONFIG_DEBUG_PREEMPT is enabled, these calls will trigger a series of "BUG: using smp_processor_id() in preemptible" backtraces. As tegra241_cmdqv_get_cmdq() only calls smp_processor_id() to use the CPU number as a factor to balance out traffic on cmdq usage, it is safe to use raw_smp_processor_id() here. Cc: <stable@vger.kernel.org> Fixes: 918eb5c856f6 ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV") Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/Z1L1mja3nXzsJ0Pk@uudg.org Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drm/panic: remove spurious empty line to clean warningMiguel Ojeda
Clippy in the upcoming Rust 1.83.0 spots a spurious empty line since the `clippy::empty_line_after_doc_comments` warning is now enabled by default given it is part of the `suspicious` group [1]: error: empty line after doc comment --> drivers/gpu/drm/drm_panic_qr.rs:931:1 | 931 | / /// They must remain valid for the duration of the function call. 932 | | | |_ 933 | #[no_mangle] 934 | / pub unsafe extern "C" fn drm_panic_qr_generate( 935 | | url: *const i8, 936 | | data: *mut u8, 937 | | data_len: usize, ... | 940 | | tmp_size: usize, 941 | | ) -> u8 { | |_______- the comment documents this function | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it Thus remove the empty line. Cc: stable@vger.kernel.org Fixes: cb5164ac43d0 ("drm/panic: Add a QR code panic screen") Link: https://github.com/rust-lang/rust-clippy/pull/13091 [1] Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20241125233332.697497-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-12-09perf test hwmon_pmu: Fix event file locationIan Rogers
The temp directory is made and a known fake hwmon PMU created within it. Prior to this fix the events were being incorrectly written to the temp directory rather than the fake PMU directory. This didn't impact the test as the directory fd matched the wrong location, but it doesn't mirror what a hwmon PMU would actually look like. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09perf hwmon_pmu: Use openat rather than dup to refresh directoryIan Rogers
The hwmon PMU test will make a temp directory, open the directory with O_DIRECTORY then fill it with contents. As the open is before the filling the contents the later fdopendir may reflect the initial empty state, meaning no events are seen. Change to re-open the directory, rather than dup the fd, so the latest contents are seen. Minor tweaks/additions to debug messages. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09Merge branch 'vxlan-support-user-defined-reserved-bits'Jakub Kicinski
Petr Machata says: ==================== vxlan: Support user-defined reserved bits Currently the VXLAN header validation works by vxlan_rcv() going feature by feature, each feature clearing the bits that it consumes. If anything is left unparsed at the end, the packet is rejected. Unfortunately there are machines out there that send VXLAN packets with reserved bits set, even if they are configured to not use the corresponding features. One such report is here[1], and we have heard similar complaints from our customers as well. This patchset adds an attribute that makes it configurable which bits the user wishes to tolerate and which they consider reserved. This was recommended in [1] as well. A knob like that inevitably allows users to set as reserved bits that are in fact required for the features enabled by the netdevice, such as GPE. This is detected, and such configurations are rejected. In patches #1..#7, the reserved bits validation code is gradually moved away from the unparsed approach described above, to one where a given set of valid bits is precomputed and then the packet is validated against that. In patch #8, this precomputed set is made configurable through a new attribute IFLA_VXLAN_RESERVED_BITS. Patches #9 and #10 massage the testsuite a bit, so that patch #11 can introduce a selftest for the resreved bits feature. The corresponding iproute2 support is available in [2]. [1] https://lore.kernel.org/netdev/db8b9e19-ad75-44d3-bfb2-46590d426ff5@proxmox.com/ [2] https://github.com/pmachata/iproute2/commits/vxlan_reserved_bits/ ==================== Link: https://patch.msgid.link/cover.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09selftests: forwarding: Add a selftest for the new reserved_bits UAPIPetr Machata
Run VXLAN packets through a gateway. Flip individual bits of the packet and/or reserved bits of the gateway, and check that the gateway treats the packets as expected. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/388bef3c30ebc887d4e64cd86a362e2df2f2d2e1.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09selftests: net: lib: Add several autodefer helpersPetr Machata
Add ip_link_set_addr(), ip_link_set_up(), ip_addr_add() and ip_route_add() to the suite of helpers that automatically schedule a corresponding cleanup. When setting a new MAC, one needs to remember the old address first. Move mac_get() from forwarding/ to that end. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/add6bcbe30828fd01363266df20c338cf13aaf25.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09selftests: net: lib: Rename ip_link_master() to ip_link_set_master()Petr Machata
Let's have a verb in that function name to make it clearer what's going on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/fbf7c53a429b340b9cff5831280ea8c305a224f9.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: Add an attribute to make VXLAN header validation configurablePetr Machata
The set of bits that the VXLAN netdevice currently considers reserved is defined by the features enabled at the netdevice construction. In order to make this configurable, add an attribute, IFLA_VXLAN_RESERVED_BITS. The payload is a pair of big-endian u32's covering the VXLAN header. This is validated against the set of flags used by the various enabled VXLAN features, and attempts to override bits used by an enabled feature are bounced. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/c657275e5ceed301e62c69fe8e559e32909442e2.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: vxlan_rcv(): Drop unparsedPetr Machata
The code currently validates the VXLAN header in two ways: first by comparing it with the set of reserved bits, constructed ahead of time during the netdevice construction; and second by gradually clearing the bits off a separate copy of VXLAN header, "unparsed". Drop the latter validation method. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/4559f16c5664c189b3a4ee6f5da91f552ad4821c.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: Bump error counters for header mismatchesPetr Machata
The VXLAN driver so far has not increased the error counters for packets that set reserved bits. It does so for other packet errors, so do it for this case as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/d096084167d56706d620afe5136cf37a2d34d1b9.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: Track reserved bits explicitly as part of the configurationPetr Machata
In order to make it possible to configure which bits in VXLAN header should be considered reserved, introduce a new field vxlan_config::reserved_bits. Have it cover the whole header, except for the VNI-present bit and the bits for VNI itself, and have individual enabled features clear more bits off reserved_bits. (This is expressed as first constructing a used_bits set, and then inverting it to get the reserved_bits. The set of used_bits will be useful on its own for validation of user-set reserved_bits in a following patch.) The patch also moves a comment relevant to the validation from the unparsed validation site up to the new site. Logically this patch should add the new comment, and a later patch that removes the unparsed bits would remove the old comment. But keeping both legs in the same patch is better from the history spelunking point of view. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/984dbf98d5940d3900268dbffaf70961f731d4a4.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: vxlan_rcv(): Extract vxlan_hdr(skb) to a named variablePetr Machata
Having a named reference to the VXLAN header is more handy than having to conjure it anew through vxlan_hdr() on every use. Add a new variable and convert several open-coded sites. Additionally, convert one "unparsed" use to the new variable as well. Thus the only "unparsed" uses that remain are the flag-clearing and the header validity check at the end. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/2a0a940e883c435a0fdbcdc1d03c4858957ad00e.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: vxlan_rcv() callees: Drop the unparsed argumentPetr Machata
The functions vxlan_remcsum() and vxlan_parse_gbp_hdr() take both the SKB and the unparsed VXLAN header. Now that unparsed adjustment is handled directly by vxlan_rcv(), drop this argument, and have the function derive it from the SKB on its own. vxlan_parse_gpe_proto() does not take SKB, so keep the header parameter. However const it so that it's clear that the intention is that it does not get changed. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/5ea651f4e06485ba1a84a8eb556a457c39f0dfd4.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: vxlan_rcv() callees: Move clearing of unparsed flags outPetr Machata
In order to migrate away from the use of unparsed to detect invalid flags, move all the code that actually clears the flags from callees directly to vxlan_rcv(). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/2857871d929375c881b9defe378473c8200ead9b.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09vxlan: In vxlan_rcv(), access flags through the vxlan netdevicePetr Machata
vxlan_sock.flags is constructed from vxlan_dev.cfg.flags, as the subset of flags (named VXLAN_F_RCV_FLAGS) that is important from the point of view of socket sharing. Attempts to reconfigure these flags during the vxlan netdev lifetime are also bounced. It is therefore immaterial whether we access the flags through the vxlan_dev or through the socket. Convert the socket accesses to netdevice accesses in this separate patch to make the conversions that take place in the following patches more obvious. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/5d237ffd731055e524d7b7c436de43358d8743d2.1733412063.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: reformat kdoc return statementsJakub Kicinski
kernel-doc -Wall warns about missing Return: statement for non-void functions. We have a number of kdocs in our headers which are missing the colon, IOW they use * Return some value or * Returns some value Having the colon makes some sense, it should help kdoc parser avoid false positives. So add them. This is mostly done with a sed script, and removing the unnecessary cases (mostly the comments which aren't kdoc). Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20241205165914.1071102-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09net: dsa: microchip: Make MDIO bus name uniqueJesse Van Gavere
In configurations with 2 or more DSA clusters it will fail to allocate unique MDIO bus names as only the switch ID is used, fix this by using a combination of the tree ID and switch ID when needed Signed-off-by: Jesse Van Gavere <jesse.vangavere@scioteq.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241206204202.649912-1-jesse.vangavere@scioteq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09mctp: no longer rely on net->dev_index_head[]Eric Dumazet
mctp_dump_addrinfo() is one of the last users of net->dev_index_head[] in the control path. Switch to for_each_netdev_dump() for better scalability. Use C99 for mctp_device_rtnl_msg_handlers[] to prepare future RTNL removal from mctp_dump_addrinfo() (mdev->addrs is not yet RCU protected) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20241206223811.1343076-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09perf ftrace: Fix undefined behavior in cmp_profile_data()Kuan-Wei Chiu
The comparison function cmp_profile_data() violates the C standard's requirements for qsort() comparison functions, which mandate symmetry and transitivity: * Symmetry: If x < y, then y > x. * Transitivity: If x < y and y < z, then x < z. When v1 and v2 are equal, the function incorrectly returns 1, breaking symmetry and transitivity. This causes undefined behavior, which can lead to memory corruption in certain versions of glibc [1]. Fix the issue by returning 0 when v1 and v2 are equal, ensuring compliance with the C standard and preventing undefined behavior. Link: https://www.qualys.com/2024/01/30/qsort.txt [1] Fixes: 0f223813edd0 ("perf ftrace: Add 'profile' command") Fixes: 74ae366c37b7 ("perf ftrace profile: Add -s/--sort option") Cc: stable@vger.kernel.org Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: jserv@ccns.ncku.edu.tw Cc: chuang@cs.nycu.edu.tw Link: https://lore.kernel.org/r/20241209134226.1939163-1-visitorckw@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09Merge branch 'rxrpc-implement-jumbo-data-transmission-and-rack-tlp'Jakub Kicinski
David Howells says: ==================== rxrpc: Implement jumbo DATA transmission and RACK-TLP Here's a series of patches to implement two main features: (1) The transmission of jumbo data packets whereby several DATA packets of a particular size can be glued together into a single UDP packet, allowing us to make use of larger MTU sizes. The basic jumbo subpacket capacity is 1412 bytes (RXRPC_JUMBO_DATALEN) and, say, an MTU of 8192 allows five of them to be transmitted as one. An alternative (and possibly more efficient way) would be to expand/shrink the capacity of each DATA packet to match the MTU and thus save on header and tail-gap overhead, but the Rx protocol does not provide a mechanism for splitting the data - especially as the transported data is encrypted per-packet - and so UDP fragmentation would be the only way to handle this. In fact, in the future, AF_RXRPC also needs to look at shrinking the packet size where the MTU is smaller - for instance in the case of being carried by IPv6 over wifi where there isn't capacity for a 1412 byte capacity. (2) RACK-TLP to manage packet loss and retransmission in conjunction with the congestion control algorithm. These allow for better data throughput and work towards being able to have larger transmission windows. To this end, the following changes are also made: (1) Use a single large array of kvec structs for the I/O thread rather than having one per transmission buffer. We need a much bigger collection of kvecs for ping padding (2) Implement path-MTU probing by sending padded PING ACK packets and monitoring for PING RESPONSE ACKs. The pmtud value determined is used to configure the construction of jumbo DATA packets. (3) The transmission queue is changed from a linked list of transmission buffer structs to a linked list of transmission-queue structs, each of which points to either 32 or 64 transmission buffers (depending on cpu word size) and various bits of metadata are concentrated in the queue structs rather than the buffers to make better use of the cpu cache. (4) SACK data is stored in the transmission-queue structures in batches of 32 or 64 making it faster to process rather than being spread amongst all the individual packet buffers. (5) Don't change the DF flag on the UDP socket unless we need to - and basically only enable it for path-MTU probing. There are also some additional bits: (1) Fix the handling of connection aborts to poke the aborted connections. (2) Don't set the MORE-PACKETS Rx header flag on the wire. No one actually checks it and it is, in any case, generated inconsistently between implementations. (3) Request an ACK when, during call transmission, there's a stall in the app generating the data to be transmitted. (4) Fix attention starvation in the I/O thread by making sure we go through all outstanding events rather than returning to the beginning of the check cycle after any time we process an event. (5) Don't use the skbuff timestamp in the calculation of timeouts and RTT as we really should include local processing time in that too. Further, getting receive skbuff timestamps may be expensive. (6) Make RTT tracking per call with the saving of the value between calls, even within the same connection channel. The initial call timeout starts off large to allow the server time to set up its state before the initial reply. (7) Don't allocate txbuf structs for ACK packets, but rather use page frags and MSG_SPLICE_PAGES. (8) Use irq-disabling locks for interactions between app threads and I/O threads so that the I/O thread doesn't get help up. (9) Make rxrpc set the REQUEST-ACK flag on an outgoing packet when cwnd is at RXRPC_MIN_CWND (currently 4), not at 2 which it can never reach. (10) Add some tracing bits and pieces (including displaying the userStatus field in an ACK header) and some more stats counters (including different sizes of jumbo packets sent/received). Link: https://lore.kernel.org/r/20240306000655.1100294-1-dhowells@redhat.com/ [1] ==================== Link: https://patch.msgid.link/20241204074710.990092-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Implement RACK/TLP to deal with transmission stalls [RFC8985]David Howells
When an rxrpc call is in its transmission phase and is sending a lot of packets, stalls occasionally occur that cause severe performance degradation (eg. increasing the transmission time for a 256MiB payload from 0.7s to 2.5s over a 10G link). rxrpc already implements TCP-style congestion control [RFC5681] and this helps mitigate the effects, but occasionally we're missing a time event that deals with a missing ACK, leading to a stall until the RTO expires. Fix this by implementing RACK/TLP in rxrpc. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Fix request for an ACK when cwnd is minimumDavid Howells
rxrpc_prepare_data_subpacket() sets the REQUEST-ACK flag on the outgoing DATA packet under a number of circumstances, including, theoretically, when the cwnd is at minimum (or less). However, the minimum in this function is hard-coded as 2, but the actual minimum is RXRPC_MIN_CWND (which is currently 4) and so this never occurs. Without this, we will miss the request of some ACKs, potentially leading to a transmission stall until a timeout occurs on one side or the other that leads to an ACK being generated. Fix the function to use RXRPC_MIN_CWND rather than a hard-coded number. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Manage RTT per-call rather than per-peerDavid Howells
Manage the determination of RTT on a per-call (ie. per-RPC op) basis rather than on a per-peer basis, averaging across all calls going to that peer. The problem is that the RTT measurements from the initial packets on a call may be off because the server may do some setting up (such as getting a lock on a file) before accepting the rest of the data in the RPC and, further, the RTT may be affected by server-side file operations, for instance if a large amount of data is being written or read. Note: When handling the FS.StoreData-type RPCs, for example, the server uses the userStatus field in the header of ACK packets as supplementary flow control to aid in managing this. AF_RXRPC does not yet support this, but it should be added. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Add a reason indicator to the tx_ack tracepointDavid Howells
Record the reason for the transmission of an ACK in the rxrpc_tx_ack tracepoint, and not just in the rxrpc_propose_ack tracepoint. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Add a reason indicator to the tx_data tracepointDavid Howells
Add an indicator to the rxrpc_tx_data tracepoint to indicate what triggered the transmission of a particular packet. At this point, it's only normal transmission and retransmission, plus the tracepoint is also used to record loss injection, but in a future patch, TLP-induced (re-)transmission will also be a thing. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Tidy up the ACK parsing a bitDavid Howells
Tidy up the ACK parsing in the following ways: (1) Put the serial number of the ACK packet into the rxrpc_ack_summary struct and access it from there whilst parsing an ACK. (2) Be consistent about using "if (summary.acked_serial)" rather than "if (summary.acked_serial != 0)". Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Use irq-disabling spinlocks between app and I/O threadDavid Howells
Where a spinlock is used by both the application thread and the I/O thread, use irq-disabling locking so that an interrupt taken on the app thread doesn't also slow down the I/O thread. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Don't allocate a txbuf for an ACK transmissionDavid Howells
Don't allocate an rxrpc_txbuf struct for an ACK transmission. There's now no need as the memory to hold the ACK content is allocated with a page frag allocator. The allocation and freeing of a txbuf is just unnecessary overhead. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-09rxrpc: Send jumbo DATA packetsDavid Howells
Send jumbo DATA packets if the path-MTU probing using padded PING ACK packets shows up sufficient capacity to do so. This allows larger chunks of data to be sent without reducing the retryability as the subpackets in a jumbo packet can also be retransmitted individually. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>