summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2024-12-06bnxt_en: Fix potential crash when dumping FW log coredumpHongguang Gao
If the FW log context memory is retained after FW reset, the existing code is not handling the condition correctly and zeroes out the data structures. This potentially will cause a division by zero crash when the user runs ethtool -w. The last_type is also not set correctly when the context memory is retained. This will cause errors because the last_type signals to the FW that all context memory types have been configured. Oops: divide error: 0000 1 PREEMPT SMP NOPTI CPU: 53 UID: 0 PID: 7019 Comm: ethtool Kdump: loaded Tainted: G OE 6.12.0-rc7+ #1 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Supermicro SYS-621C-TN12R/X13DDW-A, BIOS 1.4 08/10/2023 RIP: 0010:__bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] Code: 0a 31 d2 4c 89 6c 24 10 45 8b a5 fc df ff ff 4c 8b 74 24 20 31 db 66 89 44 24 06 48 63 c5 c1 e5 09 4c 0f af e0 48 8b 44 24 30 <49> f7 f4 4c 89 64 24 08 48 63 c5 4d 89 ec 31 ed 48 89 44 24 18 49 RSP: 0018:ff480591603d78b8 EFLAGS: 00010206 RAX: 0000000000100000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ff23959e46740000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000100000 R09: ff23959e46740000 R10: ff480591603d7a18 R11: 0000000000000010 R12: 0000000000000000 R13: ff23959e46742008 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f04227c1740(0000) GS:ff2395adbf680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f04225b33a5 CR3: 000000108b9a4001 CR4: 0000000000773ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? do_error_trap+0x65/0x80 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? exc_divide_error+0x36/0x50 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? asm_exc_divide_error+0x16/0x20 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0xda/0x160 [bnxt_en] bnxt_get_ctx_coredump.constprop.0+0x1ed/0x390 [bnxt_en] ? __memcg_slab_post_alloc_hook+0x21c/0x3c0 ? __bnxt_get_coredump+0x473/0x4b0 [bnxt_en] __bnxt_get_coredump+0x473/0x4b0 [bnxt_en] ? security_file_alloc+0x74/0xe0 ? cred_has_capability.isra.0+0x78/0x120 bnxt_get_coredump_length+0x4b/0xf0 [bnxt_en] bnxt_get_dump_flag+0x40/0x60 [bnxt_en] __dev_ethtool+0x17e4/0x1fc0 ? syscall_exit_to_user_mode+0xc/0x1d0 ? do_syscall_64+0x85/0x150 ? unmap_page_range+0x299/0x4b0 ? vma_interval_tree_remove+0x215/0x2c0 ? __kmalloc_cache_noprof+0x10a/0x300 dev_ethtool+0xa8/0x170 dev_ioctl+0x1b5/0x580 ? sk_ioctl+0x4a/0x110 sock_do_ioctl+0xab/0xf0 sock_ioctl+0x1ca/0x2e0 __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x79/0x150 Fixes: 24d694aec139 ("bnxt_en: Allocate backing store memory for FW trace logs") Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204215918.1692597-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06bnxt_en: Fix GSO type for HW GRO packets on 5750X chipsMichael Chan
The existing code is using RSS profile to determine IPV4/IPV6 GSO type on all chips older than 5760X. This won't work on 5750X chips that may be using modified RSS profiles. This commit from 2018 has updated the driver to not use RSS profile for HW GRO packets on newer chips: 50f011b63d8c ("bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.") However, a recent commit to add support for the newest 5760X chip broke the logic. If the GRO packet needs to be re-segmented by the stack, the wrong GSO type will cause the packet to be dropped. Fix it to only use RSS profile to determine GSO type on the oldest 5730X/5740X chips which cannot use the new method and is safe to use the RSS profiles. Also fix the L3/L4 hash type for RX packets by not using the RSS profile for the same reason. Use the ITYPE field in the RX completion to determine L3/L4 hash types correctly. Fixes: a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7") Reviewed-by: Colin Winegarden <colin.winegarden@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204215918.1692597-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06net: simplify resource acquisition + ioremapRosen Penev
get resource + request_mem_region + ioremap can all be done by a single function. Replace them with devm_platform_get_and_ioremap_resource or\ devm_platform_ioremap_resource where res is not used. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> # sja1000_platform.c Link: https://patch.msgid.link/20241203231337.182391-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-06net: freescale: ucc_geth: phylink conversionMaxime Chevallier
ucc_geth is quite capable in terms of supported interfaces, and even includes an externally controlled PCS (well, TBI). Port that driver to phylink. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Introduce a helper to check Reduced modesMaxime Chevallier
A number of parallel MII interfaces also exist in a "Reduced" mode, usually with higher clock rates and fewer data lines, to ease the hardware design. This is what the 'R' stands for in RGMII, RMII, RTBI, RXAUI, etc. The UCC Geth controller has a special configuration bit that needs to be set when the MII mode is one of the supported reduced modes. Add a local helper for that. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Move the serdes configuration aroundMaxime Chevallier
The uec_configure_serdes() function deals with serialized linkmodes settings. It's used during the link bringup sequence. It is planned to be used during the phylink conversion for mac configuration, but it needs to me moved around in the process. To make the phylink port clearer, this commit moves the function without any feature change. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Hardcode the preamble length to 7 bytesMaxime Chevallier
The preamble length can be configured in ucc_geth, however it just ends-up always being configured to 7 bytes, as nothing ever changes the default value of 7. Make that value the default value when the MACCFG2 register gets initialized, and remove the code to configure that value altogether. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Simplify frame length checkMaxime Chevallier
The frame length check is configured when the phy interface is setup. However, it's configured according to an internal flag that is always false. So, just make so that we disable the relevant bit in the MACCFG2 register upon accessing it for other MAC configuration operations. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Use the correct type to store WoL optsMaxime Chevallier
The WoL opts are represented through a bitmask stored in a u32. As this mask is copied as-is in the driver, make sure we use the exact same type to store them internally. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Fix WOL configurationMaxime Chevallier
The get/set_wol ethtool ops rely on querying the PHY for its WoL capabilities, checking for the presence of a PHY and a PHY interrupts isn't enough. Address that by cleaning up the WoL configuration sequence. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Use netdev->phydev to access the PHYMaxime Chevallier
As this driver pre-dates phylib, it uses a private pointer to get a reference to the attached phy_device. Drop that pointer and use the netdev's pointer instead. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: split adjust_link for phylink conversionMaxime Chevallier
Preparing the phylink conversion, split the adjust_link callbaclk, by clearly separating the mac configuration, link_up and link_down phases. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-06net: freescale: ucc_geth: Drop support for the "interface" DT propertyMaxime Chevallier
In april 2007, ucc_geth was converted to phylib with : commit 728de4c927a3 ("ucc_geth: migrate ucc_geth to phylib"). In that commit, the device-tree property "interface", that could be used to retrieve the PHY interface mode was deprecated. DTS files that still used that property were converted along the way, in the following commit, also dating from april 2007 : commit 0fd8c47cccb1 ("[POWERPC] Replace undocumented interface properties in dts files") 17 years later, there's no users of that property left and I hope it's safe to say we can remove support from that in the ucc_geth driver, making the probe() function a bit simpler. Should there be any users that have a DT that was generated when 2.6.21 was cutting-edge, print an error message with hints on how to convert the devicetree if the 'interface' property is found. With that property gone, we can greatly simplify the parsing of the phy-interface-mode from the devicetree by using of_get_phy_mode(), allowing the removal of the open-coded parsing in the driver. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-05net/mlx5: DR, prevent potential error pointer dereferenceDan Carpenter
The dr_domain_add_vport_cap() function generally returns NULL on error but sometimes we want it to return ERR_PTR(-EBUSY) so the caller can retry. The problem here is that "ret" can be either -EBUSY or -ENOMEM and if it's and -ENOMEM then the error pointer is propogated back and eventually dereferenced in dr_ste_v0_build_src_gvmi_qpn_tag(). Fixes: 11a45def2e19 ("net/mlx5: DR, Add support for SF vports") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/07477254-e179-43e2-b1b3-3b9db4674195@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.13-rc2). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05Merge tag 'net-6.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and netfilter. Current release - regressions: - rtnetlink: fix double call of rtnl_link_get_net_ifla() - tcp: populate XPS related fields of timewait sockets - ethtool: fix access to uninitialized fields in set RXNFC command - selinux: use sk_to_full_sk() in selinux_ip_output() Current release - new code bugs: - net: make napi_hash_lock irq safe - eth: - bnxt_en: support header page pool in queue API - ice: fix NULL pointer dereference in switchdev Previous releases - regressions: - core: fix icmp host relookup triggering ip_rt_bug - ipv6: - avoid possible NULL deref in modify_prefix_route() - release expired exception dst cached in socket - smc: fix LGR and link use-after-free issue - hsr: avoid potential out-of-bound access in fill_frame_info() - can: hi311x: fix potential use-after-free - eth: ice: fix VLAN pruning in switchdev mode Previous releases - always broken: - netfilter: - ipset: hold module reference while requesting a module - nft_inner: incorrect percpu area handling under softirq - can: j1939: fix skb reference counting - eth: - mlxsw: use correct key block on Spectrum-4 - mlx5: fix memory leak in mlx5hws_definer_calc_layout" * tag 'net-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) net :mana :Request a V2 response version for MANA_QUERY_GF_STAT net: avoid potential UAF in default_operstate() vsock/test: verify socket options after setting them vsock/test: fix parameter types in SO_VM_SOCKETS_* calls vsock/test: fix failures due to wrong SO_RCVLOWAT parameter net/mlx5e: Remove workaround to avoid syndrome for internal port net/mlx5e: SD, Use correct mdev to build channel param net/mlx5: E-Switch, Fix switching to switchdev mode in MPV net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabled net/mlx5: HWS: Properly set bwc queue locks lock classes net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layout bnxt_en: handle tpa_info in queue API implementation bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap() bnxt_en: refactor tpa_info alloc/free into helpers geneve: do not assume mac header is set in geneve_xmit_skb() mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4 ethtool: Fix wrong mod state in case of verbose and no_mask bitset ipmr: tune the ipmr_can_free_table() checks. netfilter: nft_set_hash: skip duplicated elements pending gc run netfilter: ipset: Hold module reference while requesting a module ...
2024-12-05net :mana :Request a V2 response version for MANA_QUERY_GF_STATShradha Gupta
The current requested response version(V1) for MANA_QUERY_GF_STAT query results in STATISTICS_FLAGS_TX_ERRORS_GDMA_ERROR value being set to 0 always. In order to get the correct value for this counter we request the response version to be V2. Cc: stable@vger.kernel.org Fixes: e1df5202e879 ("net :mana :Add remaining GDMA stats for MANA to ethtool") Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1733291300-12593-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-05net/mlx5: Add support for new scheduling elementsCarolina Jubran
Introduce new scheduling elements in the E-Switch QoS hierarchy to enhance traffic management capabilities. This patch adds support for: - Rate Limit scheduling elements: Enables bandwidth limitation across multiple nodes without a shared ancestor, providing a mechanism for more granular control of bandwidth allocation. - Traffic Class Transmit Scheduling Arbiter (TSAR): Introduces the infrastructure for creating Traffic Class TSARs, allowing hierarchical arbitration based on traffic classes. - Traffic Class Arbiter TSAR: Adds support for a TSAR capable of managing arbitration between multiple traffic classes, enabling improved bandwidth prioritization and traffic management. No functional changes are introduced in this patch. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241204220931.254964-4-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-04Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-12-03 (ice, idpf, ixgbe, ixgbevf, igb) This series contains updates to ice, idpf, ixgbe, ixgbevf, and igb drivers. For ice: Arkadiusz corrects search for determining whether PHY clock recovery is supported on the device. Przemyslaw corrects mask used for PHY timestamps on ETH56G devices. Wojciech adds missing virtchnl ops which caused NULL pointer dereference. Marcin fixes VLAN filter settings for uplink VSI in switchdev mode. For idpf: Josh restores setting of completion tag for empty buffers. For ixgbevf: Jake removes incorrect initialization/support of IPSEC for mailbox version 1.5. For ixgbe: Jake rewords and downgrades misleading message when negotiation of VF mailbox version is not supported. Tore Amundsen corrects value for BASE-BX10 capability. For igb: Yuan Can adds proper teardown on failed pci_register_driver() call. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igb: Fix potential invalid memory access in igb_init_module() ixgbe: Correct BASE-BX10 compliance code ixgbe: downgrade logging of unsupported VF API version to debug ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5 idpf: set completion tag for "empty" bufs associated with a packet ice: Fix VLAN pruning in switchdev mode ice: Fix NULL pointer dereference in switchdev ice: fix PHY timestamp extraction for ETH56G ice: fix PHY Clock Recovery availability check ==================== Link: https://patch.msgid.link/20241203215521.1646668-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04r8169: simplify setting hwmon attribute visibilityHeiner Kallweit
Use new member visible to simplify setting the static visibility. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/dba77e76-be45-4a30-96c7-45e284072ad2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5e: Remove workaround to avoid syndrome for internal portJianbo Liu
Previously a workaround was added to avoid syndrome 0xcdb051. It is triggered when offload a rule with tunnel encapsulation, and forwarding to another table, but not matching on the internal port in firmware steering mode. The original workaround skips internal tunnel port logic, which is not correct as not all cases are considered. As an example, if vlan is configured on the uplink port, traffic can't pass because vlan header is not added with this workaround. Besides, there is no such issue for software steering. So, this patch removes that, and returns error directly if trying to offload such rule for firmware steering. Fixes: 06b4eac9c4be ("net/mlx5e: Don't offload internal port if filter device is out device") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Tested-by: Frode Nordahl <frode.nordahl@canonical.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5e: SD, Use correct mdev to build channel paramTariq Toukan
In a multi-PF netdev, each traffic channel creates its own resources against a specific PF. In the cited commit, where this support was added, the channel_param logic was mistakenly kept unchanged, so it always used the primary PF which is found at priv->mdev. In this patch we fix this by moving the logic to be per-channel, and passing the correct mdev instance. This bug happened to be usually harmless, as the resulting cparam structures would be the same for all channels, due to identical FW logic and decisions. However, in some use cases, like fwreset, this gets broken. This could lead to different symptoms. Example: Error cqe on cqn 0x428, ci 0x0, qn 0x10a9, opcode 0xe, syndrome 0x4, vendor syndrome 0x32 Fixes: e4f9686bdee7 ("net/mlx5e: Let channels be SD-aware") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5: E-Switch, Fix switching to switchdev mode in MPVPatrisious Haddad
Fix the mentioned commit change for MPV mode, since in MPV mode the IB device is shared between different core devices, so under this change when moving both devices simultaneously to switchdev mode the IB device removal and re-addition can race with itself causing unexpected behavior. In such case do rescan_drivers() only once in order to add the ethernet representor auxiliary device, and skip adding and removing IB devices. Fixes: ab85ebf43723 ("net/mlx5: E-switch, refactor eswitch mode change") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5: E-Switch, Fix switching to switchdev mode with IB device disabledPatrisious Haddad
In case that IB device is already disabled when moving to switchdev mode, which can happen when working with LAG, need to do rescan_drivers() before leaving in order to add ethernet representor auxiliary device. Fixes: ab85ebf43723 ("net/mlx5: E-switch, refactor eswitch mode change") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5: HWS: Properly set bwc queue locks lock classesCosmin Ratiu
The mentioned "Fixes" patch forgot to do that. Fixes: 9addffa34359 ("net/mlx5: HWS, use lock classes for bwc locks") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net/mlx5: HWS: Fix memory leak in mlx5hws_definer_calc_layoutCosmin Ratiu
It allocates a match template, which creates a compressed definer fc struct, but that is not deallocated. This commit fixes that. Fixes: 74a778b4a63f ("net/mlx5: HWS, added definers handling") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241203204920.232744-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04bnxt_en: handle tpa_info in queue API implementationDavid Wei
Commit 7ed816be35ab ("eth: bnxt: use page pool for head frags") added a page pool for header frags, which may be distinct from the existing pool for the aggregation ring. Prior to this change, frags used in the TPA ring rx_tpa were allocated from system memory e.g. napi_alloc_frag() meaning their lifetimes were not associated with a page pool. They can be returned at any time and so the queue API did not alloc or free rx_tpa. But now frags come from a separate head_pool which may be different to page_pool. Without allocating and freeing rx_tpa, frags allocated from the old head_pool may be returned to a different new head_pool which causes a mismatch between the pp hold/release count. Fix this problem by properly freeing and allocating rx_tpa in the queue API implementation. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204041022.56512-4-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04bnxt_en: refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap()David Wei
Refactor bnxt_alloc_rx_rings() to call bnxt_alloc_rx_agg_bmap() for allocating rx_agg_bmap. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204041022.56512-3-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04bnxt_en: refactor tpa_info alloc/free into helpersDavid Wei
Refactor bnxt_rx_ring_info->tpa_info operations into helpers that work on a single tpa_info in prep for queue API using them. There are 2 pairs of operations: * bnxt_alloc_one_tpa_info() * bnxt_free_one_tpa_info() These alloc/free the tpa_info array itself. * bnxt_alloc_one_tpa_info_data() * bnxt_free_one_tpa_info_data() These alloc/free the frags stored in tpa_info array. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204041022.56512-2-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net: mvpp2: implement pcs_inband_caps() methodRussell King (Oracle)
Report the PCS in-band capabilities to phylink for Marvell PP2 interfaces. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tIUsE-006IUh-E7@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04net: mvneta: implement pcs_inband_caps() methodRussell King (Oracle)
Report the PCS in-band capabilities to phylink for Marvell NETA interfaces. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tIUs9-006IUb-Au@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04mlxsw: spectrum_acl_flex_keys: Use correct key block on Spectrum-4Ido Schimmel
The driver is currently using an ACL key block that is not supported by Spectrum-4. This works because the driver is only using a single field from this key block which is located in the same offset in the equivalent Spectrum-4 key block. The issue was discovered when the firmware started rejecting the use of the unsupported key block. The change has been reverted to avoid breaking users that only update their firmware. Nonetheless, fix the issue by using the correct key block. Fixes: 07ff135958dd ("mlxsw: Introduce flex key elements for Spectrum-4") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/35e72c97bdd3bc414fb8e4d747e5fb5d26c29658.1733237440.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04rtase: Add support for RTL907XD-VA PCIe portJustin Lai
1. Add RTL907XD-VA hardware version id. 2. Add the reported speed for RTL907XD-VA. Signed-off-by: Justin Lai <justinlai0215@realtek.com> Link: https://patch.msgid.link/20241203103146.734516-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03r8169: remove support for chip version 11Heiner Kallweit
This is a follow-up to 982300c115d2 ("r8169: remove detection of chip version 11 (early RTL8168b)"). Nobody complained yet, so remove support for this chip version. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/b689ab6d-20b5-4b64-bd7e-531a0a972ba3@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03r8169: remove unused flag RTL_FLAG_TASK_RESET_NO_QUEUE_WAKEHeiner Kallweit
After 854d71c555dfc3 ("r8169: remove original workaround for RTL8125 broken rx issue") this flag isn't used any longer. So remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/d9dd214b-3027-4f60-b0e8-6f34a0c76582@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03igb: Fix potential invalid memory access in igb_init_module()Yuan Can
The pci_register_driver() can fail and when this happened, the dca_notifier needs to be unregistered, otherwise the dca_notifier can be called when igb fails to install, resulting to invalid memory access. Fixes: bbd98fe48a43 ("igb: Fix DCA errors and do not use context index for 82576") Signed-off-by: Yuan Can <yuancan@huawei.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbe: Correct BASE-BX10 compliance codeTore Amundsen
SFF-8472 (section 5.4 Transceiver Compliance Codes) defines bit 6 as BASE-BX10. Bit 6 means a value of 0x40 (decimal 64). The current value in the source code is 0x64, which appears to be a mix-up of hex and decimal values. A value of 0x64 (binary 01100100) incorrectly sets bit 2 (1000BASE-CX) and bit 5 (100BASE-FX) as well. Fixes: 1b43e0d20f2d ("ixgbe: Add 1000BASE-BX support") Signed-off-by: Tore Amundsen <tore@amundsen.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Ernesto Castellotti <ernesto@castellotti.net> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbe: downgrade logging of unsupported VF API version to debugJacob Keller
The ixgbe PF driver logs an info message when a VF attempts to negotiate an API version which it does not support: VF 0 requested invalid api version 6 The ixgbevf driver attempts to load with mailbox API v1.5, which is required for best compatibility with other hosts such as the ESX VMWare PF. The Linux PF only supports API v1.4, and does not currently have support for the v1.5 API. The logged message can confuse users, as the v1.5 API is valid, but just happens to not currently be supported by the Linux PF. Downgrade the info message to a debug message, and fix the language to use 'unsupported' instead of 'invalid' to improve message clarity. Long term, we should investigate whether the improvements in the v1.5 API make sense for the Linux PF, and if so implement them properly. This may require yet another API version to resolve issues with negotiating IPSEC offload support. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reported-by: Yifei Liu <yifei.l.liu@oracle.com> Link: https://lore.kernel.org/intel-wired-lan/20240301235837.3741422-1-yifei.l.liu@oracle.com/ Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5Jacob Keller
Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") added support for v1.5 of the PF to VF mailbox communication API. This commit mistakenly enabled IPSEC offload for API v1.5. No implementation of the v1.5 API has support for IPSEC offload. This offload is only supported by the Linux PF as mailbox API v1.4. In fact, the v1.5 API is not implemented in any Linux PF. Attempting to enable IPSEC offload on a PF which supports v1.5 API will not work. Only the Linux upstream ixgbe and ixgbevf support IPSEC offload, and only as part of the v1.4 API. Fix the ixgbevf Linux driver to stop attempting IPSEC offload when the mailbox API does not support it. The existing API design choice makes it difficult to support future API versions, as other non-Linux hosts do not implement IPSEC offload. If we add support for v1.5 to the Linux PF, then we lose support for IPSEC offload. A full solution likely requires a new mailbox API with a proper negotiation to check that IPSEC is actually supported by the host. Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03idpf: set completion tag for "empty" bufs associated with a packetJoshua Hay
Commit d9028db618a6 ("idpf: convert to libeth Tx buffer completion") inadvertently removed code that was necessary for the tx buffer cleaning routine to iterate over all buffers associated with a packet. When a frag is too large for a single data descriptor, it will be split across multiple data descriptors. This means the frag will span multiple buffers in the buffer ring in order to keep the descriptor and buffer ring indexes aligned. The buffer entries in the ring are technically empty and no cleaning actions need to be performed. These empty buffers can precede other frags associated with the same packet. I.e. a single packet on the buffer ring can look like: buf[0]=skb0.frag0 buf[1]=skb0.frag1 buf[2]=empty buf[3]=skb0.frag2 The cleaning routine iterates through these buffers based on a matching completion tag. If the completion tag is not set for buf2, the loop will end prematurely. Frag2 will be left uncleaned and next_to_clean will be left pointing to the end of packet, which will break the cleaning logic for subsequent cleans. This consequently leads to tx timeouts. Assign the empty bufs the same completion tag for the packet to ensure the cleaning routine iterates over all of the buffers associated with the packet. Fixes: d9028db618a6 ("idpf: convert to libeth Tx buffer completion") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Madhu chittim <madhu.chittim@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: Fix VLAN pruning in switchdev modeMarcin Szycik
In switchdev mode the uplink VSI should receive all unmatched packets, including VLANs. Therefore, VLAN pruning should be disabled if uplink is in switchdev mode. It is already being done in ice_eswitch_setup_env(), however the addition of ice_up() in commit 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") caused VLAN pruning to be re-enabled after disabling it. Add a check to ice_set_vlan_filtering_features() to ensure VLAN filtering will not be enabled if uplink is in switchdev mode. Note that ice_is_eswitch_mode_switchdev() is being used instead of ice_is_switchdev_running(), as the latter would only return true after the whole switchdev setup completes. Fixes: 44ba608db509 ("ice: do switchdev slow-path Rx using PF VSI") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: Fix NULL pointer dereference in switchdevWojciech Drewek
Commit 608a5c05c39b ("virtchnl: support queue rate limit and quanta size configuration") introduced new virtchnl ops: - get_qos_caps - cfg_q_bw - cfg_q_quanta New ops were added to ice_virtchnl_dflt_ops, in commit 015307754a19 ("ice: Support VF queue rate limit and quanta size configuration"), but not to the ice_virtchnl_repr_ops. Because of that, if we get one of those messages in switchdev mode we end up with NULL pointer dereference: [ 1199.794701] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1199.794804] Workqueue: ice ice_service_task [ice] [ 1199.794878] RIP: 0010:0x0 [ 1199.795027] Call Trace: [ 1199.795033] <TASK> [ 1199.795039] ? __die+0x20/0x70 [ 1199.795051] ? page_fault_oops+0x140/0x520 [ 1199.795064] ? exc_page_fault+0x7e/0x270 [ 1199.795074] ? asm_exc_page_fault+0x22/0x30 [ 1199.795086] ice_vc_process_vf_msg+0x6e5/0xd30 [ice] [ 1199.795165] __ice_clean_ctrlq+0x734/0x9d0 [ice] [ 1199.795207] ice_service_task+0xccf/0x12b0 [ice] [ 1199.795248] process_one_work+0x21a/0x620 [ 1199.795260] worker_thread+0x18d/0x330 [ 1199.795269] ? __pfx_worker_thread+0x10/0x10 [ 1199.795279] kthread+0xec/0x120 [ 1199.795288] ? __pfx_kthread+0x10/0x10 [ 1199.795296] ret_from_fork+0x2d/0x50 [ 1199.795305] ? __pfx_kthread+0x10/0x10 [ 1199.795312] ret_from_fork_asm+0x1a/0x30 [ 1199.795323] </TASK> Fixes: 015307754a19 ("ice: Support VF queue rate limit and quanta size configuration") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: fix PHY timestamp extraction for ETH56GPrzemyslaw Korba
Fix incorrect PHY timestamp extraction for ETH56G. It's better to use FIELD_PREP() than manual shift. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03ice: fix PHY Clock Recovery availability checkArkadiusz Kubalewski
To check if PHY Clock Recovery mechanic is available for a device, there is a need to verify if given PHY is available within the netlist, but the netlist node type used for the search is wrong, also the search context shall be specified. Modify the search function to allow specifying the context in the search. Use the PHY node type instead of CLOCK CONTROLLER type, also use proper search context which for PHY search is PORT, as defined in E810 Datasheet [1] ('3.3.8.2.4 Node Part Number and Node Options (0x0003)' and 'Table 3-105. Program Topology Device NVM Admin Command'). [1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true Fixes: 91e43ca0090b ("ice: fix linking when CONFIG_PTP_1588_CLOCK=n") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-03net/qed: allow old cards not supporting "num_images" to workLouis Leseur
Commit 43645ce03e00 ("qed: Populate nvm image attribute shadow.") added support for populating flash image attributes, notably "num_images". However, some cards were not able to return this information. In such cases, the driver would return EINVAL, causing the driver to exit. Add check to return EOPNOTSUPP instead of EINVAL when the card is not able to return these information. The caller function already handles EOPNOTSUPP without error. Fixes: 43645ce03e00 ("qed: Populate nvm image attribute shadow.") Co-developed-by: Florian Forestier <florian@forestier.re> Signed-off-by: Florian Forestier <florian@forestier.re> Signed-off-by: Louis Leseur <louis.leseur@gmail.com> Link: https://patch.msgid.link/20241128083633.26431-1-louis.leseur@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02octeontx2-af: Fix SDP MAC link credits configurationGeetha sowjanya
Current driver allows only packet size < 512B as SDP_LINK_CREDIT register is set to default value. This patch fixes this issue by configure the register with maximum HW supported value to allow packet size > 512B. Fixes: 2f7f33a09516 ("octeontx2-pf: Add representors for sdp MAC") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-30bnxt_en: ethtool: Supply ntuple rss context actionDaniel Xu
Commit 2f4f9fe5bf5f ("bnxt_en: Support adding ntuple rules on RSS contexts") added support for redirecting to an RSS context as an ntuple rule action. However, it forgot to update the ETHTOOL_GRXCLSRULE codepath. This caused `ethtool -n` to always report the action as "Action: Direct to queue 0" which is wrong. Fix by teaching bnxt driver to report the RSS context when applicable. Fixes: 2f4f9fe5bf5f ("bnxt_en: Support adding ntuple rules on RSS contexts") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://patch.msgid.link/2e884ae39e08dc5123be7c170a6089cefe6a78f7.1732748253.git.dxu@dxuuu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-29net: enetc: Do not configure preemptible TCs if SIs do not supportWei Fang
Both ENETC PF and VF drivers share enetc_setup_tc_mqprio() to configure MQPRIO. And enetc_setup_tc_mqprio() calls enetc_change_preemptible_tcs() to configure preemptible TCs. However, only PF is able to configure preemptible TCs. Because only PF has related registers, while VF does not have these registers. So for VF, its hw->port pointer is NULL. Therefore, VF will access an invalid pointer when accessing a non-existent register, which will cause a crash issue. The simplified log is as follows. root@ls1028ardb:~# tc qdisc add dev eno0vf0 parent root handle 100: \ mqprio num_tc 4 map 0 0 1 1 2 2 3 3 queues 1@0 1@1 1@2 1@3 hw 1 [ 187.290775] Unable to handle kernel paging request at virtual address 0000000000001f00 [ 187.424831] pc : enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.430518] lr : enetc_mm_commit_preemptible_tcs+0x30c/0x400 [ 187.511140] Call trace: [ 187.513588] enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.518918] enetc_setup_tc_mqprio+0x180/0x214 [ 187.523374] enetc_vf_setup_tc+0x1c/0x30 [ 187.527306] mqprio_enable_offload+0x144/0x178 [ 187.531766] mqprio_init+0x3ec/0x668 [ 187.535351] qdisc_create+0x15c/0x488 [ 187.539023] tc_modify_qdisc+0x398/0x73c [ 187.542958] rtnetlink_rcv_msg+0x128/0x378 [ 187.547064] netlink_rcv_skb+0x60/0x130 [ 187.550910] rtnetlink_rcv+0x18/0x24 [ 187.554492] netlink_unicast+0x300/0x36c [ 187.558425] netlink_sendmsg+0x1a8/0x420 [ 187.606759] ---[ end trace 0000000000000000 ]--- In addition, some PFs also do not support configuring preemptible TCs, such as eno1 and eno3 on LS1028A. It won't crash like it does for VFs, but we should prevent these PFs from accessing these unimplemented registers. Fixes: 827145392a4a ("net: enetc: only commit preemptible TCs to hardware when MM TX is active") Signed-off-by: Wei Fang <wei.fang@nxp.com> Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-29net: enetc: read TSN capabilities from port register, not SIVladimir Oltean
Configuring TSN (Qbv, Qbu, PSFP) capabilities requires access to port registers, which are available to the PSI but not the VSI. Yet, the SI port capability register 0 (PSICAPR0), exposed to both PSIs and VSIs, presents the same capabilities to the VF as to the PF, thus leading the VF driver into thinking it can configure these features. In the case of ENETC_SI_F_QBU, having it set in the VF leads to a crash: root@ls1028ardb:~# tc qdisc add dev eno0vf0 parent root handle 100: \ mqprio num_tc 4 map 0 0 1 1 2 2 3 3 queues 1@0 1@1 1@2 1@3 hw 1 [ 187.290775] Unable to handle kernel paging request at virtual address 0000000000001f00 [ 187.424831] pc : enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.430518] lr : enetc_mm_commit_preemptible_tcs+0x30c/0x400 [ 187.511140] Call trace: [ 187.513588] enetc_mm_commit_preemptible_tcs+0x1c4/0x400 [ 187.518918] enetc_setup_tc_mqprio+0x180/0x214 [ 187.523374] enetc_vf_setup_tc+0x1c/0x30 [ 187.527306] mqprio_enable_offload+0x144/0x178 [ 187.531766] mqprio_init+0x3ec/0x668 [ 187.535351] qdisc_create+0x15c/0x488 [ 187.539023] tc_modify_qdisc+0x398/0x73c [ 187.542958] rtnetlink_rcv_msg+0x128/0x378 [ 187.547064] netlink_rcv_skb+0x60/0x130 [ 187.550910] rtnetlink_rcv+0x18/0x24 [ 187.554492] netlink_unicast+0x300/0x36c [ 187.558425] netlink_sendmsg+0x1a8/0x420 [ 187.606759] ---[ end trace 0000000000000000 ]--- while the other TSN features in the VF are harmless, because the net_device_ops used for the VF driver do not expose entry points for these other features. These capability bits are in the process of being defeatured from the SI registers. We should read them from the port capability register, where they are also present, and which is naturally only exposed to the PF. The change to blame (relevant for stable backports) is the one where this started being a problem, aka when the kernel started to crash due to the wrong capability seen by the VF driver. Fixes: 827145392a4a ("net: enetc: only commit preemptible TCs to hardware when MM TX is active") Reported-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>