summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2023-11-29mlxsw: spectrum_fid: Add an op to get PGT allocation sizePetr Machata
In the CFF flood mode, the PGT allocation size of RFID family will not depend on number of FIDs, but rather number of ports and LAGs. Therefore introduce a FID family operation to calculate the PGT allocation size. The way that size is calculated in the CFF mode depends on calling fallible functions. Thus express the op as returning an int, with the size returned via a pointer argument. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/1174651b7160fcedbef50010ae4b68201112fe6f.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Add an op for flood table initializationPetr Machata
In controlled flood mode, for each bridge FID family (i.e., 802.1Q and 802.1D) and packet type (i.e., UUC/MC/BC), the hardware needs to be told which PGT address to use as the base address for the flood table and how to determine the offset from the base for each FID. The above is not needed in CFF mode where each FID has its own flood table instead of the FID family itself. Therefore, create a new FID family operation for the above configuration and only implement it for the 802.1Q and 802.1D families in controlled flood mode. No functional changes intended. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/06f71415eec75811585ec597e1dd101b6dff77e7.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Move mlxsw_sp_fid_flood_table_init() upPetr Machata
Move the function to the point where it will need to be to be visible for the 802.1d ops. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/aef09e26b0c2dd077531e665d7135b300bdaf0a8.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Make mlxsw_sp_fid_ops.setup return an intPetr Machata
This operation will be fallible for rFIDs in CFF mode, which will be introduced in follow-up patches. Have it return an int, and handle the failures in the caller. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/75f1b85c0cb86bea5501fcc8657042f221a78b32.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Split a helper out of mlxsw_sp_fid_flood_table_mid()Petr Machata
In future patches, for CFF flood mode support, we will need a way to determine a PGT base dynamically, as an op. Therefore, for symmetry, split out a helper, mlxsw_sp_fid_pgt_base_ctl(), that determines a PGT base in the controlled mode as well. Now that the helper is available, use it in mlxsw_sp_fid_flood_table_init() which currently invokes the FID->MID helper to that end. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/fd41c66a1df4df6499d3da34f40e7b9efa15bc3e.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Rename FID ops, families, arraysPetr Machata
Currently, mlxsw always uses a "controlled" flood mode on all Nvidia Spectrum generations. The following patches will however introduce a possibility to run a "CFF" (for Compressed FID Flooding) mode on newer machines, if the FW supports it. To reflect that, label all FID ops, FID families and FID family arrays with a _ctl suffix. This will make it clearer what is what when the CFF families are introduced in later patches. Keep the dummy family intact. Since the dummy family has no flood tables in either CTL or CFF mode, there are no flood-mode-specific callbacks. Additionally, add a remark at two fields that they are only relevant when flood mode is not CFF. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/96b6da5439bb662fa86e795bbcec9dc3ccfa59fd.1701183892.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29mlxsw: spectrum_fid: Privatize FID familiesPetr Machata
Currently, mlxsw always uses a "controlled" flood mode on all Nvidia Spectrum generations. The following patches will however introduce a possibility to run a "CFF" (for Compressed FID Flooding) mode on newer machines, if the FW supports it. Several operations will differ between how they need to be done in controlled mode vs. CFF mode. Thus the per-FID-family ops will differ between controlled and CFF, thus the FID family array as such will differ depending on whether the mode negotiated with FW is controlled or CFF. The simple approach of having several globally visible arrays for spectrum.c to statically choose from no longer works. Instead privatize all FID initialization and finalization logic, and expose it as ops instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/d3fa390d97cf3dbd2f7a28741be69b311e2059e4.1701183891.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29ice: Fix VF Reset paths when interface in a failed over aggregateDave Ertman
There is an error when an interface has the following conditions: - PF is in an aggregate (bond) - PF has VFs created on it - bond is in a state where it is failed-over to the secondary interface - A VF reset is issued on one or more of those VFs The issue is generated by the originating PF trying to rebuild or reconfigure the VF resources. Since the bond is failed over to the secondary interface the queue contexts are in a modified state. To fix this issue, have the originating interface reclaim its resources prior to the tear-down and rebuild or reconfigure. Then after the process is complete, move the resources back to the currently active interface. There are multiple paths that can be used depending on what triggered the event, so create a helper function to move the queues and use paired calls to the helper (back to origin, process, then move back to active interface) under the same lag_mutex lock. Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20231127212340.1137657-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29r8169: improve handling task schedulingHeiner Kallweit
If we know that the task is going to be a no-op, don't even schedule it. And remove the check for netif_running() in the worker function, the check for flag RTL_FLAG_TASK_ENABLED is sufficient. Note that we can't remove the check for flag RTL_FLAG_TASK_ENABLED in the worker function because we have no guarantee when it will be executed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/c65873a3-7394-4107-99a7-83f20030779c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29net: stmmac: Add Tx HWTS support to XDP ZCSong Yoong Siang
This patch enables transmit hardware timestamp support to XDP zero copy via XDP Tx metadata framework. This patchset is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and results. Command on DUT: sudo ./xdp_hw_metadata <interface name> sudo hwstamp_ctl -i <interface name> -t 1 -r 1 Command on Link Partner: echo -n xdp | nc -u -q1 <destination IPv4 addr> 9091 Result: xsk_ring_cons__peek: 1 0x55bbbf08b6d0: rx_desc[2]->addr=8c100 addr=8c100 comp_addr=8c100 EoP No rx_hash err=-95 rx_timestamp: 1677762688429141540 (sec:1677762688.4291) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to User RX-time sec:0.0003 (250.665 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User RX-time sec:0.0000 (16.608 usec) 0x55bbbf08b6d0: ping-pong with csum=561c (want f488) csum_start=34 csum_offset=6 0x55bbbf08b6d0: complete tx idx=2 addr=2008 tx_timestamp: 1677762688431127273 (sec:1677762688.4311) HW TX-complete-time: 1677762688431127273 (sec:1677762688.4311) delta to User TX-complete-time sec:0.0083 (8331.655 usec) XDP RX-time: 1677762688429375597 (sec:1677762688.4294) delta to User TX-complete-time sec:0.0101 (10083.331 usec) HW RX-time: 1677762688429141540 (sec:1677762688.4291) delta to HW TX-complete-time sec:0.0020 (1985.733 usec) 0x55bbbf08b6d0: complete rx idx=130 addr=8c100 Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-6-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-29net/mlx5e: Implement AF_XDP TX timestamp and checksum offloadStanislav Fomichev
TX timestamp: - requires passing clock, not sure I'm passing the correct one (from cq->mdev), but the timestamp value looks convincing TX checksum: - looks like device does packet parsing (and doesn't accept custom start/offset), so I'm ignoring user offsets Cc: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-5-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-29gve: Remove dependency on 4k page size.John Fraker
Prior to this change, gve crashes when attempting to run in kernels with page sizes other than 4k. This change removes unnecessary references to PAGE_SIZE and replaces them with more meaningful constants. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-6-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Add page size register to the register_page_list command.John Fraker
This register is required on platforms with page sizes greater than 4k. This is because the tx side of the driver vmaps the entire queue page list of pages into a single flat address space, then uses the entire space. Without communicating the guest page size to the backend, the backend will only access the first 4k of each page in the queue page list. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-5-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Remove obsolete checks that rely on page size.John Fraker
These checks are safe to remove as they are no longer enforced by the backend. Retaining them would require updating them to work differently with page sizes larger than 4k. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-4-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Deprecate adminq_pfn for pci revision 0x1.John Fraker
adminq_pfn assumes a page size of 4k, causing this mechanism to break in kernels compiled with different page sizes. A new PCI device revision was needed for the device to be able to communicate with the driver how to set up the admin queue prior to having access to the admin queue. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-3-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29gve: Perform adminq allocations through a dma_pool.John Fraker
This allows the adminq to be smaller than a page, paving the way for non 4k page support. This is to support platforms where PAGE_SIZE is not 4k, such as some ARM platforms. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-2-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-11-27 (i40e, iavf) This series contains updates to i40e and iavf drivers. Ivan Vecera performs more cleanups on i40e and iavf drivers; removing unused fields, defines, and unneeded fields. Petr Oros utilizes iavf_schedule_aq_request() helper to replace open coded equivalents. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: use iavf_schedule_aq_request() helper iavf: Remove queue tracking fields from iavf_adminq_ring i40e: Remove queue tracking fields from i40e_adminq_ring i40e: Remove AQ register definitions for VF types i40e: Delete unused and useless i40e_pf fields ==================== Link: https://lore.kernel.org/r/20231127211037.1135403-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28r8169: remove multicast filter limitHeiner Kallweit
Once upon a time, when r8169 was new, the multicast filter limit code was copied from RTL8139 driver. There the filter limit is even user-configurable. The filtering is hash-based and we don't have perfect filtering. Actually the mc filtering on RTL8125 still seems to be the same as used on 8390/NE2000. So it's not clear to me which benefit it should bring when switching to all-multi mode once a certain number of filter bits is set. More the opposite: Filtering out at least some unwanted mc traffic is better than no filtering. Also the available chip documentation doesn't mention any restriction. Therefore remove the filter limit. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/57076c05-3730-40d1-ab9a-5334b263e41a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28ice: fix error code in ice_eswitch_attach()Dan Carpenter
Set the "err" variable on this error path. Fixes: fff292b47ac1 ("ice: add VF representors one by one") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/e0349ee5-76e6-4ff4-812f-4aa0d3f76ae7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28nfp: ethtool: support TX/RX pause frame on/offYu Xiao
Add support for ethtool -A tx on/off and rx on/off. Signed-off-by: Yu Xiao <yu.xiao@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231127055116.6668-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28ravb: Fix races between ravb_tx_timeout_work() and net related opsYoshihiro Shimoda
Fix races between ravb_tx_timeout_work() and functions of net_device_ops and ethtool_ops by using rtnl_trylock() and rtnl_unlock(). Note that since ravb_close() is under the rtnl lock and calls cancel_work_sync(), ravb_tx_timeout_work() should calls rtnl_trylock(). Otherwise, a deadlock may happen in ravb_tx_timeout_work() like below: CPU0 CPU1 ravb_tx_timeout() schedule_work() ... __dev_close_many() // Under rtnl lock ravb_close() cancel_work_sync() // Waiting ravb_tx_timeout_work() rtnl_lock() // This is possible to cause a deadlock If rtnl_trylock() fails, rescheduling the work with sleep for 1 msec. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20231127122420.3706751-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28eth: link netdev to page_pools in driversJakub Kicinski
Link page pool instances to netdev for the drivers which already link to NAPI. Unless the driver is doing something very weird per-NAPI should imply per-netdev. Add netsec as well, Ilias indicates that it fits the mold. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28r8169: prevent potential deadlock in rtl8169_closeHeiner Kallweit
ndo_stop() is RTNL-protected by net core, and the worker function takes RTNL as well. Therefore we will deadlock when trying to execute a pending work synchronously. To fix this execute any pending work asynchronously. This will do no harm because netif_running() is false in ndo_stop(), and therefore the work function is effectively a no-op. However we have to ensure that no task is running or pending after rtl_remove_one(), therefore add a call to cancel_work_sync(). Fixes: abe5fc42f9ce ("r8169: use RTNL to protect critical sections") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/12395867-1d17-4cac-aa7d-c691938fcddf@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28r8169: fix deadlock on RTL8125 in jumbo mtu modeHeiner Kallweit
The original change results in a deadlock if jumbo mtu mode is used. Reason is that the phydev lock is held when rtl_reset_work() is called here, and rtl_jumbo_config() calls phy_start_aneg() which also tries to acquire the phydev lock. Fix this by calling rtl_reset_work() asynchronously. Fixes: 621735f59064 ("r8169: fix rare issue with broken rx after link-down on RTL8125") Reported-by: Ian Chen <free122448@hotmail.com> Tested-by: Ian Chen <free122448@hotmail.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/caf6a487-ef8c-4570-88f9-f47a659faf33@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28octeontx2-pf: Restore TC ingress police rules when interface is upSubbaraya Sundeep
TC ingress policer rules depends on interface receive queue contexts since the bandwidth profiles are attached to RQ contexts. When an interface is brought down all the queue contexts are freed. This in turn frees bandwidth profiles in hardware causing ingress police rules non-functional after the interface is brought up. Fix this by applying all the ingress police rules config to hardware in otx2_open. Also allow adding ingress rules only when interface is running since no contexts exist for the interface when it is down. Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930217-5707-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64Geetha sowjanya
When more than 64 VFs are enabled for a PF then mbox communication between VF and PF is not working as mbox work queueing for few VFs are skipped due to wrong calculation of VF numbers. Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930042-5400-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28net: stmmac: xgmac: Disable FPE MMC interruptsFurong Xu
Commit aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") tries to disable MMC interrupts to avoid a storm of unhandled interrupts, but leaves the FPE(Frame Preemption) MMC interrupts enabled, FPE MMC interrupts can cause the same problem. Now we mask FPE TX and RX interrupts to disable all MMC interrupts. Fixes: aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/20231125060126.2328690-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28octeontx2-af: Fix possible buffer overflowElena Salomatkina
A loop in rvu_mbox_handler_nix_bandprof_free() contains a break if (idx == MAX_BANDPROF_PER_PFFUNC), but if idx may reach MAX_BANDPROF_PER_PFFUNC buffer '(*req->prof_idx)[layer]' overflow happens before that check. The patch moves the break to the beginning of the loop. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support"). Signed-off-by: Elena Salomatkina <elena.salomatkina.cmc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20231124210802.109763-1-elena.salomatkina.cmc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-27iavf: use iavf_schedule_aq_request() helperPetr Oros
Use the iavf_schedule_aq_request() helper when we need to schedule a watchdog task immediately. No functional change. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27iavf: Remove queue tracking fields from iavf_adminq_ringIvan Vecera
Fields 'head', 'tail', 'len', 'bah' and 'bal' in iavf_adminq_ring are used to store register offsets. These offsets are initialized and remains constant so there is no need to store them in the iavf_adminq_ring structure. Remove these fields from iavf_adminq_ring and use register offset constants instead. Remove iavf_adminq_init_regs() that originally stores these constants into these fields. Finally improve iavf_check_asq_alive() that assumes that non-zero value of hw->aq.asq.len indicates fully initialized AdminQ send queue. Replace it by check for non-zero value of field hw->aq.asq.count that is non-zero when the sending queue is initialized and is zeroed during shutdown of the queue. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27i40e: Remove queue tracking fields from i40e_adminq_ringIvan Vecera
Fields 'head', 'tail', 'len', 'bah' and 'bal' in i40e_adminq_ring are used to store register offsets. These offsets are initialized and remains constant so there is no need to store them in the i40e_adminq_ring structure. Remove these fields from i40e_adminq_ring and use register offset constants instead. Remove i40e_adminq_init_regs() that originally stores these constants into these fields. Finally improve i40e_check_asq_alive() that assumes that non-zero value of hw->aq.asq.len indicates fully initialized AdminQ send queue. Replace it by check for non-zero value of field hw->aq.asq.count that is non-zero when the sending queue is initialized and is zeroed during shutdown of the queue. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27i40e: Remove AQ register definitions for VF typesIvan Vecera
The i40e driver does not handle its VF device types so there is no need to keep AdminQ register definitions for such device types. Remove them. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27i40e: Delete unused and useless i40e_pf fieldsIvan Vecera
Removed fields: .fc_autoneg_status Since commit c56999f94876 ("i40e/i40evf: Add set_fc and init of FC settings") write-only and otherwise unused .eeprom_version Write-only and otherwise unused .atr_sample_rate Has only one possible value (I40E_DEFAULT_ATR_SAMPLE_RATE). Remove it and replace its occurrences by I40E_DEFAULT_ATR_SAMPLE_RATE .adminq_work_limit Has only one possible value (I40E_AQ_WORK_LIMIT). Remove it and replace its occurrences by I40E_AQ_WORK_LIMIT .tx_sluggish_count Unused, never written .pf_seid Used to store VSI downlink seid and it is referenced only once in the same codepath. There is no need to save it into i40e_pf. Remove it and use downlink_seid directly in the mentioned log message. .instance Write only. Remove it as well as ugly static local variable 'pfs_found' in i40e_probe. .int_policy .switch_kobj .ptp_pps_work .ptp_extts1_work .ptp_pps_start .pps_delay .ptp_pin .override_q_count All these unused at all Prior the patch: pahole -Ci40e_pf drivers/net/ethernet/intel/i40e/i40e.ko | tail -5 /* size: 5368, cachelines: 84, members: 127 */ /* sum members: 5297, holes: 20, sum holes: 71 */ /* paddings: 6, sum paddings: 19 */ /* last cacheline: 56 bytes */ }; After the patch: pahole -Ci40e_pf drivers/net/ethernet/intel/i40e/i40e.ko | tail -5 /* size: 4976, cachelines: 78, members: 112 */ /* sum members: 4905, holes: 17, sum holes: 71 */ /* paddings: 6, sum paddings: 19 */ /* last cacheline: 48 bytes */ }; Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27net :mana :Add remaining GDMA stats for MANA to ethtoolShradha Gupta
Extend performance counter stats in 'ethtool -S <interface>' for MANA VF to include all GDMA stat counter. Tested-on: Ubuntu22 Testcases: 1. LISA testcase: PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic 2. LISA testcase: PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Link: https://lore.kernel.org/r/1700830950-803-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-26dpaa2-eth: recycle the RX buffer only after all processing doneIoana Ciornei
The blamed commit added support for Rx copybreak. This meant that for certain frame sizes, a new skb was allocated and the initial data buffer was recycled. Instead of waiting to recycle the Rx buffer only after all processing was done on it (like accessing the parse results or timestamp information), the code path just went ahead and re-used the buffer right away. This sometimes lead to corrupted HW and SW annotation areas. Fix this by delaying the moment when the buffer is recycled. Fixes: 50f826999a80 ("dpaa2-eth: add rx copybreak support") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-26dpaa2-eth: increase the needed headroom to account for alignmentIoana Ciornei
Increase the needed headroom to account for a 64 byte alignment restriction which, with this patch, we make mandatory on the Tx path. The case in which the amount of headroom needed is not available is already handled by the driver which instead sends a S/G frame with the first buffer only holding the SW and HW annotation areas. Without this patch, we can empirically see data corruption happening between Tx and Tx confirmation which sometimes leads to the SW annotation area being overwritten. Since this is an old IP where the hardware team cannot help to understand the underlying behavior, we make the Tx alignment mandatory for all frames to avoid the crash on Tx conf. Also, remove the comment that suggested that this is just an optimization. This patch also sets the needed_headroom net device field to the usual value that the driver would need on the Tx path: - 64 bytes for the software annotation area - 64 bytes to account for a 64 byte aligned buffer address Fixes: 6e2387e8f19e ("staging: fsl-dpaa2/eth: Add Freescale DPAA2 Ethernet driver") Closes: https://lore.kernel.org/netdev/aa784d0c-85eb-4e5d-968b-c8f74fa86be6@gin.de/ Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-25mlxsw: pci: Fix missing error checkingIdo Schimmel
I accidentally removed the error checking after issuing the reset. Restore it. Fixes: f257c73e5356 ("mlxsw: pci: Add support for new reset flow") Reported-by: Coverity Scan <scan-admin@coverity.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24r8169: remove not needed check in rtl_fw_write_firmwareHeiner Kallweit
This check can never be true for a firmware file with a correct format. Existing checks in rtl_fw_data_ok() are sufficient, no problems with invalid firmware files are known. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24octeon_ep: get max rx packet length from firmwareShinas Rasheed
Max receive packet length can vary across SoCs, so this needs to be queried from respective firmware and filled by driver. A control net get mtu api should be implemented to do the same. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24octeon_ep: Solve style issues in control net filesShinas Rasheed
Solve observed style issues for kernel documentation and structure declarations in control net API definitions. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24octeontx2-pf: TC flower offload support for ICMP type and codeGeetha sowjanya
Adds tc offload support for matching on ICMP type and code. Example usage: To enable adding tc ingress rules tc qdisc add dev eth0 ingress TC rule drop the ICMP echo reply: tc filter add dev eth0 protocol ip parent ffff: \ flower ip_proto icmp type 8 code 0 skip_sw action drop TC rule to drop ICMPv6 echo reply: tc filter add dev eth0 protocol ipv6 parent ffff: flower \ indev eth0 ip_proto icmpv6 type 128 code 0 action drop Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24net: rswitch: Fix missing dev_kfree_skb_any() in error pathYoshihiro Shimoda
Before returning the rswitch_start_xmit() in the error path, dev_kfree_skb_any() should be called. So, fix it. Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24net: rswitch: Fix return value in rswitch_start_xmit()Yoshihiro Shimoda
This .ndo_start_xmit() function should return netdev_tx_t value, not -ENOMEM. So, fix it. Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-24net: rswitch: Fix type of ret in rswitch_start_xmit()Yoshihiro Shimoda
The type of ret in rswitch_start_xmit() should be netdev_tx_t. So, fix it. Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/ice/ice_main.c c9663f79cd82 ("ice: adjust switchdev rebuild path") 7758017911a4 ("ice: restore timestamp configuration after device reset") https://lore.kernel.org/all/20231121211259.3348630-1-anthony.l.nguyen@intel.com/ Adjacent changes: kernel/bpf/verifier.c bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations") 5f99f312bd3b ("bpf: add register bounds sanity checks and sanitization") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-23net: axienet: Fix check for partial TX checksumSamuel Holland
Due to a typo, the code checked the RX checksum feature in the TX path. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/20231122004219.3504219-1-samuel.holland@sifive.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-23i40e: Fix adding unsupported cloud filtersIvan Vecera
If a VF tries to add unsupported cloud filter through virtchnl then i40e_add_del_cloud_filter(_big_buf) returns -ENOTSUPP but this error code is stored in 'ret' instead of 'aq_ret' that is used as error code sent back to VF. In this scenario where one of the mentioned functions fails the value of 'aq_ret' is zero so the VF will incorrectly receive a 'success'. Use 'aq_ret' to store return value and remove 'ret' local variable. Additionally fix the issue when filter allocation fails, in this case no notification is sent back to the VF. Fixes: e284fc280473 ("i40e: Add and delete cloud filter") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20231121211338.3348677-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-23ice: restore timestamp configuration after device resetJacob Keller
The driver calls ice_ptp_cfg_timestamp() during ice_ptp_prepare_for_reset() to disable timestamping while the device is resetting. This operation destroys the user requested configuration. While the driver does call ice_ptp_cfg_timestamp in ice_rebuild() to restore some hardware settings after a reset, it unconditionally passes true or false, resulting in failure to restore previous user space configuration. This results in a device reset forcibly disabling timestamp configuration regardless of current user settings. This was not detected previously due to a quirk of the LinuxPTP ptp4l application. If ptp4l detects a missing timestamp, it enters a fault state and performs recovery logic which includes executing SIOCSHWTSTAMP again, restoring the now accidentally cleared configuration. Not every application does this, and for these applications, timestamps will mysteriously stop after a PF reset, without being restored until an application restart. Fix this by replacing ice_ptp_cfg_timestamp() with two new functions: 1) ice_ptp_disable_timestamp_mode() which unconditionally disables the timestamping logic in ice_ptp_prepare_for_reset() and ice_ptp_release() 2) ice_ptp_restore_timestamp_mode() which calls ice_ptp_restore_tx_interrupt() to restore Tx timestamping configuration, calls ice_set_rx_tstamp() to restore Rx timestamping configuration, and issues an immediate TSYN_TX interrupt to ensure that timestamps which may have occurred during the device reset get processed. Modify the ice_ptp_set_timestamp_mode to directly save the user configuration and then call ice_ptp_restore_timestamp_mode. This way, reset no longer destroys the saved user configuration. This obsoletes the ice_set_tx_tstamp() function which can now be safely removed. With this change, all devices should now restore Tx and Rx timestamping functionality correctly after a PF reset without application intervention. Fixes: 77a781155a65 ("ice: enable receive hardware timestamping") Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-23ice: unify logic for programming PFINT_TSYN_MSKJacob Keller
Commit d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") modified how Tx timestamps are handled for E822 devices. On these devices, only the clock owner handles reading the Tx timestamp data from firmware. To do this, the PFINT_TSYN_MSK register is modified from the default value to one which enables reacting to a Tx timestamp on all PHY ports. The driver currently programs PFINT_TSYN_MSK in different places depending on whether the port is the clock owner or not. For the clock owner, the PFINT_TSYN_MSK value is programmed during ice_ptp_init_owner just before calling ice_ptp_tx_ena_intr to program the PHY ports. For the non-clock owner ports, the PFINT_TSYN_MSK is programmed during ice_ptp_init_port. If a large enough device reset occurs, the PFINT_TSYN_MSK register will be reset to the default value in which only the PHY associated directly with the PF will cause the Tx timestamp interrupt to trigger. The driver lacks logic to reprogram the PFINT_TSYN_MSK register after a device reset. For the E822 device, this results in the PF no longer responding to interrupts for other ports. This results in failure to deliver Tx timestamps to user space applications. Rename ice_ptp_configure_tx_tstamp to ice_ptp_cfg_tx_interrupt, and unify the logic for programming PFINT_TSYN_MSK and PFINT_OICR_ENA into one place. This function will program both registers according to the combination of user configuration and device requirements. This ensures that PFINT_TSYN_MSK is always restored when we configure the Tx timestamp interrupt. Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-23ice: remove ptp_tx ring parameter flagJacob Keller
Before performing a Tx timestamp in ice_stamp(), the driver checks a ptp_tx ring variable to see if timestamping is enabled on that ring. This value is set for all rings whenever userspace configures Tx timestamping. Ostensibly this was done to avoid wasting cycles checking other fields when timestamping has not been enabled. However, for Tx timestamps we already get an individual per-SKB flag indicating whether userspace wants to request a timestamp on that packet. We do not gain much by also having a separate flag to check for whether timestamping was enabled. In fact, the driver currently fails to restore the field after a PF reset. Because of this, if a PF reset occurs, timestamps will be disabled. Since this flag doesn't add value in the hotpath, remove it and always provide a timestamp if the SKB flag has been set. A following change will fix the reset path to properly restore user timestamping configuration completely. This went unnoticed for some time because one of the most common applications using Tx timestamps, ptp4l, will reconfigure the socket as part of its fault recovery logic. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>