Age | Commit message (Collapse) | Author |
|
This patch addresses a macvlan leak issue in the i40e driver caused by
concurrent access to vsi->mac_filter_hash. The leak occurs when multiple
threads attempt to modify the mac_filter_hash simultaneously, leading to
inconsistent state and potential memory leaks.
To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing
vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock),
ensuring atomic operations and preventing concurrent access.
Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in
i40e_add_mac_filter() to help catch similar issues in the future.
Reproduction steps:
1. Spawn VFs and configure port vlan on them.
2. Trigger concurrent macvlan operations (e.g., adding and deleting
portvlan and/or mac filters).
3. Observe the potential memory leak and inconsistent state in the
mac_filter_hash.
This synchronization ensures the integrity of the mac_filter_hash and prevents
the described leak.
Fixes: fed0d9f13266 ("i40e: Fix VF's MAC Address change on VM")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add jump targets so that a bit of exception handling can be better reused
at the end of two function implementations.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
We have for some time the assign_bit() API to replace open coded
if (foo)
set_bit(n, bar);
else
clear_bit(n, bar);
Use this API to clean the code. No functional change intended.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The max_frame and rx_buf_len fields of the VSI set the maximum frame size
for packets on the wire, and configure the size of the Rx buffer. In the
hardware, these are per-queue configuration. Most VSI types use a simple
method to determine the size of the buffers for all queues.
However, VFs may potentially configure different values for each queue.
While the Linux iAVF driver does not do this, it is allowed by the virtchnl
interface.
The current virtchnl code simply sets the per-VSI fields inbetween calls to
ice_vsi_cfg_single_rxq(). This technically works, as these fields are only
ever used when programming the Rx ring, and otherwise not checked again.
However, it is confusing to maintain.
The Rx ring also already has an rx_buf_len field in order to access the
buffer length in the hotpath. It also has extra unused bytes in the ring
structure which we can make use of to store the maximum frame size.
Drop the VSI max_frame and rx_buf_len fields. Add max_frame to the Rx ring,
and slightly re-order rx_buf_len to better fit into the gaps in the
structure layout.
Change the ice_vsi_cfg_frame_size function so that it writes to the ring
fields. Call this function once per ring in ice_vsi_cfg_rxqs(). This is
done over calling it inside the ice_vsi_cfg_rxq(), because
ice_vsi_cfg_rxq() is called in the virtchnl flow where the max_frame and
rx_buf_len have already been configured.
Change the accesses for rx_buf_len and max_frame to all point to the ring
structure. This has the added benefit that ice_vsi_cfg_rxq() no longer has
the surprise side effect of updating ring->rx_buf_len based on the VSI
field.
Update the virtchnl ice_vc_cfg_qs_msg() function to set the ring values
directly, and drop references to the removed VSI fields.
This now makes the VF logic clear, as the ring fields are obviously
per-queue. This reduces the required cognitive load when reasoning about
this logic.
Note that removing the VSI fields does leave a 4 byte gap, but the ice_vsi
structure has many gaps, and its layout is not as critical in the hot path.
The structure may benefit from a more thorough repacking, but no attempt
was made in this change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The ice_vc_cfg_qs_msg() function is used to configure VF queues in response
to a VIRTCHNL_OP_CONFIG_VSI_QUEUES command.
The virtchnl command contains an array of queue pair data for configuring
Tx and Rx queues. This data includes a queue ID. When configuring the
queues, the driver generally uses this queue ID to determine which Tx and
Rx ring to program. However, a handful of places use the index into the
queue pair data from the VF. While most VF implementations appear to send
this data in order, it is not mandated by the virtchnl and it is not
verified that the queue pair data comes in order.
Fix the driver to consistently use the q_idx field instead of the 'i'
iterator value when accessing the rings. For the Rx case, introduce a local
ring variable to keep lines short.
Fixes: 7ad15440acf8 ("ice: Refactor VIRTCHNL_OP_CONFIG_VSI_QUEUES handling")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
E830 adds hardware support to prevent the VF from overflowing the PF
mailbox with VIRTCHNL messages. E830 will use the hardware feature
(ICE_F_MBX_LIMIT) instead of the software solution ice_is_malicious_vf().
To prevent a VF from overflowing the PF, the PF sets the number of
messages per VF that can be in the PF's mailbox queue
(ICE_MBX_OVERFLOW_WATERMARK). When the PF processes a message from a VF,
the PF decrements the per VF message count using the E830_MBX_VF_DEC_TRIG
register.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Enable ethtool reset support. Ethtool reset flags are mapped to the
E810 reset type:
PF reset:
$ ethtool --reset <ethX> irq dma filter offload
CORE reset:
$ ethtool --reset <ethX> irq-shared dma-shared filter-shared \
offload-shared ram-shared
GLOBAL reset:
$ ethtool --reset <ethX> irq-shared dma-shared filter-shared \
offload-shared mac-shared phy-shared ram-shared
Calling the same set of flags as in PF reset case on port representor
triggers VF reset.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Increasing MSI-X value on a VF leads to invalid memory operations. This
is caused by not reallocating some arrays.
Reproducer:
modprobe ice
echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe
echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs
echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count
Default MSI-X is 16, so 17 and above triggers this issue.
KASAN reports:
BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
Read of size 8 at addr ffff8888b937d180 by task bash/28433
(...)
Call Trace:
(...)
? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
kasan_report+0xed/0x120
? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
ice_vsi_cfg_def+0x3360/0x4770 [ice]
? mutex_unlock+0x83/0xd0
? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice]
? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice]
ice_vsi_cfg+0x7f/0x3b0 [ice]
ice_vf_reconfig_vsi+0x114/0x210 [ice]
ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice]
sriov_vf_msix_count_store+0x21c/0x300
(...)
Allocated by task 28201:
(...)
ice_vsi_cfg_def+0x1c8e/0x4770 [ice]
ice_vsi_cfg+0x7f/0x3b0 [ice]
ice_vsi_setup+0x179/0xa30 [ice]
ice_sriov_configure+0xcaa/0x1520 [ice]
sriov_numvfs_store+0x212/0x390
(...)
To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This
causes the required arrays to be reallocated taking the new queue count
into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq
before ice_vsi_rebuild(), so that realloc uses the newly set queue
count.
Additionally, ice_vsi_rebuild() does not remove VSI filters
(ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer
necessary.
Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Triggering the reset while in switchdev mode causes
errors[1]. Rules are already removed by this time
because switch content is flushed in case of the reset.
This means that rules were deleted from HW but SW
still thinks they exist so when we get
SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to
delete not existing rule.
We can avoid these errors by clearing the rules
early in the reset flow before they are removed from HW.
Switchdev API will get notified that the rule was removed
so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification.
Remove unnecessary ice_clear_sw_switch_recipes.
[1]
ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2
ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2
Fixes: 7c945a1a8e5f ("ice: Switchdev FDB events support")
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
netif_is_ice() works by checking the pointer to netdev ops. However, it
only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops,
so in Safe Mode it always returns false, which is unintuitive. While it
doesn't look like netif_is_ice() is currently being called anywhere in Safe
Mode, this could change and potentially lead to unexpected behaviour.
Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If DDP package is missing or corrupted, the driver should enter Safe Mode.
Instead, an error is returned and probe fails.
To fix this, don't exit init if ice_init_ddp_config() returns an error.
Repro:
* Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg)
* Load ice
Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Given the previous patches, we no longer need the
struct iw_public_data etc., it's only used by the
old Intel drivers (and ps3_gelic creates it but
then doesn't use it). Remove all of that, including
the pointer in struct net_device.
Link: https://patch.msgid.link/20241007213525.8b2d52b60531.I6a27aaf30bded9a0977f07f47fba2bd31a3b3330@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The sizeof(struct napi_struct) can change. Don't hardcode the size to
400 bytes and instead use "sizeof(struct napi_struct)".
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20241004105407.73585-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Depending on the SoC where the FEC is integrated into the PPS channel
might be routed to different timer instances. Make this configurable
from the devicetree.
When the related DT property is not present fallback to the previous
default and use channel 0.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Tested-by: Rafael Beims <rafael.beims@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Preparation patch to allow for PPS channel configuration, no functional
change intended.
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Internal ports and PGID's are both defined relative to the number of
front ports on Sparx5. This will not work on lan969x. Instead make them
offsets to the number of front ports and add two helpers to retrieve
them. Use the helpers throughout.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We dont want to ops out each time a function needs to do some platform
specifics. In particular we have a few places, where it would be
convenient to just branch out on the platform type. Add the function
is_sparx5() and, initially, use it for:
- register writes that should only be done on Sparx5 (QSYS_CAL_CTRL,
CLKGEN_LCPLL1_CORE_CLK).
- function calls that should only be done on Sparx5
(ethtool_op_get_ts_info())
- register writes that are chip-exclusive (MASK_CFG1/2, PGID_CFG1/2,
these are replicated for n_ports >32 on Sparx5).
The is_sparx5() function simply checks the target chip type, to
determine if this is a Sparx5 SKU or not.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The DSM (Disassembler) calendar grants each port access to internal
busses. The configuration of the calendar is done differently on Sparx5
and lan969x. Therefore ops out the function that calculates the
calendar.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The PTP registers are located in two different register targets on
Sparx5 and lan969x. We can't handle this with the register macros, so
ops out the handler.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Port muxing is configured based on the supported port modes. As these
modes can differ on Sparx5 and lan969x we ops out the port muxing
function.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add getters for getting values in arrays: sdlb_groups and
sparx5_hsch_max_group_rate and ops out the getters, as these arrays will
differ on lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The chip port device index and mode bit can be obtained using the port
number. However the mapping of port number to chip device index and
mode bit differs on Sparx5 and lan969x. Therefore ops out the function.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add new struct sparx5_ops, containing functions that needs to be
different as the implementation differs on Sparx5 and lan969x. Initially
we add functions for checking the port type (2g5, 5g, 10g or 25g) based
on the port number. Update the code to use the ops instead of the
platform specific functions.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now that we have indentified all the chip constants, update the use of
them where a symbol is not defined for the constant.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now that we have indentified all the chip constants, update the use of
them where a symbol is already defined for the constant.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add new struct sparx5_consts, containing all the chip constants that are
known to be different for Sparx5 and lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The *sparx5 context pointer is required in functions that need to access
platform constants (which will be added in a subsequent patch). Prepare
for this by updating the prototype and use of such functions.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x, we need to define the SPX5_PORTS_ALL macro
as 70 (65 front ports + 5 internal ports). This is required as the
SPX5_PORT_CPU will be redefined as an offset to the number of front
ports, in a subsequent patch.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The register macros are used to read and write to the switch registers.
The registers are largely the same on Sparx5 and lan969x, however in some
cases they differ. The differences can be one or more of the following:
target size, register address, register count, group address, group
count, group size, field position, field size.
In order to handle these differences, we introduce a new indirection
layer, that defines and maps them to corresponding values, based on the
platform. As the register macro arguments can now be non-constants, we
also add non-constant variants of FIELD_GET and FIELD_PREP.
Since the indirection layer contributes to longer macros, we have
changed the formatting of them slightly, to adhere to a 80 character
limit, and added a comment if a macro is platform-specific.
With these additions, we can reuse all the existing macros for
lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x, add support for private match data. This
will be needed for abstracting away differences between the Sparx5 and
lan969x platforms. We initially add values for: iomap, iomap size and
ioranges. Update the use of these throughout.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Usage of devm_alloc_etherdev_mqs() conflicts with
am65_cpsw_nuss_cleanup_ndev() as the same struct net_device instances
get unregistered twice. Switch to alloc_etherdev_mqs() and make sure
am65_cpsw_nuss_cleanup_ndev() unregisters and frees those net_device
instances properly.
With this, it is finally possible to rmmod the driver without oopsing
the kernel.
Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Roger Quadros <roger@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In am65_cpsw_nuss_remove(), move the call to am65_cpsw_unregister_devlink()
after am65_cpsw_nuss_cleanup_ndev() to avoid triggering the
WARN_ON(devlink_port->type != DEVLINK_PORT_TYPE_NOTSET) in
devl_port_unregister(). Makes it coherent with usage in
m65_cpsw_nuss_register_ndevs()'s cleanup path.
Fixes: 58356eb31d60 ("net: ti: am65-cpsw-nuss: Add devlink support")
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
To prepare for constifying the following old driver core API:
struct device *device_find_child(struct device *dev, void *data,
int (*match)(struct device *dev, void *data));
to new:
struct device *device_find_child(struct device *dev, const void *data,
int (*match)(struct device *dev, const void *data));
The new API does not allow its match function (*match)() to modify
caller's match data @*data, but emac_sgmii_acpi_match(), as the old
API's match function, indeed modifies relevant match data, so it is
not suitable for the new API any more, solved by implementing the same
finding sgmii_ops function by correcting the function and using it
as parameter of device_for_each_child() instead of device_find_child().
By the way, this commit does not change any existing logic.
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://patch.msgid.link/20241003-qcom_emac_fix-v6-1-0658e3792ca4@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Move the tx cpu dma ring index update out of transmit loop of
airoha_dev_xmit routine in order to not start transmitting the packet
before it is fully DMA mapped (e.g. fragmented skbs).
Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Reported-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241004-airoha-eth-7581-mapping-fix-v1-1-8e4279ab1812@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
adin1110_read_fifo()
If 'frame_size' is too small or if 'round_len' is an error code, it is
likely that an error code should be returned to the caller.
Actually, 'ret' is likely to be 0, so if one of these sanity checks fails,
'success' is returned.
Return -EINVAL instead.
Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/8ff73b40f50d8fa994a454911b66adebce8da266.1727981562.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit b514c47ebf41a6536551ed28a05758036e6eca7c.
The commit describes that we don't have to sync the page when
recycling, and it tries to optimize that case. But we do need
to sync after allocation. Recycling side should be changed to
pass the right sync size instead.
Fixes: b514c47ebf41 ("net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/20241004070846.2502e9ea@kernel.org
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Furong Xu <0x1207@gmail.com>
Link: https://patch.msgid.link/20241004142115.910876-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'struct mlxsw_afk_element_inst' are not modified in these drivers.
Constifying these structures moves some data to a read-only section, so
increases overall security.
Update a few functions and struct mlxsw_afk_block accordingly.
On a x86_64, with allmodconfig, as an example:
Before:
======
text data bss dec hex filename
4278 4032 0 8310 2076 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o
After:
=====
text data bss dec hex filename
7934 352 0 8286 205e drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/8ccfc7bfb2365dcee5b03c81ebe061a927d6da2e.1727541677.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While this does add overhead to the fast path, it should be minimal
as the cacheline should already be held for write from updating the
queue's rx_packets stat.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use our existing TSO stats, which count enqueued TSO TXes.
Users may expect them to count completions, as tx-packets and
tx-bytes do; however, these are the counters we have, and the
qstats documentation doesn't actually specify.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we handle a TX completion for an XDP packet, it is not counted
in the per-TXQ netdev stats. Record it in new internal counters,
and include those in the device-wide total in efx_get_base_stats().
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The previous patch changed when we increment the RX queue's rx_packets
counter, to match the semantics of netdev per-queue stats. The
differences between the old and new counts are scatter errors (which
produce a WARN_ON) and this counter, which is incremented by
efx_rx_packet__check_len() when an RX packet (which was placed in a
single buffer by SG, i.e. n_frags == 1) has a length (from the RX
event) which is too long to fit in the RX buffer. If this occurs, we
drop the packet and fire a ratelimited netif_err().
The counter previously was not reported anywhere; add it to ethtool -S
output to ensure users still have this information.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Just RX and TX packet counts and TX bytes for now. We do not
have per-queue RX byte counts, which causes us to fail
stats.pkt_byte_sum selftest with "Drivers should always report
basic keys" error.
Per-queue counts are since the last time the queue was inited
(typically by efx_start_datapath(), on ifup or reconfiguration);
device-wide total (efx_get_base_stats()) is since driver probe.
This is not the same lifetime as rtnl_link_stats64, which uses
firmware stats which count since FW (re)booted; this can cause a
"Qstats are lower" or "RTNL stats are lower" failure in
stats.pkt_byte_sum selftest.
Move the increment of rx_queue->rx_packets to match the semantics
specified for netdev per-queue stats, i.e. just before handing
the packet to XDP (if present) or the netstack (through GRO).
This will affect the existing ethtool -S output which also
reports these counters.
XDP TX packets are not yet counted into base_stats.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The n_rx_tobe_disc and n_rx_mcast_mismatch counters are a legacy
from farch, and are never written in EF10 or EF100 code. Remove
them from the struct and from ethtool -S output, saving a bit of
memory and avoiding user confusion.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/net/ethernet to use
.remove(), with the eventual goal to drop struct
platform_driver::remove_new(). As .remove() and .remove_new() have the
same prototypes, conversion is done by just changing the structure
member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/18f7c585a1a8a8ac8b03a2fca7de19bd5c52ac2b.1727949050.git.u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-10-01 (ice)
This series contains updates to ice driver only.
Karol cleans up current PTP GPIO pin handling, fixes minor bugs,
refactors implementation for all products, introduces SDP (Software
Definable Pins) for E825C and implements reading SDP section from NVM
for E810 products.
Sergey replaces multiple aux buses and devices used in the PTP support
code with struct ice_adapter holding the necessary shared data.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Drop auxbus use for PTP to finalize ice_adapter move
ice: Use ice_adapter for PTP shared data instead of auxdev
ice: Initial support for E825C hardware in ice_adapter
ice: Add ice_get_ctrl_ptp() wrapper to simplify the code
ice: Introduce ice_get_phy_model() wrapper
ice: Enable 1PPS out from CGU for E825C products
ice: Read SDP section from NVM for pin definitions
ice: Disable shared pin on E810 on setfunc
ice: Cache perout/extts requests and check flags
ice: Align E810T GPIO to other products
ice: Add SDPs support for E825C
ice: Implement ice_ptp_pin_desc
====================
Link: https://patch.msgid.link/20241001201702.3252954-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previously, the TX header requirement for standard frames was ignored.
This requirement is a bitstring sent from the VIOS which maps to the
type of header information needed during TX. If no header information,
is needed then send subcrq direct can be used (which can be more
performant).
This bitstring was previously ignored for standard packets (AKA non LSO,
non CSO) due to the belief that the bitstring was over-cautionary. It
turns out that there are some configurations where the backing device
does need header information for transmission of standard packets. If
the information is not supplied then this causes continuous "Adapter
error" transport events. Therefore, this bitstring should be respected
and observed before considering the use of send subcrq direct.
Fixes: 74839f7a8268 ("ibmvnic: Introduce send sub-crq direct")
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241001163200.1802522-2-nnac123@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It no longer serves any purpose and is identical to mlx5_fc_create upon
which it was originally based of.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241001103709.58127-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
num_counters is only used for deciding whether to grow the bulk query
buffer, which is done once more counters than a small initial threshold
are present. After that, maintaining num_counters serves no purpose.
This commit replaces that with an actual xarray traversal to count the
counters. This appears expensive at first sight, but is only done when
the number of counters is less than the initial threshold (8) and only
once every sampling interval. Once the number of counters goes above the
threshold, the bulk query buffer is grown to max size and the xarray
traversal is never done again.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241001103709.58127-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The mlx5_fc struct has a cache for values queried from hw, which is
cacheline aligned. On x86_64, this results in:
struct mlx5_fc {
u32 id; /* 0 4 */
bool aging; /* 4 1 */
/* XXX 3 bytes hole, try to pack */
struct mlx5_fc_bulk * bulk; /* 8 8 */
/* XXX 48 bytes hole, try to pack */
/* --- cacheline 1 boundary (64 bytes) --- */
struct mlx5_fc_cache cache __attribute__((__aligned__(64)));
/* 64 24 */
u64 lastpackets; /* 88 8 */
u64 lastbytes; /* 96 8 */
/* size: 128, cachelines: 2, members: 6 */
/* sum members: 53, holes: 2, sum holes: 51 */
/* padding: 24 */
/* forced aligns: 1, forced holes: 1, sum forced holes: 48 */
} __attribute__((__aligned__(64)));
(output from pahole).
...So a 48+24=72 byte waste. As far as I can determine, this serves no
purpose other than maybe making sure that the values in the cache do not
span two cachelines in the worst case scenario, but that's not a valid
enough reason to waste 72 bytes per counter, especially since this code
is not performance-critical. There could potentially be hundreds of
thousands of counters (e.g. for connection-tracking), so this quickly
adds up to multiple MB wasted.
This commit removes the alignment, resulting in:
struct mlx5_fc {
[...]
/* size: 56, cachelines: 1, members: 6 */
/* sum members: 53, holes: 1, sum holes: 3 */
/* last cacheline: 56 bytes */
};
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241001103709.58127-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|