Age | Commit message (Collapse) | Author |
|
Add the helper function to look up the ntuple filter from the
hash index and use it in bnxt_rx_flow_steer(). The helper function
will also be used by user defined ntuple filters in the next
patches.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For ntuple filters added by aRFS, the Toeplitz hash calculated by our
NIC is available and is used to store the ntuple filter for quick
retrieval. In the next patches, user defined ntuple filter support
will be added and we need to calculate the same hash for these
filters. The same hash function needs to be used so we can detect
duplicates.
Add the function bnxt_toeplitz() to calculate the Toeplitz hash for
user defined ntuple filters. bnxt_toeplitz() uses the same Toeplitz
key and the same key length as the NIC.
bnxt_get_ntp_filter_idx() is added to return the hash index. For
aRFS, the hash comes from the NIC. For user defined ntuple, we call
bnxt_toeplitz() to calculate the hash index.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Refactor the L2 filter alloc/free logic so that these filters can be
added/deleted by the user.
The bp->ntp_fltr_bmap allocated size is also increased to allow enough
IDs for L2 filters.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With the new bnxt_l2_filter structure, we can now re-structure the
bnxt_ntuple_filter structure to point to the bnxt_l2_filter structure.
We eliminate the L2 ether address info from the ntuple filter structure
as we can get the information from the L2 filter structure. Note that
the source L2 MAC address is no longer used.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current driver only has an array of 4 additional L2 unicast
addresses to support the netdev uc address list. Generalize and
expand this infrastructure with an L2 address hash table so we can
support an expanded list of unicast addresses (for bridges,
macvlans, OVS, etc). The L2 hash table infrastructure will also
allow more generalized n-tuple filter support.
This patch creates the bnxt_l2_filter structure and the hash table.
This L2 filter structure has the same bnxt_filter_base structure
as used in the bnxt_ntuple_filter structure.
All currently supported L2 filters will now have an entry in this
new table.
Note that L2 filters may be created for the VF. VF filters should
not be freed when the PF goes down. Add some logic in
bnxt_free_l2_filters() to allow keeping the VF filters or to free
everything during rmmod.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is in preparation to support user defined L2 (ether) filters,
which will have many similarities with ntuple filters. Refactor
bnxt_ntuple_filter structure to have a bnxt_filter_base structure
that can be re-used by the L2 filters.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some r8168 NICs stop working upon system resume:
[ 688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000).
[ 688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down
...
[ 691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000)
Not sure if it's related, but those NICs have a BMC device at function
0:
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a)
Trial and error shows that increase the loop wait on
rtl_ep_ocp_read_cond to 30 can eliminate the issue, so let
rtl8168ep_driver_start() to wait a bit longer.
Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Under heavy traffic, the BlueField Gigabit interface can
become unresponsive. This is due to a possible race condition
in the mlxbf_gige_rx_packet function, where the function exits
with producer and consumer indices equal but there are remaining
packet(s) to be processed. In order to prevent this situation,
read receive consumer index *before* the HW replenish so that
the mlxbf_gige_rx_packet function returns an accurate return
value even if a packet is received into just-replenished buffer
prior to exiting this routine. If the just-replenished buffer
is received and occupies the last RX ring entry, the interface
would not recover and instead would encounter RX packet drops
related to internal buffer shortages since the driver RX logic
is not being triggered to drain the RX ring. This patch will
address and prevent this "ring full" condition.
Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-12-20
mlx5 Socket direct support and management PF profile.
Tariq Says:
===========
Support Socket-Direct multi-dev netdev
This series adds support for combining multiple devices (PFs) of the
same port under one netdev instance. Passing traffic through different
devices belonging to different NUMA sockets saves cross-numa traffic and
allows apps running on the same netdev from different numas to still
feel a sense of proximity to the device and achieve improved
performance.
We achieve this by grouping PFs together, and creating the netdev only
once all group members are probed. Symmetrically, we destroy the netdev
once any of the PFs is removed.
The channels are distributed between all devices, a proper configuration
would utilize the correct close numa when working on a certain app/cpu.
We pick one device to be a primary (leader), and it fills a special
role. The other devices (secondaries) are disconnected from the network
in the chip level (set to silent mode). All RX/TX traffic is steered
through the primary to/from the secondaries.
Currently, we limit the support to PFs only, and up to two devices
(sockets).
===========
Armen Says:
===========
Management PF support and module integration
This patch rolls out comprehensive support for the Management Physical
Function (MGMT PF) within the mlx5 driver. It involves updating the
mlx5 interface header to introduce necessary definitions for MGMT PF
and adding a new management PF netdev profile, which will allow the host
side to communicate with the embedded linux on Blue-field devices.
===========
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the driver accepts VLAN EtherType steering rules regardless of
the configured mask. And things might fail silently or with confusing error
messages to the user. The VLAN EtherType can only be matched by full
mask. Therefore, add a check for that.
For instance the following rule is invalid, but the driver accepts it and
ignores the user specified mask:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 \
| m 0x00ff action 0
|Added rule with ID 63
|root@host:~# ethtool --show-ntuple enp3s0
|4 RX rings available
|Total 1 rules
|
|Filter: 63
| Flow Type: Raw Ethernet
| Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Ethertype: 0x0 mask: 0xFFFF
| VLAN EtherType: 0x8100 mask: 0x0
| VLAN: 0x0 mask: 0xffff
| User-defined: 0x0 mask: 0xffffffffffffffff
| Action: Direct to queue 0
After:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 \
| m 0x00ff action 0
|rmgr: Cannot insert RX class rule: Operation not supported
Fixes: 2b477d057e33 ("igc: Integrate flex filter into ethtool ops")
Suggested-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently the driver accepts VLAN TCI steering rules regardless of the
configured mask. And things might fail silently or with confusing error
messages to the user.
There are two ways to handle the VLAN TCI mask:
1. Match on the PCP field using a VLAN prio filter
2. Match on complete TCI field using a flex filter
Therefore, add checks and code for that.
For instance the following rule is invalid and will be converted into a
VLAN prio rule which is not correct:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan 0x0001 m 0xf000 \
| action 1
|Added rule with ID 61
|root@host:~# ethtool --show-ntuple enp3s0
|4 RX rings available
|Total 1 rules
|
|Filter: 61
| Flow Type: Raw Ethernet
| Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Ethertype: 0x0 mask: 0xFFFF
| VLAN EtherType: 0x0 mask: 0xffff
| VLAN: 0x1 mask: 0x1fff
| User-defined: 0x0 mask: 0xffffffffffffffff
| Action: Direct to queue 1
After:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan 0x0001 m 0xf000 \
| action 1
|rmgr: Cannot insert RX class rule: Operation not supported
Fixes: 7991487ecb2d ("igc: Allow for Flex Filters to be installed")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently the driver allows to configure matching by VLAN EtherType.
However, the retrieval function does not report it back to the user. Add
it.
Before:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 action 0
|Added rule with ID 63
|root@host:~# ethtool --show-ntuple enp3s0
|4 RX rings available
|Total 1 rules
|
|Filter: 63
| Flow Type: Raw Ethernet
| Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Ethertype: 0x0 mask: 0xFFFF
| Action: Direct to queue 0
After:
|root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 action 0
|Added rule with ID 63
|root@host:~# ethtool --show-ntuple enp3s0
|4 RX rings available
|Total 1 rules
|
|Filter: 63
| Flow Type: Raw Ethernet
| Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
| Ethertype: 0x0 mask: 0xFFFF
| VLAN EtherType: 0x8100 mask: 0x0
| VLAN: 0x0 mask: 0xffff
| User-defined: 0x0 mask: 0xffffffffffffffff
| Action: Direct to queue 0
Fixes: 2b477d057e33 ("igc: Integrate flex filter into ethtool ops")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Prevent VF from configuring filters with unsupported actions or use
REDIRECT action with invalid tc number. Current checks could cause
out of bounds access on PF side.
Fixes: e284fc280473 ("i40e: Add and delete cloud filter")
Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Stop dividing the phase_offset value received from firmware. This fault
is present since the initial implementation.
The phase_offset value received from firmware is in 0.01ps resolution.
Dpll subsystem is using the value in 0.001ps, raw value is adjusted
before providing it to the user.
The user can observe the value of phase offset with response to
`pin-get` netlink message of dpll subsystem for an active pin:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
--do pin-get --json '{"id":2}'
Where example of correct response would be:
{'board-label': 'C827_0-RCLKA',
'capabilities': 6,
'clock-id': 4658613174691613800,
'frequency': 1953125,
'id': 2,
'module-name': 'ice',
'parent-device': [{'direction': 'input',
'parent-id': 6,
'phase-offset': -216839550,
'prio': 9,
'state': 'connected'},
{'direction': 'input',
'parent-id': 7,
'phase-offset': -42930,
'prio': 8,
'state': 'connected'}],
'phase-adjust': 0,
'phase-adjust-max': 16723,
'phase-adjust-min': -16723,
'type': 'mux'}
Provided phase-offset value (-42930) shall be divided by the user with
DPLL_PHASE_OFFSET_DIVIDER to get actual value of -42.930 ps.
Before the fix, the response was not correct:
{'board-label': 'C827_0-RCLKA',
'capabilities': 6,
'clock-id': 4658613174691613800,
'frequency': 1953125,
'id': 2,
'module-name': 'ice',
'parent-device': [{'direction': 'input',
'parent-id': 6,
'phase-offset': -216839,
'prio': 9,
'state': 'connected'},
{'direction': 'input',
'parent-id': 7,
'phase-offset': -42,
'prio': 8,
'state': 'connected'}],
'phase-adjust': 0,
'phase-adjust-max': 16723,
'phase-adjust-min': -16723,
'type': 'mux'}
Where phase-offset value (-42), after division
(DPLL_PHASE_OFFSET_DIVIDER) would be: -0.042 ps.
Fixes: 8a3a565ff210 ("ice: add admin commands to access cgu configuration")
Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Disabling netdev with ethtool private flag "link-down-on-close" enabled
can cause NULL pointer dereference bug. Shut down VSI regardless of
"link-down-on-close" state.
Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The driver should not report an error message when for a medialess port
the link_down_on_close flag is enabled and the physical link cannot be
set down.
Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Katarzyna Wieczerzycka <katarzyna.wieczerzycka@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Couple of structures was not marked as __packed. This patch
fixes the same and mark them as __packed.
Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Size of the virtchnl2_rss_key struct should be 7 bytes but the
compiler introduces a padding byte for the structure alignment.
This results in idpf sending an additional byte of memory to the device
control plane than the expected buffer size. As the control plane
enforces virtchnl message size checks to validate the message,
set RSS key message fails resulting in the driver load failure.
Remove implicit compiler padding by using "__packed" structure
attribute for the virtchnl2_rss_key struct.
Also there is no need to use __DECLARE_FLEX_ARRAY macro for the
'key_flex' struct field. So drop it.
Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Scott Register <scott.register@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
idpf_ring::skb serves only for keeping an incomplete frame between
several NAPI Rx polling cycles, as one cycle may end up before
processing the end of packet descriptor. The pointer is taken from
the ring onto the stack before entering the loop and gets written
there after the loop exits. When inside the loop, only the onstack
pointer is used.
For some reason, the logics is broken in the singleq mode, where the
pointer is taken from the ring each iteration. This means that if a
frame got fragmented into several descriptors, each fragment will have
its own skb, but only the last one will be passed up the stack
(containing garbage), leaving the rest leaked.
Then, on ifdown, rxq::skb is being freed only in the splitq mode, while
it can point to a valid skb in singleq as well. This can lead to a yet
another skb leak.
Just don't touch the ring skb field inside the polling loop, letting
the onstack skb pointer work as expected: build a new skb if it's the
first frame descriptor and attach a frag otherwise. On ifdown, free
rxq::skb unconditionally if the pointer is non-NULL.
Fixes: a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Scott Register <scott.register@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In case a DPAA2 switch interface joins a bridge, the FDB used on the
port will be changed to the one associated with the bridge. What this
means exactly is that any VLAN installed on the port will need to be
removed and then installed back so that it points to the new FDB.
Once this is done, the previous FDB will become unused (no VLAN to
point to it). Even though no traffic will reach this FDB, it's best to
just cleanup the state of the FDB by zeroing its egress flood domain.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Two different DPAA2 switch ports from two different DPSW instances
cannot be under the same bridge. Instead of checking for this
unsupported configuration in the CHANGEUPPER event, check it as early as
possible in the PRECHANGEUPPER one.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create separate functions, dpaa2_switch_port_prechangeupper and
dpaa2_switch_port_changeupper, to be called directly when a DPSW port
changes its upper device.
This way we are not open-coding everything in the main event callback
and we can easily extent, for example, with bond offload.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The DPSW object has multiple event sources multiplexed over the same
IRQ. The driver has the capability to configure only some of these
events to trigger the IRQ.
The dpsw_get_irq_status() can clear events automatically based on the
value stored in the 'status' variable passed to it. We don't want that
to happen because we could get into a situation when we are clearing
more events than we actually handled.
Just resort to manually clearing the events that we handled. Also, since
status is not used on the out path we remove its initialization to zero.
This change does not have a user-visible effect because the dpaa2-switch
driver enables and handles all the DPSW events which exist at the
moment.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support")
added support for MAC endpoints in the dpaa2-switch driver but omitted
to add the ENDPOINT_CHANGED irq to the list of interrupt sources. Fix
this by extending the list of events which can raise an interrupt by
extending the mask passed to the dpsw_set_irq_mask() firmware API.
There is no user visible impact even without this patch since whenever a
switch interface is connected/disconnected from an endpoint both events
are set (LINK_CHANGED and ENDPOINT_CHANGED) and, luckily, the
LINK_CHANGED event could actually raise the interrupt and thus get the
MAC/PHY SW configuration started.
Even with this, it's better to just not rely on undocumented firmware
behavior which can change.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Print a netdev error when we hit a case in which a specific VLAN is
already configured on the port. While at it, change the already existing
netdev_warn into an _err for consistency purposes.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no restriction around the change of the MAC address on the
switch ports, thus declare the interface netdevs IFF_LIVE_ADDR_CHANGE
capable.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no point in updating the MAC address of a switch interface each
time the link state changes, this only needs to happen in case the
endpoint changes (the switch interface is [dis]connected from/to a MAC).
Just move the call to dpaa2_switch_port_set_mac_addr() under
DPSW_IRQ_EVENT_ENDPOINT_CHANGED.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add SW IRQ coalescing based on hrtimers for TX and RX data path which
can be enabled by ethtool commands:
- RX coalescing
ethtool -C eth1 rx-usecs 50
- TX coalescing can be enabled per TX queue
- by default enables coalesing for TX0
ethtool -C eth1 tx-usecs 50
- configure TX0
ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100
- configure TX1
ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100
- configure TX0 and TX1
ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce tx-usecs 100
show configuration for TX0 and TX1:
ethtool -Q eth0 queue_mask 3 --show-coalesce
Comparing to gro_flush_timeout and napi_defer_hard_irqs, this patch
allows to enable IRQ coalesing for RX path separately.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add driver support for viewing / changing the MAC Merge sublayer
parameters and seeing the verification state machine's current state
via ethtool.
As hardware does not support interrupt notification for verification
events we resort to polling on link up. On link up we try a couple of
times for verification success and if unsuccessful then give up.
The Frame Preemption feature is described in the Technical Reference
Manual [1] in section:
12.3.1.4.6.7 Intersperced Express Traffic (IET – P802.3br/D2.0)
Due to Silicon Errata i2208 [2] we set limit min IET fragment size to
124 (excluding 4 bytes mCRC).
[1] AM62x TRM - https://www.ti.com/lit/ug/spruiv7a/spruiv7a.pdf
[2] AM62x Silicon Errata - https://www.ti.com/lit/er/sprz487c/sprz487c.pdf
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
not only setting up pri:tc mapping, but also configuring TX shapers
(rate-limiting) on external port FIFOs.
The MQPRIO Qdisc offload is expected to work with or without VLAN/priority
tagged packets.
The CPSW external Port FIFO has 8 Priority queues. The rate-limit can be
set for each of these priority queues. Which Priority queue a packet is
assigned to depends on PN_REG_TX_PRI_MAP register which maps header
priority to switch priority.
The header priority of a packet is assigned via the RX_PRI_MAP_REG which
maps packet priority to header priority.
The packet priority is either the VLAN priority (for VLAN tagged packets)
or the thread/channel offset.
For simplicity, we assign the same priority queue to all queues of a
Traffic Class so it can be rate-limited correctly.
Configuration example:
ethtool -L eth1 tx 5
ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off
tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 hw 1 mode channel \
shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit
tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1
ip link add link eth1 name eth1.100 type vlan id 100
ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
In the above example two ports share the same TX CPPI queue 0 for low
priority traffic. 3 traffic classes are defined for eth1 and mapped to:
TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move register definitions to header file. No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move this code around to avoid forward declaration.
No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Handle offloading commands using switch-case in
am65_cpsw_setup_taprio().
Move checks to am65_cpsw_taprio_replace().
Use NL_SET_ERR_MSG_MOD for error messages.
Change error message from "Failed to set cycle time extension"
to "cycle time extension not supported"
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We will use this Kconfig option to not only enable TAS/EST offload
but also other QoS features like Multiqueue priority descriptors
and MAC-Merge/Frame Preemption. TI_AM65_CPSW_QOS seems a more
appropriate Kconfig option name than TI_AM65_CPSW_TAS.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Build am65-cpsw-qos only if CONFIG_TI_AM65_CPSW_TAS is enabled.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There was a memory leak during error handling in function
npc_mcam_rsrcs_init().
Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
Suggested-by: Simon Horman <horms@kernel.org>
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
intel: use bitfield operations
Jesse Brandeburg says:
After repeatedly getting review comments on new patches, and sporadic
patches to fix parts of our drivers, we should just convert the Intel code
to use FIELD_PREP() and FIELD_GET(). It's then "common" in the code and
hopefully future change-sets will see the context and do-the-right-thing.
This conversion was done with a coccinelle script which is mentioned in the
commit messages. Generally there were only a couple conversions that were
"undone" after the automatic changes because they tried to convert a
non-contiguous mask.
Patch 1 is required at the beginning of this series to fix a "forever"
issue in the e1000e driver that fails the compilation test after conversion
because the shift / mask was out of range.
The second patch just adds all the new #includes in one go.
The patch titled: "ice: fix pre-shifted bit usage" is needed to allow the
use of the FIELD_* macros and fix up the unexpected "shifts included"
defines found while creating this series.
The rest are the conversion to use FIELD_PREP()/FIELD_GET(), and the
occasional leXX_{get,set,encode}_bits() call, as suggested by Alex.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR.
Adjacent changes:
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
23c93c3b6275 ("bnxt_en: do not map packet buffers twice")
6d1add95536b ("bnxt_en: Modify TX ring indexing logic.")
tools/testing/selftests/net/Makefile
2258b666482d ("selftests: add vlan hw filter tests")
a0bc96c0cd6e ("selftests: net: verify fq per-band packet limit")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-12-18 (ice)
This series contains updates to ice driver only.
Jakes stops clearing of needed aggregator information.
Dave adds a check for LAG device support before initializing the
associated event handler.
Larysa restores accounting of XDP queues in TC configurations.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix PF with enabled XDP going no-carrier after reset
ice: alter feature support check for SRIOV and LAG
ice: stop trashing VF VSI aggregator node ID information
====================
Link: https://lore.kernel.org/r/20231218192708.3397702-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mtk_wed_wo_queue_tx_clean()
In order to avoid a NULL pointer dereference, check entry->buf pointer before running
skb_free_frag in mtk_wed_wo_queue_tx_clean routine.
Fixes: 799684448e3e ("net: ethernet: mtk_wed: introduce wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/3c1262464d215faa8acebfc08869798c81c96f4a.1702827359.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add management PF modules, which introduce support for the structures
needed to create the resources for the MGMT PF to work.
Also, add the necessary calls and functions to establish this
functionality.
Signed-off-by: Armen Ratner <armeng@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
|
|
Have an actual mlx5_sd instance in the core device, and fix the getter
accordingly. This allows SD stuff to flow, the feature becomes supported
only here.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
1) Each TX TLS device offloaded context has its own TIS object. Extra work
is needed to get it working in a SD environment, where a stream can move
between different SQs (belonging to different mdevs).
2) Each RX TLS device offloaded context needs a DEK object from the DEK
pool.
Extra work is needed to get it working in a SD environment, as the DEK
pool currently falsely depends on TX cap, and is on the primary device
only.
Disallow this combination for now.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Each queue counter object counts some events (in hardware) for the RQs
that are attached to it, like events of packet drops due to no receive
WQE (rx_out_of_buffer).
Each RQ can be attached to a queue counter only within the same vhca. To
still cover all RQs with these counters, we create multiple instances,
one per vhca.
The result that's shown to the user is now the sum of all instances.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Implement driver support for the HW feature that allows RX steering of
one device to target other device's RQs.
In SD multi-mdev netdev mode, we set the secondaries into silent mode,
disconnecting them from the network. This feature is then used to steer
traffic from the primary to the secondaries.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Distribute the channels between the different SD-devices to acheive
local numa node performance on multiple numas.
Each channel works against one specific mdev, creating all datapath
queues against it.
We distribute channels to mdevs in a round-robin policy.
Example for 2 mdevs and 6 channels:
+-------+---------+
| ch ix | mdev ix |
+-------+---------+
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 1 |
| 4 | 0 |
| 5 | 1 |
+-------+---------+
This round-robin distribution policy is preferred over another suggested
intuitive distribution, in which we first distribute one half of the
channels to mdev #0 and then the second half to mdev #1.
We prefer round-robin for a reason: it is less influenced by changes in
the number of channels. The mapping between channel index and mdev is
fixed, no matter how many channels the user configures. As the channel
stats are persistent to channels closure, changing the mapping every
single time would turn the accumulative stats less representing of the
channel's history.
Per-channel objects should stop using the primary mdev (priv->mdev)
directly, and instead move to using their own channel's mdev.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Traffic queues will be created on all devices, including the
secondaries. Create the needed core layer resources for them as well.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Integrate the SD library calls into the auxiliary_driver ops in
preparation for creating a single netdev for the multiple devices
belonging to the same SD group.
SD is still disabled at this stage. It is enabled by a downstream patch
when all needed parts are implemented.
The netdev is created only when the SD group, with all its participants,
are ready. It is later destroyed if any of the participating devices
drops.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Print to kernel log when an SD group moves from/to ready state.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Implement the needed SD steering adjustments for the primary and
secondaries.
While the SD multiple devices are used to avoid cross-numa memory, when
it comes to chip level all traffic goes only through the primary device.
The secondaries are forced to silent mode, to guarantee they are not
involved in any unexpected ingress/egress traffic.
In RX, secondary devices will not have steering objects. Traffic will be
steered from the primary device to the RQs of a secondary device using
advanced cross-vhca RX steering capabilities.
In TX, the primary creates a new TX flow table, which is aliased by the
secondaries.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|