summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox
AgeCommit message (Collapse)Author
2022-11-16mlxsw: update adjfine to use adjust_by_scaled_ppmJacob Keller
The mlxsw adjfine implementation in the spectrum_ptp.c file converts scaled_ppm into ppb before updating a cyclecounter multiplier using the standard "base * ppb / 1billion" calculation. This can be re-written to use adjust_by_scaled_ppm, directly using the scaled parts per million and reducing the amount of code required to express this calculation. We still calculate the parts per billion for passing into mlxsw_sp_ptp_phc_adjfreq because this function requires the input to be in parts per billion. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Amit Cohen <amcohen@nvidia.com> Cc: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20221114213701.815132-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-12net/mlx5e: ethtool: get_link_ext_stats for PHY down eventsSaeed Mahameed
Implement ethtool_op get_link_ext_stats for PHY down events Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com>
2022-11-12net/mlx5e: CT, optimize pre_ct table lookupOz Shlomo
The pre_ct table realizes in hardware the act_ct cache logic, bypassing the CT table if the ct state was already set by a previous ct lookup. As such, the pre_ct table will always miss for chain 0 filters. Optimize the pre_ct table lookup for rules installed on chain 0. Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: kTLS, Use a single async context object per a callback bulkTariq Toukan
A single async context object is sufficient to wait for the completions of many callbacks. Switch to using one instance per a bulk of commands. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: kTLS, Remove unnecessary per-callback completionTariq Toukan
Waiting on a completion object for each callback before cleaning up their async contexts is not necessary, as this is already implied in the mlx5_cmd_cleanup_async_ctx() API. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: kTLS, Remove unused work fieldTariq Toukan
Work field in struct mlx5e_async_ctx is not used. Remove it. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: TC, Remove redundant WARN_ON()Roi Dayan
The case where the packet is not offloaded and needs to be restored to slow path and couldn't find expected tunnel information should not dump a call trace to the user. there is a debug call. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: Add error flow when failing update_rxGuy Truzman
Up until now, return value of update_rx was ignored. Therefore, flow continues even if it fails. Add error flow in case of update_rx fails in mlx5e_open_locked, mlx5i_open and mlx5i_pkey_open. Signed-off-by: Guy Truzman <gtruzman@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: Move params kernel log print to probe functionTariq Toukan
Params info print was meant to be printed on load. With time, new calls to mlx5e_init_rq_type_params and mlx5e_build_rq_params were added, mistakenly printing the params once again. Move the print to were it belongs, in mlx5e_probe. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: Support enhanced CQE compressionOfer Levi
CQE compression feature improves performance by reducing PCI bandwidth bottleneck on CQEs write. Enhanced CQE compression introduced in ConnectX-6 and it aims to reduce CPU utilization of SW side packets decompression by eliminating the need to rewrite ownership bit, which is likely to cost a cache-miss, is replaced by validity byte handled solely by HW. Another advantage of the enhanced feature is that session packets are available to SW as soon as a single CQE slot is filled, instead of waiting for session to close, this improves packet latency from NIC to host. Performance: Following are tested scenarios and reults comparing basic and enahnced CQE compression. setup: IXIA 100GbE connected directly to port 0 and port 1 of ConnectX-6 Dx 100GbE dual port. Case #1 RX only, single flow goes to single queue: IRQ rate reduced by ~ 30%, CPU utilization improved by 2%. Case #2 IP forwarding from port 1 to port 0 single flow goes to single queue: Avg latency improved from 60us to 21us, frame loss improved from 0.5% to 0.0%. Case #3 IP forwarding from port 1 to port 0 Max Throughput IXIA sends 100%, 8192 UDP flows, goes to 24 queues: Enhanced is equal or slightly better than basic. Testing the basic compression feature with this patch shows there is no perfrormance degradation of the basic compression feature. Signed-off-by: Ofer Levi <oferle@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: Use clamp operation instead of open coding itGal Pressman
Replace the min/max operations with a single clamp. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5e: remove unused list in arfsAnisse Astier
This is never used, and probably something that was intended to be used before per-protocol hash tables were chosen instead. Signed-off-by: Anisse Astier <anisse@astier.eu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5: Expose vhca_id to debugfsEli Cohen
hca_id is an identifier of an mlx5_core instance within the hardware. This identifier may be required for troubleshooting. Expose it to debugfs. Example: $ cat /sys/kernel/debug/mlx5/mlx5_core.sf.2/vhca_id 0x12 Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5: Unregister traps on driver unload flowMoshe Shemesh
Before this patch, devlink traps are registered only on full driver probe and unregistered on driver removal. As devlink traps are not usable once driver functionality is unloaded, it should be unrgeistered also on flows that unload the driver and then registered when loaded back, e.g. devlink reload flow. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5: Fix spelling mistake "destoy" -> "destroy"Colin Ian King
There is a spelling mistake in an error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-12net/mlx5: Bridge, Use debug instead of warn if entry doesn't existsRoi Dayan
There is no need for the warn if entry already removed. Use debug print like in the update flow. Also update the messages so user can identify if the it's from the update flow or remove flow. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/can/pch_can.c ae64438be192 ("can: dev: fix skb drop check") 1dd1b521be85 ("can: remove obsolete PCH CAN driver") https://lore.kernel.org/all/20221110102509.1f7d63cc@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum_switchdev: Add locked bridge port supportIdo Schimmel
Add locked bridge port support by reacting to changes in the 'BR_PORT_LOCKED' flag. When set, enable security checks on the local port via the previously added SPFSR register. When security checks are enabled, an incoming packet will trigger an FDB lookup with the packet's source MAC and the FID it was classified to. If an FDB entry was not found or was found to be pointing to a different port, the packet will be dropped. Such packets increment the "discard_ingress_general" ethtool counter. For added visibility, user space can trap such packets to the CPU by enabling the "locked_port" trap. Example: # devlink trap set pci/0000:06:00.0 trap locked_port action trap Unlike other configurations done via bridge port flags (e.g., learning, flooding), security checks are enabled in the device on a per-port basis and not on a per-{port, VLAN} basis. As such, scenarios where user space can configure different locking settings for different VLANs configured on a port need to be vetoed. To that end, veto the following scenarios: 1. Locking is set on a bridge port that is a VLAN upper 2. Locking is set on a bridge port that has VLAN uppers 3. VLAN upper is configured on a locked bridge port Examples: # bridge link set dev swp1.10 locked on Error: mlxsw_spectrum: Locked flag cannot be set on a VLAN upper. # ip link add link swp1 name swp1.10 type vlan id 10 # bridge link set dev swp1 locked on Error: mlxsw_spectrum: Locked flag cannot be set on a bridge port that has VLAN uppers. # bridge link set dev swp1 locked on # ip link add link swp1 name swp1.10 type vlan id 10 Error: mlxsw_spectrum: VLAN uppers are not supported on a locked port. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum_switchdev: Use extack in bridge port flag validationIdo Schimmel
Propagate extack to mlxsw_sp_port_attr_br_pre_flags_set() in order to communicate error messages related to bridge port flag validation. Example: # bridge link set dev swp1 locked on Error: mlxsw_spectrum: Unsupported bridge port flag. More error messages will be added in subsequent patches. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum_switchdev: Add support for locked FDB notificationsIdo Schimmel
In Spectrum, learning happens in parallel to the security checks. Therefore, regardless of the result of the security checks, a learning notification will be generated by the device and polled later on by the driver. Currently, the driver reacts to learning notifications by programming corresponding FDB entries to the device. When a port is locked (i.e., has security checks enabled), this can no longer happen, as otherwise any host will blindly gain authorization. Instead, notify the learned entry as a locked entry to the bridge driver that will in turn notify it to user space, in case MAB is enabled. User space can then decide to authorize the host by clearing the "locked" flag, which will cause the entry to be programmed to the device. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum_switchdev: Prepare for locked FDB notificationsIdo Schimmel
Subsequent patches will need to report locked FDB entries to the bridge driver. Prepare for that by adding a 'locked' argument to mlxsw_sp_fdb_call_notifiers() according to which the 'locked' bit is set in the FDB notification info. For now, always pass 'false'. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum: Add an API to configure security checksIdo Schimmel
Add an API to enable or disable security checks on a local port. It will be used by subsequent patches when the 'BR_PORT_LOCKED' flag is toggled. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: reg: Add Switch Port FDB Security RegisterIdo Schimmel
Add the Switch Port FDB Security Register (SPFSR) that allows enabling and disabling security checks on a given local port. In Linux terms, it allows locking / unlocking a port. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09mlxsw: spectrum_trap: Register 802.1X packet traps with devlinkIdo Schimmel
Register the previously added packet traps with devlink. This allows user space to tune their policers and in the case of the locked port trap, user space can set its action to "trap" in order to gain visibility into packets that were discarded by the device due to the locked port check failure. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-09net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actionsRoi Dayan
esw_attr is only allocated if namespace is fdb. BUG: KASAN: slab-out-of-bounds in parse_tc_actions+0xdc6/0x10e0 [mlx5_core] Write of size 4 at addr ffff88815f185b04 by task tc/2135 CPU: 5 PID: 2135 Comm: tc Not tainted 6.1.0-rc2+ #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d print_report+0x170/0x471 ? parse_tc_actions+0xdc6/0x10e0 [mlx5_core] kasan_report+0xbc/0xf0 ? parse_tc_actions+0xdc6/0x10e0 [mlx5_core] parse_tc_actions+0xdc6/0x10e0 [mlx5_core] Fixes: 94d651739e17 ("net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5e: E-Switch, Fix comparing termination table instanceRoi Dayan
The pkt_reformat pointer being saved under flow_act and not dest attribute in the termination table instance. Fix the comparison pointers. Also fix returning success if one pkt_reformat pointer is null and the other is not. Fixes: 249ccc3c95bd ("net/mlx5e: Add support for offloading traffic from uplink to uplink") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5e: TC, Fix wrong rejection of packet-per-second policingJianbo Liu
In the bellow commit, we added support for PPS policing without removing the check which block offload of such cases. Fix it by removing this check. Fixes: a8d52b024d6d ("net/mlx5e: TC, Support offloading police action") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5e: Fix tc acts array not to be dependent on enum orderRoi Dayan
The tc acts array should not be dependent on kernel internal flow action id enum. Fix the array initialization. Fixes: fad547906980 ("net/mlx5e: Add tc action infrastructure") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5e: Fix usage of DMA sync APIMaxim Mikityanskiy
DMA sync functions should use the same direction that was used by DMA mapping. Use DMA_BIDIRECTIONAL for XDP_TX from regular RQ, which reuses the same mapping that was used for RX, and DMA_TO_DEVICE for XDP_TX from XSK RQ and XDP_REDIRECT, which establish a new mapping in this direction. On the RX side, use the same direction that was used when setting up the mapping (DMA_BIDIRECTIONAL for XDP, DMA_FROM_DEVICE otherwise). Also don't skip sync for device when establishing a DMA_FROM_DEVICE mapping for RX, as some architectures (ARM) may require invalidating caches before the device can use the mapping. It doesn't break the bugfix made in commit 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes"), since the bug happened on unmap. Fixes: 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes") Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5e: Add missing sanity checks for max TX WQE sizeMaxim Mikityanskiy
The commit cited below started using the firmware capability for the maximum TX WQE size. This commit adds an important check to verify that the driver doesn't attempt to exceed this capability, and also restores another check mistakenly removed in the cited commit (a WQE must not exceed the page size). Fixes: c27bd1718c06 ("net/mlx5e: Read max WQEBBs on the SQ from firmware") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5: fw_reset: Don't try to load device in case PCI isn't workingShay Drory
In case PCI reads fail after unload, there is no use in trying to load the device. Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5: E-switch, Set to legacy mode if failed to change switchdev modeChris Mi
No need to rollback to the other mode because probably will fail again. Just set to legacy mode and clear fdb table created flag. So that fdb table will not be cleared again. Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5: Allow async trigger completion execution on single CPU systemsRoy Novich
For a single CPU system, the kernel thread executing mlx5_cmd_flush() never releases the CPU but calls down_trylock(&cmd→sem) in a busy loop. On a single processor system, this leads to a deadlock as the kernel thread which executes mlx5_cmd_invoke() never gets scheduled. Fix this, by adding the cond_resched() call to the loop, allow the command completion kernel thread to execute. Fixes: 8e715cd613a1 ("net/mlx5: Set command entry semaphore up once got index free") Signed-off-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-09net/mlx5: Bridge, verify LAG state when adding bond to bridgeVlad Buslov
Mlx5 LAG is initialized asynchronously on a workqueue which means that for a brief moment after setting mlx5 UL representors as lower devices of a bond netdevice the LAG itself is not fully initialized in the driver. When adding such bond device to a bridge mlx5 bridge code will not consider it as offload-capable, skip creating necessary bookkeeping and fail any further bridge offload-related commands with it (setting VLANs, offloading FDBs, etc.). In order to make the error explicit during bridge initialization stage implement the code that detects such condition during NETDEV_PRECHANGEUPPER event and returns an error. Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-11-03net: remove unused ndo_get_devlink_portJiri Pirko
Remove ndo_get_devlink_port which is no longer used alongside with the implementations in drivers. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: remove netdev arg from devlink_port_type_eth_set()Jiri Pirko
Since devlink_port_type_eth_set() should no longer be called by any driver with netdev pointer as it should rather use SET_NETDEV_DEVLINK_PORT, remove the netdev arg. Add a warn to type_clear() Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_portJiri Pirko
Benefit from the previously implemented tracking of netdev events in devlink code and instead of calling devlink_port_type_eth_set() and devlink_port_type_clear() to set devlink port type and link to related netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port pointer to netdevice which is about to be registered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31ptp: mlx5: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The mlx5 implementation of .adjfreq is implemented in terms of a straight forward "base * ppb / 1 billion" calculation. Convert this to the .adjfine interface and use adjust_by_scaled_ppm for the calculation of the new mult value. Note that the mlx5_ptp_adjfreq_real_time function expects input in terms of ppb, so use the scaled_ppm_to_ppb to convert before passing to this function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Shirly Ohnona <shirlyo@nvidia.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Gal Pressman <gal@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Aya Levin <ayal@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: mlx4: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The mlx4 implementation of .adjfreq is implemented in terms of a straight forward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use adjust_by_scaled_ppm to perform the calculation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge tag 'mlx5-updates-2022-10-24' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-10-24 SW steering updates from Yevgeny Kliteynik: 1) 1st Four patches: small fixes / optimizations for SW steering: - Patch 1: Don't abort destroy flow if failed to destroy table - continue and free everything else. - Patches 2 and 3 deal with fast teardown: + Skip sync during fast teardown, as PCI device is not there any more. + Check device state when polling CQ - otherwise SW steering keeps polling the CQ forever, because nobody is there to flush it. - Patch 4: Removing unneeded function argument. 2) Deal with the hiccups that we get during rules insertion/deletion, which sometimes reach 1/4 of a second. While insertion/deletion rate improvement was not the focus here, it still is a by-product of removing these hiccups. Another by-product is the reduced standard deviation in measuring the duration of rules insertion/deletion bursts. In the testing we add K rules (warm-up phase), and then continuously do insertion/deletion bursts of N rules. During the test execution, the driver measures hiccups (amount and duration) and total time for insertion/deletion of a batch of rules. Here are some numbers, before and after these patches: +--------------------------------------------+-----------------+----------------+ | | Create rules | Delete rules | | +--------+--------+--------+-------+ | | Before | After | Before | After | +--------------------------------------------+--------+--------+--------+-------+ | Max hiccup [msec] | 253 | 42 | 254 | 68 | +--------------------------------------------+--------+--------+--------+-------+ | Avg duration of 10K rules add/remove [msec]| 140.07 | 124.32 | 106.99 | 99.51 | +--------------------------------------------+--------+--------+--------+-------+ | Num of hiccups per 100K rules add/remove | 7.77 | 7.97 | 12.60 | 11.57 | +--------------------------------------------+--------+--------+--------+-------+ | Avg hiccup duration [msec] | 36.92 | 33.25 | 36.15 | 33.74 | +--------------------------------------------+--------+--------+--------+-------+ - Patch 5: Allocate a short array on stack instead of dynamically- it is destroyed at the end of the function. - Patch 6: Rather than cleaning the corresponding chunk's section of ste_arrays on chunk deletion, initialize these areas upon chunk creation. Chunk destruction tend to come in large batches (during pool syncing), so instead of doing huge memory initialization during pool sync, we amortize this by doing small initsializations on chunk creation. - Patch 7: In order to simplifies error flow and allows cleaner addition of new pools, handle creation/destruction of all the domain's memory pools and other memory-related fields in a separate init/uninit functions. - Patch 8: During rehash, write each table row immediately instead of waiting for the whole table to be ready and writing it all - saves allocations of ste_send_info structures and improves performance. - Patch 9: Instead of allocating/freeing send info objects dynamically, manage them in pool. The number of send info objects doesn't depend on number of rules, so after pre-populating the pool with an initial batch of send info objects, the pool is not expected to grow. This way we save alloc/free during writing STEs to ICM, which by itself can sometimes take up to 40msec. - Patch 10: Allocate icm_chunks from their own slab allocator, which lowered the alloc/free "hiccups" frequency. - Patch 11: Similar to patch 9, allocate htbl from its own slab allocator. - Patch 12: Lower sync threshold for ICM hot memory - set the threshold for sync to 1/4 of the pool instead of 1/2 of the pool. Although we will have more syncs, each sync will be shorter and will help with insertion rate stability. Also, notice that the overall number of hiccups wasn't increased due to all the other patches. - Patch 13: Keep track of hot ICM chunks in an array instead of list. After steering sync, we traverse the hot list and finally free all the chunks. It appears that traversing a long list takes unusually long time due to cache misses on many entries, which causes a big "hiccup" during rule insertion. This patch replaces the list with pre-allocated array that stores only the bookkeeping information that is needed to later free the chunks in its buddy allocator. - Patch 14: Remove the unneeded buddy used_list - we don't need to have the list of used chunks, we only need the total amount of used memory. * tag 'mlx5-updates-2022-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: DR, Remove the buddy used_list net/mlx5: DR, Keep track of hot ICM chunks in an array instead of list net/mlx5: DR, Lower sync threshold for ICM hot memory net/mlx5: DR, Allocate htbl from its own slab allocator net/mlx5: DR, Allocate icm_chunks from their own slab allocator net/mlx5: DR, Manage STE send info objects in pool net/mlx5: DR, In rehash write the line in the entry immediately net/mlx5: DR, Handle domain memory resources init/uninit separately net/mlx5: DR, Initialize chunk's ste_arrays at chunk creation net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically net/mlx5: DR, Remove unneeded argument from dr_icm_chunk_destroy net/mlx5: DR, Check device state when polling CQ net/mlx5: DR, Fix the SMFS sync_steering for fast teardown net/mlx5: DR, In destroy flow, free resources even if FW command failed ==================== Link: https://lore.kernel.org/r/20221027145643.6618-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).Thomas Gleixner
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c 2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion") abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start") 8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: Fix macsec sci endianness at rx sa updateRaed Salem
The cited commit at rx sa update operation passes the sci object attribute, in the wrong endianness and not as expected by the HW effectively create malformed hw sa context in case of update rx sa consequently, HW produces unexpected MACsec packets which uses this sa. Fix by passing sci to create macsec object with the correct endianness, while at it add __force u64 to prevent sparse check error of type "sparse: error: incorrect type in assignment". Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule functionRaed Salem
The cited commit produces a sparse check error of type "sparse: error: restricted __be64 degrades to integer". The offending line wrongly did a bitwise operation between two different storage types one of 64 bit when the other smaller side is 16 bit which caused the above sparse error, furthermore bitwise operation usage here is wrong in the first place as the constant MACSEC_PORT_ES is not a bitwise field. Fix by using the right mask to get the lower 16 bit if the sci number, and use comparison operator '==' instead of bitwise '&' operator. Fixes: 3b20949cb21b ("net/mlx5e: Add MACsec RX steering rules") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: Fix macsec rx security association (SA) update/deleteRaed Salem
The cited commit adds the support for update/delete MACsec Rx SA, naturally, these operations need to check if the SA in question exists to update/delete the SA and return error code otherwise, however they do just the opposite i.e. return with error if the SA exists Fix by change the check to return error in case the SA in question does not exist, adjust error message and code accordingly. Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: Fix macsec coverity issue at rx sa updateRaed Salem
The cited commit at update rx sa operation passes object attributes to MACsec object create function without initializing/setting all attributes fields leaving some of them with garbage values, therefore violating the implicit assumption at create object function, which assumes that all input object attributes fields are set. Fix by initializing the object attributes struct to zero, thus leaving unset fields with the legal zero value. Fixes: aae3454e4d4c ("net/mlx5e: Add MACsec offload Rx command support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5: Fix crash during sync firmware resetSuresh Devarakonda
When setting Bluefield to DPU NIC mode using mlxconfig tool + sync firmware reset flow, we run into scenario where the host was not eswitch manager at the time of mlx5 driver load but becomes eswitch manager after the sync firmware reset flow. This results in null pointer access of mpfs structure during mac filter add. This change prevents null pointer access but mpfs table entries will not be added. Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Suresh Devarakonda <ramad@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5: Update fw fatal reporter state on PCI handlers successful recoverRoy Novich
Update devlink health fw fatal reporter state to "healthy" is needed by strictly calling devlink_health_reporter_state_update() after recovery was done by PCI error handler. This is needed when fw_fatal reporter was triggered due to PCI error. Poll health is called and set reporter state to error. Health recovery failed (since EEH didn't re-enable the PCI). PCI handlers keep on recover flow and succeed later without devlink acknowledgment. Fix this by adding devlink state update at the end of the PCI handler recovery process. Fixes: 6181e5cb752e ("devlink: add support for reporter recovery completion") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroedRoi Dayan
On multi table split the driver creates a new attr instance with data being copied from prev attr instance zeroing action flags. Also need to reset dests properties to avoid incorrect dests per attr. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net/mlx5e: TC, Reject forwarding from internal port to internal portAriel Levkovich
Reject TC rules that forward from internal port to internal port as it is not supported. This include rules that are explicitly have internal port as the filter device as well as rules that apply on tunnel interfaces as the route device for the tunnel interface can be an internal port. Fixes: 27484f7170ed ("net/mlx5e: Offload tc rules that redirect to ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>