path: root/drivers/net/ethernet/mellanox
Age  Commit message  Author
2023-11-15  net/mlx5e: Remove early assignment to netdev->features  (Tariq Toukan)
The netdev->features is initialized to netdev->hw_features at a later point in the flow. Remove any redundant earlier assignment. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5e: Add local loopback counter to vport rep stats  (Or Har-Toov)
Add a counter for the number of unicast, multicast and broadcast packets/octets that were looped back. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: Query maximum frequency adjustment of the PTP hardware clock  (Rahul Rameshbabu)
Some mlx5 devices do not support the default advertised maximum frequency adjustment value for the PTP hardware clock that is set by the driver. These devices need to be queried when initializing the clock functionality in order to get the maximum supported frequency adjustment value. This value can be greater than the minimum supported frequency adjustment across mlx5 devices (50 million ppb). Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: Convert scaled ppm values outside the s32 range for PHC frequency adjustments  (Rahul Rameshbabu)
Represent scaled ppm as ppb to the device when the value in scaled ppm is not representable as a 32-bit signed integer. mlx5 devices only support a 32-bit field for the frequency adjustment value in units of either scaled ppm or ppb. Since mlx5 devices only support a 32-bit field for the frequency adjustment value independent of unit used, limit the maximum frequency adjustment to S32_MAX ppb. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
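A rough sketch of the conversion described above; the function and variable names are illustrative and not the driver's actual code, while scaled_ppm_to_ppb() is the generic PTP helper from include/linux/ptp_clock_kernel.h:

    #include <linux/limits.h>
    #include <linux/minmax.h>
    #include <linux/ptp_clock_kernel.h>

    /* Hypothetical helper: pick the value to program into the 32-bit
     * frequency-adjustment field.  Use scaled ppm directly while it fits
     * in an s32; otherwise convert to ppb and clamp to the S32 range,
     * which effectively limits the adjustment to S32_MAX ppb.
     */
    static s32 example_freq_adj_field(long scaled_ppm, bool *in_ppb)
    {
        s64 ppb;

        if (scaled_ppm >= S32_MIN && scaled_ppm <= S32_MAX) {
            *in_ppb = false;            /* field carries scaled ppm */
            return (s32)scaled_ppm;
        }

        ppb = scaled_ppm_to_ppb(scaled_ppm);    /* parts per billion */
        *in_ppb = true;                 /* field carries ppb */
        return (s32)clamp_t(s64, ppb, S32_MIN, S32_MAX);
    }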
2023-11-15  net/mlx5: Initialize clock->ptp_info inside mlx5_init_timer_clock  (Rahul Rameshbabu)
Configure the PHC inside mlx5_init_timer_clock so that mlx5_ptp_settime can be called later in the function. Previously, the mlx5_ptp_clock_info instance was used to invoke mlx5_ptp_settime to set the NIC real-time clock to be synchronized with the host system clock. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: Refactor real time clock operation checks for PHC  (Rahul Rameshbabu)
Check if the MTUTC register of the NIC can be modified before attempting to execute a real-time clock operation. Previous implementation aborted the real-time clock operation pre-emptively when the MTUTC register used to control the real-time clock was not modifiable, indicating real-time clock mode was not enabled on the NIC. The original control flow was confusing since the noop-if-RTC-disabled branch looked similar to an error handling guard clause. The purpose of this patch is purely for improving readability and should lead to no functional change. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5e: Access array with enum values instead of magic numbers  (Gal Pressman)
Access the headers array using pedit_cmd enum values, and don't assume anything about their values. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: simplify mlx5_set_driver_version string assignments  (Justin Stitt)
In total, just assigning this version string takes:
  (1) strncpy()'s
  (5) strlen()'s
  (3) strncat()'s
  (1) snprintf()'s
  (4) max_t()'s
Moreover, `strncpy` is deprecated [1] and `strncat` really shouldn't be used either [2]. With this in mind, let's simply use a single `snprintf`. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://elixir.bootlin.com/linux/v6.6-rc5/source/include/linux/fortify-string.h#L448 [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
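A hedged sketch of the pattern; the identifiers and the format string below are illustrative, not the exact mlx5 code. A single bounded snprintf() replaces the strncpy()/strncat() chain and guarantees NUL termination:

    #include <linux/kernel.h>

    /* Hypothetical example: compose "<os>,<driver>,<version>" in one call. */
    static void example_set_driver_version(char *buf, size_t len,
                                           const char *os, const char *drv,
                                           const char *ver)
    {
        /* snprintf() never writes more than len bytes and always terminates */
        snprintf(buf, len, "%s,%s,%s", os, drv, ver);
    }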
2023-11-15  net/mlx5: Annotate struct mlx5_flow_handle with __counted_by  (Kees Cook)
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct mlx5_flow_handle. Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: Annotate struct mlx5_fc_bulk with __counted_by  (Kees Cook)
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct mlx5_fc_bulk. Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Cc: linux-rdma@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
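A minimal illustration of the annotation both patches above add; the struct below is made up, not the actual mlx5 layout. The flexible array member is tied to the count member so UBSAN_BOUNDS/FORTIFY_SOURCE can bound-check accesses:

    #include <linux/compiler_attributes.h>
    #include <linux/types.h>

    struct example_rule;

    struct example_handle {
        int num_rules;
        /* accesses to rule[i] can now be checked against num_rules */
        struct example_rule *rule[] __counted_by(num_rules);
    };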
2023-11-15  net/mlx5e: Some cleanup in mlx5e_tc_stats_matchall()  (Amir Tzin)
Function mlx5e_tc_stats_matchall() is only called from one file: drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c Move it there and make it static as exposing it is unnecessary. Also variable cur_stats is redundant. Remove it and avoid superfluous copy. This patch does not change functionality. Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: Allow sync reset flow when BF MGT interface device is present  (Moshe Shemesh)
In the sync reset flow, the PF checks that only devices with the same device ID as itself are present on the PCIe bridge, otherwise it will NACK the reset. Since the PCIe bridge connection to the NIC card has to be 1 to 1, this is valid. However, the BlueField device may also expose another sub-device on the PCI, called the management interface, which only provides an ethernet channel between the host and the smart NIC. When checking devices on the PCIe bridge, allow the sync reset flow also when the management interface sub-device is present. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-15  net/mlx5: print change on SW reset semaphore returns busy  (Moshe Shemesh)
While collecting a crdump as part of the fw_fatal health reporter dump, the PF may fail to lock the SW reset semaphore. Change the print to indicate whether this was because another PF had already locked the semaphore, so that trying to lock it returned -EBUSY. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-11-02  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma  (Linus Torvalds)
Pull rdma updates from Jason Gunthorpe:
 "Nothing exciting this cycle, most of the diffstat is changing SPDX 'or' to 'OR'.

  Summary:
   - Bugfixes for hns, mlx5, and hfi1
   - Hardening patches for size_*, counted_by, strscpy
   - rts fixes from static analysis
   - Dump SRQ objects in rdma netlink, with hns support
   - Fix a performance regression in mlx5 MR deregistration
   - New XDR (200Gb/lane) link speed
   - SRQ record doorbell latency optimization for hns
   - IPSEC support for mlx5 multi-port mode
   - ibv_rereg_mr() support for irdma
   - Affiliated event support for bnxt_re
   - Opt out for the spec compliant qkey security enforcement as we discovered SW that breaks under enforcement
   - Comment and trivial updates"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (50 commits)
  IB/mlx5: Fix init stage error handling to avoid double free of same QP and UAF
  RDMA/mlx5: Fix mkey cache WQ flush
  RDMA/hfi1: Workaround truncation compilation error
  IB/hfi1: Fix potential deadlock on &irq_src_lock and &dd->uctxt_lock
  RDMA/core: Remove NULL check before dev_{put, hold}
  RDMA/hfi1: Remove redundant assignment to pointer ppd
  RDMA/mlx5: Change the key being sent for MPV device affiliation
  RDMA/bnxt_re: Fix clang -Wimplicit-fallthrough in bnxt_re_handle_cq_async_error()
  RDMA/hns: Fix init failure of RoCE VF and HIP08
  RDMA/hns: Fix unnecessary port_num transition in HW stats allocation
  RDMA/hns: The UD mode can only be configured with DCQCN
  RDMA/hns: Add check for SL
  RDMA/hns: Fix signed-unsigned mixed comparisons
  RDMA/hns: Fix uninitialized ucmd in hns_roce_create_qp_common()
  RDMA/hns: Fix printing level of asynchronous events
  RDMA/core: Add support to set privileged QKEY parameter
  RDMA/bnxt_re: Do not report SRQ error in srq notification
  RDMA/bnxt_re: Report async events and errors
  RDMA/bnxt_re: Update HW interface headers
  IB/mlx5: Fix rdma counter binding for RAW QP
  ...
2023-10-31  Merge tag 'v6.6' into rdma.git for-next  (Jason Gunthorpe)
Resolve conflict by taking the spin_lock hunk from for-next: https://lore.kernel.org/r/20230928113851.5197a1ec@canb.auug.org.au Required for the next patch. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-10-26  net/mlx5: fix uninit value use  (Przemek Kitszel)
Avoid use of an uninitialized state variable. In the case of mlx5e_tx_reporter_build_diagnose_output_sq_common(), it is better to still collect the other data than to bail out entirely. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/netdev/8bd30131-c9f2-4075-a575-7fa2793a1760@moroto.mountain Fixes: d17f98bf7cc9 ("net/mlx5: devlink health: use retained error fmsg API") Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/r/20231025145050.36114-1-przemyslaw.kitszel@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-23  page_pool: remove PP_FLAG_PAGE_FRAG  (Yunsheng Lin)
PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count handling is unified and page_pool_alloc_frag() is supported in 32-bit arch with 64-bit DMA, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-20  mlxsw: spectrum: Set SW LAG mode on Spectrum>1  (Petr Machata)
On Spectrum-2, Spectrum-3 and Spectrum-4 machines, request SW responsibility for placement of the LAG table. On Spectrum-1, some FW versions claim to support lag_mode field despite quietly ignoring any settings made to that field. Thus refrain from attempting to configure lag_mode on those systems at all. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: spectrum: Allocate LAG table when in SW LAG mode  (Petr Machata)
In this patch, if the LAG mode is SW, allocate the LAG table and configure SGCR to indicate where it was allocated. We use the default "DDD" (for dynamic data duplication) layout of the LAG table. In the DDD mode, the membership information for each LAG is copied in 8 PGT entries. This is done for performance reasons. The LAG table then needs to be allocated on an address aligned to 8. Deal with this by moving the LAG init ahead so that the LAG table is allocated at address 0. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: spectrum_pgt: Generalize PGT allocation  (Petr Machata)
PGT blocks are allocated through the function mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows which piece of PGT exactly they want to get. That was fine while the FID code was the only client allocating blocks of PGT. However for SW-allocated LAG table, there will be an additional client: mlxsw_sp_lag_init(). The interface should therefore be changed to not require particular coordinates, but to take just the requested size, allocate the block wherever, and give back the PGT address. In this patch, change the interface accordingly. Initialize FID family's pgt_base from the result of the PGT allocation (note that mlxsw makes a copy of the family structure, so what gets initialized is not actually the global structure). Drop the now-unnecessary pgt_base initializations and the corresponding defines. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: spectrum_fid: Allocate PGT for the whole FID family in one go  (Petr Machata)
PGT blocks are allocated through the function mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows which piece of PGT exactly they want to get. That was fine while the FID code was the only client allocating blocks of PGT. However for SW-allocated LAG table, there will be an additional client: mlxsw_sp_lag_init(). The interface should therefore be changed to not require particular coordinates, but to take just the requested size, allocate the block wherever, and give back the PGT address. The current FID mode has one place where PGT address can be stored: the FID family's pgt_base. The allocation scheme should therefore be changed from allocating a block per FID flood table, to allocating a block per FID family. Do just that in this patch. The per-family allocation is going to be useful for another related feature as well: the CFF mode. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: pci: Permit toggling LAG mode  (Petr Machata)
Add to struct mlxsw_config_profile a field lag_mode_prefer_sw for the driver to indicate that SW LAG mode should be configured if possible. Add to the PCI module code to set lag_mode as appropriate. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: core, pci: Add plumbing related to LAG mode  (Petr Machata)
lag_mode describes where the responsibility for LAG table placement lies: SW or FW. The bus module determines whether LAG is supported, can configure it if it is, and knows what (if any) configuration has been applied. Therefore add a bus callback to determine the configured LAG mode. Also add to core an API to query it. The LAG mode is for now kept at the default value of 0 for FW-managed. The code to actually toggle it will be added later. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: cmd: Add QUERY_FW.lag_mode_support  (Petr Machata)
Add QUERY_FW.lag_mode_support, which determines whether CONFIG_PROFILE.lag_mode is available. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: cmd: Add CONFIG_PROFILE.{set_, }lag_mode  (Petr Machata)
Add CONFIG_PROFILE.lag_mode, which serves for moving responsibility for placement of the LAG table from FW to SW. Whether lag_mode should be configured is determined by CONFIG_PROFILE.set_lag_mode, which this patch also adds. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: cmd: Fix omissions in CONFIG_PROFILE field names in comments  (Petr Machata)
A number of CONFIG_PROFILE fields' comments refer to a field named like cmd_mbox_config_* instead of cmd_mbox_config_profile_*. Correct these omissions. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: reg: Add SGCR.lag_lookup_pgt_base  (Petr Machata)
Add SGCR.lag_lookup_pgt_base, which is used for configuring the base address of the LAG table within the PGT table for cases when the driver is responsible for the table placement. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: reg: Drop SGCR.llb  (Petr Machata)
SGCR, Switch General Configuration Register, has not been used since commit b0d80c013b04 ("mlxsw: Remove Mellanox SwitchX-2 ASIC support"). We will need the register again shortly, so instead of dropping it and reintroducing again, just drop the sole unused field. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  net/mlx5: devlink health: use retained error fmsg API  (Przemek Kitszel)
Drop unneeded error checking. devlink_fmsg_*() family of functions is now retaining errors, so there is no need to check for them after each call. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20  mlxsw: core: devlink health: use retained error fmsg API  (Przemek Kitszel)
Drop unneeded error checking. devlink_fmsg_*() family of functions is now retaining errors, so there is no need to check for them after each call. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-19  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
Cross-merge networking fixes after downstream PR.

net/mac80211/key.c
  02e0e426a2fb ("wifi: mac80211: fix error path key leak")
  2a8b665e6bcc ("wifi: mac80211: remove key_mtx")
  7d6904bf26b9 ("Merge wireless into wireless-next")
https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/ti/Kconfig
  a602ee3176a8 ("net: ethernet: ti: Fix mixed module-builtin object")
  98bdeae9502b ("net: cpmac: remove driver to prepare for platform removal")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-14  net/mlx5e: Allow IPsec soft/hard limits in bytes  (Leon Romanovsky)
The mlx5 code actually already has the support needed to allow users to configure soft/hard limits in bytes. This is possible due to the situation with the TX path, where CX7 devices are missing the hardware implementation to send events to the software, see commit b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality"). That software workaround is not limited to TX and works for bytes too. So relax the validation logic to not block soft/hard limits in bytes. Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5e: Increase max supported channels number to 256  (Adham Faris)
Increase the maximum supported number of channels to 256 (it is not extended further due to testing limitations). Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5e: Preparations for supporting larger number of channels  (Adham Faris)
The number of CPUs in data center servers keeps growing over time. Currently, our driver limits the number of channels to 128. The maximum number of channels is enforced and bounded by hardcoded defines (en.h/MLX5E_MAX_NUM_CHANNELS) even though the device and machine (CPU count) can allow more. Refactor the current implementation in order to handle more channels. The maximum supported number of channels will be increased in the followup patch.

Introduce the RQT size calculation/allocation scheme below:
1) Preserve the current RQT size of 256 for channel counts up to 128 (the old limit).
2) For a greater number of channels, the RQT size is calculated by multiplying the number of channels by 2 and rounding up the result to the nearest power of 2. If the calculated RQT size exceeds the maximum size supported by the NIC, fall back to this maximum RQT size (1 << log_max_rqt_size).

Since the RQT size is no longer static, allocate and free the indirection table SW shadow dynamically. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
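The sizing rule in the scheme above could look roughly like the sketch below; the function and parameter names are illustrative, not the actual mlx5e code:

    #include <linux/log2.h>
    #include <linux/minmax.h>
    #include <linux/types.h>

    static u32 example_rqt_size(unsigned int num_channels, u8 log_max_rqt_size)
    {
        u32 max_size = 1U << log_max_rqt_size;    /* NIC limit */
        u32 size;

        if (num_channels <= 128)
            size = 256;                           /* historical RQT size */
        else
            size = roundup_pow_of_two(2 * num_channels);

        return min(size, max_size);               /* fall back to NIC limit */
    }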
2023-10-14  net/mlx5e: Refactor mlx5e_rss_init() and mlx5e_rss_free() API's  (Adham Faris)
Introduce the code refactoring below:
1) Introduce a single API for creating and destroying the RSS object, mlx5e_rss_create() and mlx5e_rss_destroy() respectively.
2) mlx5e_rss_create() constructs and initializes the RSS object depending on a new function parameter, enum mlx5e_rss_create_type. Callers (like rx_res.c) will no longer need to allocate the RSS object via mlx5e_rss_alloc() and initialize it immediately via mlx5e_rss_init_no_tirs() or mlx5e_rss_init(); this is now done by a single call to mlx5e_rss_create(). Hence, mlx5e_rss_alloc() and mlx5e_rss_init_no_tirs() have been removed from the rss.h file and became static functions.
Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5e: Refactor mlx5e_rss_set_rxfh() and mlx5e_rss_get_rxfh()  (Adham Faris)
Initialize the indirect table array with memcpy rather than a for loop. This change is made for two reasons: 1) To be consistent with the indirect table array init in mlx5e_rss_set_rxfh(). 2) In general, prefer memcpy for array initialization rather than a for loop. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5e: Refactor rx_res_init() and rx_res_free() APIs  (Adham Faris)
Refactor mlx5e_rx_res_init() and mlx5e_rx_res_free() by wrapping mlx5e_rx_res_alloc() and mlx5e_rx_res_destroy() API's respectively. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5e: Use PTR_ERR_OR_ZERO() to simplify code  (Yu Liao)
Use the standard error pointer macro to shorten the code and simplify. Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Use PTR_ERR_OR_ZERO() to simplify code  (Jinjie Ruan)
Return PTR_ERR_OR_ZERO() instead of return 0 or PTR_ERR() to simplify code. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
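Both PTR_ERR_OR_ZERO() patches above apply the same pattern; here is a generic sketch with hypothetical object and function names:

    #include <linux/err.h>

    struct thing;
    struct thing *thing_create(void);   /* returns a valid pointer or ERR_PTR() */

    static int thing_init(void)
    {
        struct thing *t = thing_create();

        /* was: if (IS_ERR(t)) return PTR_ERR(t); return 0; */
        return PTR_ERR_OR_ZERO(t);
    }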
2023-10-14  net/mlx5: Remove unused declaration  (Yue Haibing)
Commit 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") declared mlx5e_ipsec_inverse_table_init() but never implemented it. Commit f52f2faee581 ("net/mlx5e: Introduce flow steering API") declared mlx5e_fs_set_tc() but never implemented it. Commit f2f3df550139 ("net/mlx5: EQ, Privatize eq_table and friends") declared mlx5_eq_comp_cpumask() but never implemented it. Commit cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed") removed mlx5_lag_update() but not its declaration. Commit 35ba005d820b ("net/mlx5: DR, Set flex parser for TNL_MPLS dynamically") removed mlx5dr_ste_build_tnl_mpls() but not its declaration. Commit e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") declared but never implemented mlx5_alloc_cmd_mailbox_chain() and mlx5_free_cmd_mailbox_chain(). Commit 0cf53c124756 ("net/mlx5: FWPage, Use async events chain") removed mlx5_core_req_pages_handler() but not its declaration. Commit 938fe83c8dcb ("net/mlx5_core: New device capabilities handling") removed mlx5_query_odp_caps() but not its declaration. Commit f6a8a19bb11b ("RDMA/netdev: Hoist alloc_netdev_mqs out of the driver") removed mlx5_rdma_netdev_alloc() but not its declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Replace global mlx5_intf_lock with HCA devcom component lock  (Shay Drory)
mlx5_intf_lock is used to sync between LAG changes and changes to its slaves' mlx5 core dev aux devices, which means that every time an mlx5 core dev adds/removes aux devices, mlx5 takes this global lock, even if LAG functionality isn't supported over the core dev. This causes a bottleneck when probing VFs/SFs in parallel. Hence, replace mlx5_intf_lock with the HCA devcom component lock, or no lock if LAG functionality isn't supported. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Refactor LAG peer device lookout bus logic to mlx5 devcom  (Shay Drory)
The LAG peer device lookout bus logic required the usage of a global lock, mlx5_intf_mutex. As part of the effort to remove this global lock, refactor the LAG peer device lookout to use the mlx5 devcom layer. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Avoid false positive lockdep warning by adding lock_class_key  (Shay Drory)
Downstream patch will add devcom component which will be locked in many places. This can lead to a false positive "possible circular locking dependency" warning by lockdep, on flows which lock more than one mlx5 devcom component, such as probing ETH aux device. Hence, add a lock_class_key per mlx5 device. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Redesign SF active work to remove table_lock  (Wei Zhang)
active_work is a work item that iterates over all possible SF devices whose SF port representors are located on a different function and, in case the SF is in active state, probes it. Currently, the active_work in active_wq is synced with mlx5_vhca_events_work via table_lock, and this lock causes a bottleneck in performance. To remove table_lock, redesign the active_wq logic so that it now pushes an active_work per SF to the mlx5_vhca_events_workqueues. Since the latter workqueues are ordered, active_work and mlx5_vhca_events_work with the same index will be pushed into the same workqueue, which completely eliminates the need for a lock. Signed-off-by: Wei Zhang <weizhang@nvidia.com> Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-14  net/mlx5: Parallelize vhca event handling  (Wei Zhang)
At present, the mlx5 driver has a general purpose event handler which handles not only vhca events but also many other events. This incurs a huge bottleneck because the event handler is implemented as a single-threaded workqueue and all events are forced to be handled serially, even when an application tries to create multiple SFs simultaneously. Introduce a dedicated vhca event handler which manages parallel SF creation. Signed-off-by: Wei Zhang <weizhang@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-10-13  net/mlx4_core: replace deprecated strncpy with strscpy  (Justin Stitt)
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect `dst` to be NUL-terminated based on its use with format strings: | mlx4_dbg(dev, "Reporting Driver Version to FW: %s\n", dst); Moreover, NUL-padding is not required. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20231011-strncpy-drivers-net-ethernet-mellanox-mlx4-fw-c-v1-1-4d7b5d34c933@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
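A hedged sketch of the substitution described above; the struct, buffer and field names are illustrative, not the mlx4 code. strscpy() bounds the copy by the destination size, always NUL-terminates, and does not NUL-pad the rest of the buffer:

    #include <linux/string.h>

    #define EXAMPLE_VERSION_LEN 64

    struct example_fw {
        char version[EXAMPLE_VERSION_LEN];
    };

    static void example_set_version(struct example_fw *fw, const char *src)
    {
        /* was: strncpy(fw->version, src, EXAMPLE_VERSION_LEN); */
        strscpy(fw->version, src, sizeof(fw->version));
    }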
2023-10-13  mlxsw: pci: Allocate skbs using GFP_KERNEL during initialization  (Ido Schimmel)
The driver allocates skbs during initialization and during Rx processing. Take advantage of the fact that the former happens in process context and allocate the skbs using GFP_KERNEL to decrease the probability of allocation failure. Tested with CONFIG_DEBUG_ATOMIC_SLEEP=y. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/dfa6ed0926e045fe7c14f0894cc0c37fee81bf9d.1697034729.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
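An illustrative sketch of the idea, not mlxsw's exact helpers: the allocation routine takes a gfp_t so the init path, which runs in process context, can pass GFP_KERNEL, while the Rx refill path keeps GFP_ATOMIC:

    #include <linux/skbuff.h>

    static struct sk_buff *example_rx_skb_alloc(unsigned int len, gfp_t gfp)
    {
        return alloc_skb(len, gfp);
    }

    static struct sk_buff *example_init_alloc(unsigned int len)
    {
        /* probe/init path: process context, may sleep */
        return example_rx_skb_alloc(len, GFP_KERNEL);
    }

    static struct sk_buff *example_rx_refill_alloc(unsigned int len)
    {
        /* Rx processing runs in atomic context */
        return example_rx_skb_alloc(len, GFP_ATOMIC);
    }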
2023-10-13  Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux  (Jakub Kicinski)
Leon Romanovsky says:
====================
This PR is collected from https://lore.kernel.org/all/cover.1695296682.git.leon@kernel.org

This series from Patrisious extends mlx5 to support IPsec packet offload in multiport devices (MPV, see [1] for more details). These devices have single flow steering logic and two netdev interfaces, which require extra logic to manage IPsec configurations as they are performed on netdevs.

[1] https://lore.kernel.org/linux-rdma/20180104152544.28919-1-leon@kernel.org/

* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Handle IPsec steering upon master unbind/bind
  net/mlx5: Configure IPsec steering for ingress RoCEv2 MPV traffic
  net/mlx5: Configure IPsec steering for egress RoCEv2 MPV traffic
  net/mlx5: Add create alias flow table function to ipsec roce
  net/mlx5: Implement alias object allow and create functions
  net/mlx5: Add alias flow table bits
  net/mlx5: Store devcom pointer inside IPsec RoCE
  net/mlx5: Register mlx5e priv to devcom in MPV mode
  RDMA/mlx5: Send events from IB driver about device affiliation state
  net/mlx5: Introduce ifc bits for migration in a chunk mode
====================

Link: https://lore.kernel.org/r/20231002083832.19746-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski)
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

kernel/bpf/verifier.c
  829955981c55 ("bpf: Fix verifier log for async callback return values")
  a923819fb2c5 ("bpf: Treat first argument as return value for bpf_throw")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12  net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command  (Amir Tzin)
Although the vf_vport entry of struct mlx5e_stats is never updated, its values are mistakenly copied to the caller structure in the VF representor .ndo_get_stats64 callback, mlx5e_rep_get_stats(). Remove the redundant entry and use the updated one, rep_stats, instead. Fixes: 64b68e369649 ("net/mlx5: Refactor and expand rep vport stat group") Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>