summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2023-12-13Merge patch series "Introduce support for multiqueue (MQ) in fnic"Martin K. Petersen
Karan Tilak Kumar <kartilak@cisco.com> says: Hi Martin, reviewers, This cover letter describes the feature: add support for multiqueue (MQ) to fnic driver. Background: The Virtual Interface Card (VIC) firmware exposes several queues that can be configured for sending IOs and receiving IO responses. Unified Computing System Manager (UCSM) and Intersight Manager (IMM) allows users to configure the number of queues to be used for IOs. The number of IO queues to be used is stored in a configuration file by the VIC firmware. The fNIC driver reads the configuration file and sets the number of queues to be used. Previously, the driver was hard-coded to use only one queue. With this set of changes, the fNIC driver will configure itself to use multiple queues. This feature takes advantage of the block multiqueue layer to parallelize IOs being sent out of the VIC card. Here's a brief description of some of the salient patches: - vnic_scsi.h needs to be in sync with VIC firmware to be able to read the number of queues from the firmware config file. A patch has been created for this. - In an environment with many fnics (like we see in our customer environments), it is hard to distinguish which fnic is printing logs. Therefore, an fnic number has been included in the logs. - read the number of queues from the firmware config file. - include definitions in fnic.h to support multiqueue. - modify the interrupt service routines (ISRs) to read from the correct registers. The numbers that are used here come from discussions with the VIC firmware team. - track IO statistics for different queues. - remove usage of host_lock, and only use fnic_lock in the fnic driver. - use a hardware queue based spinlock to protect io_req. - replace the hard-coded zeroth queue with a hardware queue number. This presents a bulk of the changes. - modify the definition of fnic_queuecommand to accept multiqueue tags. - improve log messages, and indicate fnic number and multiqueue tags for effective debugging. Even though the patches have been made into a series, some patches are heavier than others. But, every effort has been made to keep the purpose of each patch as a single-purpose, and to compile cleanly. This patchset has been tested as a whole. Therefore, the tested-by fields have been added only to two patches in the set. All the individual patches compile cleanly. However, I've refrained from adding tested-by to most of the patches, so as to not mislead the reviewer/reader. A brief note on the unit tests: 1. Increase number of queues to 64. Load driver. Run IOs via Medusa. 12+ hour run successful. 2. Configure multipathing, and run link flaps on single link. IOs drop briefly, but pick up as expected. 3. Configure multipathing, and run link flaps on two links, with a 30 second delay in between. IOs drop briefly, but pick up as expected. Repeat the above tests with 1 queue and 32 queues. All tests were successful. Please consider this patch series for the next merge window. Link: https://lore.kernel.org/r/20231211173617.932990-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Increment driver versionKaran Tilak Kumar
Increment driver version for multiqueue (MQ). Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-14-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Improve logs and add support for multiqueue (MQ)Karan Tilak Kumar
Improve existing logs by adding fnic number, hardware queue, tag, and mqtag in the prints. Add logs with the above elements for effective debugging. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Tested-by: Karan Tilak Kumar <kartilak@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-13-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Add support for multiqueue (MQ) in fnic driverKaran Tilak Kumar
Implement support for MQ in fnic driver: The block multiqueue layer issues IO to the fnic driver with an MQ tag. Use the mqtag and derive a tag from it. Derive the hardware queue from the mqtag and use it in all paths. Modify queuecommand to handle mqtag. Replace wq and cq indices to support MQ. Replace the zeroth queue with a hardware queue. Implement spin locks on a per hardware queue basis. Replace io_lock with per hardware queue spinlock. Implement out of range tag checks. Allocate an io_req_table to track status of the io_req. Test the driver by building it, loading it, and configuring 64 queues in UCSM. Issue IOs using Medusa on multiple fnics. Enable/disable links to exercise the abort and clean up path. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310300032.2awCqkfn-lkp@intel.com/ Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Tested-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-12-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Add support for multiqueue (MQ) in fnic_main.cKaran Tilak Kumar
Set map_queues in the fnic_host_template to fnic_mq_map_queues_cpus. Define fnic_mq_map_queues_cpus to set cpu assignment to fnic queues. Refactor code in fnic_probe to enable vnic queues before scsi_add_host. Modify notify set to the correct index. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-11-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Remove usage of host_lockKaran Tilak Kumar
Remove usage of host_lock. Replace with fnic_lock, where necessary. fnic does not use host_lock. fnic uses fnic_lock. Use fnic lock to protect fnic members in fnic_queuecommand. Add log messages in error cases. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-10-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Define stats to track multiqueue (MQ) IOsKaran Tilak Kumar
Define an array to track IOs for the different queues, print the IO stats in fnic get stats data. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-9-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Modify ISRs to support multiqueue (MQ)Karan Tilak Kumar
Modify interrupt service routines for INTx, MSI, and MSI-x to support multiqueue. Modify parameter list of fnic_wq_copy_cmpl_handler to take cq_index. Modify fnic_cleanup function to use the new function call of fnic_wq_copy_cmpl_handler. Refactor code to set interrupt mode to MSI-x to a new function. Add a new stat for intx_dummy. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202310251847.4T8BVZAZ-lkp@intel.com/ Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-8-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Refactor and redefine fnic.h for multiqueueKaran Tilak Kumar
Refactor and re-define values in fnic.h to implement multiqueue (MQ) functionality. VIC firmware allows fnic to create up to 64 copy workqueues. Update the copy workqueue max to 64. Modify the interrupt index to be in sync with the firmware to support MQ. Add irq number to the MSIX entry. Define a software workqueue table to track the status of io_reqs. Define a base for the copy workqueue. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-7-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Get copy workqueue count and interrupt mode from configKaran Tilak Kumar
Get the copy workqueue count and interrupt mode from the configuration. The config can be changed via UCSM. Add logs to print the interrupt mode and copy workqueue count. Add logs to print the vNIC resources. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-6-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Rename wq_copy to hw_copy_wqKaran Tilak Kumar
Rename wq_copy to hw_copy_wq to accurately describe the copy workqueue. This will also help distinguish this data structure from software data structures that can be introduced. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-5-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Add and improve log messagesKaran Tilak Kumar
Add link related log messages in fnic_fcs.c, Improve log message in fnic_fcs.c, Add log message in vnic_dev.c. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-4-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Add and use fnic numberKaran Tilak Kumar
Add fnic_num in fnic.h to identify fnic in a multi-fnic environment. Increment and set the fnic number during driver load in fnic_probe. Replace the host number with fnic number in debugfs. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-3-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13scsi: fnic: Modify definitions to sync with VIC firmwareKaran Tilak Kumar
VIC firmware has updated definitions. Modify structure and definitions to sync with the latest VIC firmware. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20231211173617.932990-2-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-13dpaa2-switch: do not ask for MDB, VLAN and FDB replayIoana Ciornei
Starting with commit 4e51bf44a03a ("net: bridge: move the switchdev object replay helpers to "push" mode") the switchdev_bridge_port_offload() helper was extended with the intention to provide switchdev drivers easy access to object addition and deletion replays. This works by calling the replay helpers with non-NULL notifier blocks. In the same commit, the dpaa2-switch driver was updated so that it passes valid notifier blocks to the helper. At that moment, no regression was identified through testing. In the meantime, the blamed commit changed the behavior in terms of which ports get hit by the replay. Before this commit, only the initial port which identified itself as offloaded through switchdev_bridge_port_offload() got a replay of all port objects and FDBs. After this, the newly joining port will trigger a replay of objects on all bridge ports and on the bridge itself. This behavior leads to errors in dpaa2_switch_port_vlans_add() when a VLAN gets installed on the same interface multiple times. The intended mechanism to address this is to pass a non-NULL ctx to the switchdev_bridge_port_offload() helper and then check it against the port's private structure. But since the driver does not have any use for the replayed port objects and FDBs until it gains support for LAG offload, it's better to fix the issue by reverting the dpaa2-switch driver to not ask for replay. The pointers will be added back when we are prepared to ignore replays on unrelated ports. Fixes: b28d580e2939 ("net: bridge: switchdev: replay all VLAN groups") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20231212164326.2753457-3-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13dpaa2-switch: fix size of the dma_unmapIoana Ciornei
The size of the DMA unmap was wrongly put as a sizeof of a pointer. Change the value of the DMA unmap to be the actual macro used for the allocation and the DMA map. Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20231212164326.2753457-2-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13page_pool: transition to reference count management after page drainingLiang Chen
To support multiple users referencing the same fragment, 'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages from fragment management to reference count management after draining based on the suggestion from [1]. The idea is that the concept of fragmenting exists before the page is drained, and all related functions retain their current names. However, once the page is drained, its management shifts to being governed by 'pp_ref_count'. Therefore, all functions associated with that lifecycle stage of a pp page are renamed. [1] http://lore.kernel.org/netdev/f71d9448-70c8-8793-dc9a-0eb48a570300@huawei.com Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://lore.kernel.org/r/20231212044614.42733-2-liangchen.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13cxgb3: Avoid potential string truncation in descKees Cook
Builds with W=1 were warning about potential string truncations: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c: In function 'cxgb_up': drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:394:38: warning: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size between 5 and 20 [-Wformat-truncation=] 394 | "%s-%d", d->name, pi->first_qset + i); | ^~ In function 'name_msix_vecs', inlined from 'cxgb_up' at drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:1264:3: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:394:34: note: directive argument in the range [-2147483641, 509] 394 | "%s-%d", d->name, pi->first_qset + i); | ^~~~~~~ drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:393:25: note: 'snprintf' output between 3 and 28 bytes into a destination of size 21 393 | snprintf(adap->msix_info[msi_idx].desc, n, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 394 | "%s-%d", d->name, pi->first_qset + i); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Avoid open-coded %NUL-termination (this code was assuming snprintf wasn't %NUL terminating when it does -- likely thinking of strncpy), and grow the size of the string to handle a maximal value. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312100937.ZPZCARhB-lkp@intel.com/ Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231212220954.work.219-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13amd-xgbe: Avoid potential string truncation in nameKees Cook
Build with W=1 were warning about a potential string truncation: drivers/net/ethernet/amd/xgbe/xgbe-drv.c: In function 'xgbe_alloc_channels': drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:73: warning: '%u' directive output may be truncated writing between 1 and 10 bytes into a region of size 8 [-Wformat-truncation=] 211 | snprintf(channel->name, sizeof(channel->name), "channel-%u", i); | ^~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:64: note: directive argument in the range [0, 4294967294] 211 | snprintf(channel->name, sizeof(channel->name), "channel-%u", i); | ^~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:17: note: 'snprintf' output between 10 and 19 bytes into a destination of size 16 211 | snprintf(channel->name, sizeof(channel->name), "channel-%u", i); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Increase the size of the "name" buffer to handle the full format range. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312100937.ZPZCARhB-lkp@intel.com/ Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231212221312.work.830-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13dpll: allocate pin ids in cycleJiri Pirko
Pin ID is just a number. Nobody should rely on a certain value, instead, user should use either pin-id-get op or RTNetlink to get it. Unify the pin ID allocation behavior with what there is already implemented for dpll devices. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231212150605.1141261-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13idpf: add get/set for Ethtool's header split ringparamMichal Kubiak
idpf supports the header split feature and that feature is always enabled by default. However, for flexibility reasons and to simplify some scenarios, it would be useful to have the support for switching the header split off (and on) from the userspace. Address that need by adding the user config parameter, the functions for disabling (or enabling) the header split feature, and calls to them from the Ethtool ringparam callbacks. It still is enabled by default if supported by the hardware. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231212142752.935000-3-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net/mlx5: DR, Use swap() instead of open coding itJiapeng Chong
Swap is a function interface that provides exchange function. To avoid code duplication, we can use swap function. ./drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1254:50-51: WARNING opportunity for swap(). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7580 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: devcom, Add component size getterTariq Toukan
Add a getter for the number of participants in a devcom component (those who share the same component id and key). Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Decouple CQ from privTariq Toukan
Make CQ struct and methods independent of "priv", use more basic arguments instead. This will ease the transition to netdev with multiple mdevs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Add wrapping for auxiliary_driver ops and remove unused argsTariq Toukan
Turn some of the struct auxiliary_driver ops into wrappers to stop having dummy local vars passed as unused arguments. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Statify function mlx5e_monitor_counter_armTariq Toukan
Function usage is limited to the monitor_stats.c file, do not expose it. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: Move TISes from priv to mdev HW resourcesTariq Toukan
The transport interface send (TIS) object is responsible for performing all transport related operations of the transmit side. Messages from Send Queues get segmented and transmitted by the TIS including all transport required implications, e.g. in the case of large send offload, the TIS is responsible for the segmentation. These are stateless objects and can be used by multiple netdevs (e.g. representors) who share the same core device. Providing the TISes as a service from the core layer to the netdev layer reduces the number of replecated TIS objects (in case of multiple netdevs), and will ease the transition to netdev with multiple mdevs. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Remove TLS-specific logic in generic create TIS APITariq Toukan
TLS TISes are created using their own dedicated functions, don't honor their specific logic here. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: fs, Command to control TX flow table rootTariq Toukan
Introduce an API to set/unset the TX flow table root for a device. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: fs, Command to control L2TABLE entry silent modeTariq Toukan
Introduce an API to set/unset the L2TABLE entry silent mode for a device. If silent, no north/south traffic is allowed, the device won't be able to communicate with the port directly to send/receive traffic by its own. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: Expose Management PCIe Index Register (MPIR)Tariq Toukan
MPIR register allows to query the PCIe indexes and Socket-Direct related parameters. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net: phy: Add support for the DP83TG720S Ethernet PHYOleksij Rempel
The DP83TG720S-Q1 device is an IEEE 802.3bp and Open Alliance compliant automotive Ethernet physical layer transceiver. This driver was tested with i.MX8MP EQOS (stmmac) on the MAC side and same TI PHY on other side. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20231212054144.87527-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net: phy: c45: add genphy_c45_pma_read_ext_abilities() functionOleksij Rempel
Move part of the genphy_c45_pma_read_abilities() code to a separate function. Some PHYs do not implement PMA/PMD status 2 register (Register 1.8) but do implement PMA/PMD extended ability register (Register 1.11). To make use of it, we need to be able to access this part of code separately. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20231212054144.87527-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net/mlx5e: Correct snprintf truncation handling for fw_version buffer used ↵Rahul Rameshbabu
by representors snprintf returns the length of the formatted string, excluding the trailing null, without accounting for truncation. This means that is the return value is greater than or equal to the size parameter, the fw_version string was truncated. Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf Fixes: 1b2bd0c0264f ("net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Correct snprintf truncation handling for fw_version bufferRahul Rameshbabu
snprintf returns the length of the formatted string, excluding the trailing null, without accounting for truncation. This means that is the return value is greater than or equal to the size parameter, the fw_version string was truncated. Reported-by: David Laight <David.Laight@ACULAB.COM> Closes: https://lore.kernel.org/netdev/81cae734ee1b4cde9b380a9a31006c1a@AcuMS.aculab.com/ Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf Fixes: 41e63c2baa11 ("net/mlx5e: Check return value of snprintf writing to fw_version buffer") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Fix error codes in alloc_branch_attr()Dan Carpenter
Set the error code if set_branch_dest_ft() fails. Fixes: ccbe33003b10 ("net/mlx5e: TC, Don't offload post action rule if not supported") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Fix error code in mlx5e_tc_action_miss_mapping_get()Dan Carpenter
Preserve the error code if esw_add_restore_rule() fails. Don't return success. Fixes: 6702782845a5 ("net/mlx5e: TC, Set CT miss to the specific ct action instance") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: Refactor mlx5_flow_destination->rep pointer to vport numVlad Buslov
Currently the destination rep pointer is only used for comparisons or to obtain vport number from it. Since it is used both during flow creation and deletion it may point to representor of another eswitch instance which can be deallocated during driver unload even when there are rules pointing to it[0]. Refactor the code to store vport number and 'valid' flag instead of the representor pointer. [0]: [176805.886303] ================================================================== [176805.889433] BUG: KASAN: slab-use-after-free in esw_cleanup_dests+0x390/0x440 [mlx5_core] [176805.892981] Read of size 2 at addr ffff888155090aa0 by task modprobe/27280 [176805.895462] CPU: 3 PID: 27280 Comm: modprobe Tainted: G B 6.6.0-rc3+ #1 [176805.896771] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [176805.898514] Call Trace: [176805.899026] <TASK> [176805.899519] dump_stack_lvl+0x33/0x50 [176805.900221] print_report+0xc2/0x610 [176805.900893] ? mlx5_chains_put_table+0x33d/0x8d0 [mlx5_core] [176805.901897] ? esw_cleanup_dests+0x390/0x440 [mlx5_core] [176805.902852] kasan_report+0xac/0xe0 [176805.903509] ? esw_cleanup_dests+0x390/0x440 [mlx5_core] [176805.904461] esw_cleanup_dests+0x390/0x440 [mlx5_core] [176805.905223] __mlx5_eswitch_del_rule+0x1ae/0x460 [mlx5_core] [176805.906044] ? esw_cleanup_dests+0x440/0x440 [mlx5_core] [176805.906822] ? xas_find_conflict+0x420/0x420 [176805.907496] ? down_read+0x11e/0x200 [176805.908046] mlx5e_tc_rule_unoffload+0xc4/0x2a0 [mlx5_core] [176805.908844] mlx5e_tc_del_fdb_flow+0x7da/0xb10 [mlx5_core] [176805.909597] mlx5e_flow_put+0x4b/0x80 [mlx5_core] [176805.910275] mlx5e_delete_flower+0x5b4/0xb70 [mlx5_core] [176805.911010] tc_setup_cb_reoffload+0x27/0xb0 [176805.911648] fl_reoffload+0x62d/0x900 [cls_flower] [176805.912313] ? mlx5e_rep_indr_block_unbind+0xd0/0xd0 [mlx5_core] [176805.913151] ? __fl_put+0x230/0x230 [cls_flower] [176805.913768] ? filter_irq_stacks+0x90/0x90 [176805.914335] ? kasan_save_stack+0x1e/0x40 [176805.914893] ? kasan_set_track+0x21/0x30 [176805.915484] ? kasan_save_free_info+0x27/0x40 [176805.916105] tcf_block_playback_offloads+0x79/0x1f0 [176805.916773] ? mlx5e_rep_indr_block_unbind+0xd0/0xd0 [mlx5_core] [176805.917647] tcf_block_unbind+0x12d/0x330 [176805.918239] tcf_block_offload_cmd.isra.0+0x24e/0x320 [176805.918953] ? tcf_block_bind+0x770/0x770 [176805.919551] ? _raw_read_unlock_irqrestore+0x30/0x30 [176805.920236] ? mutex_lock+0x7d/0xd0 [176805.920735] ? mutex_unlock+0x80/0xd0 [176805.921255] tcf_block_offload_unbind+0xa5/0x120 [176805.921909] __tcf_block_put+0xc2/0x2d0 [176805.922467] ingress_destroy+0xf4/0x3d0 [sch_ingress] [176805.923178] __qdisc_destroy+0x9d/0x280 [176805.923741] dev_shutdown+0x1c6/0x330 [176805.924295] unregister_netdevice_many_notify+0x6ef/0x1500 [176805.925034] ? netdev_freemem+0x50/0x50 [176805.925610] ? _raw_spin_lock_irq+0x7b/0xd0 [176805.926235] ? _raw_spin_lock_bh+0xe0/0xe0 [176805.926849] unregister_netdevice_queue+0x1e0/0x280 [176805.927592] ? unregister_netdevice_many+0x10/0x10 [176805.928275] unregister_netdev+0x18/0x20 [176805.928835] mlx5e_vport_rep_unload+0xc0/0x200 [mlx5_core] [176805.929608] mlx5_esw_offloads_unload_rep+0x9d/0xc0 [mlx5_core] [176805.930492] mlx5_eswitch_unload_vf_vports+0x108/0x1a0 [mlx5_core] [176805.931422] ? mlx5_eswitch_unload_sf_vport+0x50/0x50 [mlx5_core] [176805.932304] ? rwsem_down_write_slowpath+0x11f0/0x11f0 [176805.932987] mlx5_eswitch_disable_sriov+0x6f9/0xa60 [mlx5_core] [176805.933807] ? mlx5_core_disable_hca+0xe1/0x130 [mlx5_core] [176805.934576] ? mlx5_eswitch_disable_locked+0x580/0x580 [mlx5_core] [176805.935463] mlx5_device_disable_sriov+0x138/0x490 [mlx5_core] [176805.936308] mlx5_sriov_disable+0x8c/0xb0 [mlx5_core] [176805.937063] remove_one+0x7f/0x210 [mlx5_core] [176805.937711] pci_device_remove+0x96/0x1c0 [176805.938289] device_release_driver_internal+0x361/0x520 [176805.938981] ? kobject_put+0x5c/0x330 [176805.939553] driver_detach+0xd7/0x1d0 [176805.940101] bus_remove_driver+0x11f/0x290 [176805.943847] pci_unregister_driver+0x23/0x1f0 [176805.944505] mlx5_cleanup+0xc/0x20 [mlx5_core] [176805.945189] __x64_sys_delete_module+0x2b3/0x450 [176805.945837] ? module_flags+0x300/0x300 [176805.946377] ? dput+0xc2/0x830 [176805.946848] ? __kasan_record_aux_stack+0x9c/0xb0 [176805.947555] ? __call_rcu_common.constprop.0+0x46c/0xb50 [176805.948338] ? fpregs_assert_state_consistent+0x1d/0xa0 [176805.949055] ? exit_to_user_mode_prepare+0x30/0x120 [176805.949713] do_syscall_64+0x3d/0x90 [176805.950226] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [176805.950904] RIP: 0033:0x7f7f42c3f5ab [176805.951462] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48 [176805.953710] RSP: 002b:00007fff07dc9d08 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [176805.954691] RAX: ffffffffffffffda RBX: 000055b6e91c01e0 RCX: 00007f7f42c3f5ab [176805.955691] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055b6e91c0248 [176805.956662] RBP: 000055b6e91c01e0 R08: 0000000000000000 R09: 0000000000000000 [176805.957601] R10: 00007f7f42d9eac0 R11: 0000000000000206 R12: 000055b6e91c0248 [176805.958593] R13: 0000000000000000 R14: 000055b6e91bfb38 R15: 0000000000000000 [176805.959599] </TASK> [176805.960324] Allocated by task 20490: [176805.960893] kasan_save_stack+0x1e/0x40 [176805.961463] kasan_set_track+0x21/0x30 [176805.962019] __kasan_kmalloc+0x77/0x90 [176805.962554] esw_offloads_init+0x1bb/0x480 [mlx5_core] [176805.963318] mlx5_eswitch_init+0xc70/0x15c0 [mlx5_core] [176805.964092] mlx5_init_one_devl_locked+0x366/0x1230 [mlx5_core] [176805.964902] probe_one+0x6f7/0xc90 [mlx5_core] [176805.965541] local_pci_probe+0xd7/0x180 [176805.966075] pci_device_probe+0x231/0x6f0 [176805.966631] really_probe+0x1d4/0xb50 [176805.967179] __driver_probe_device+0x18d/0x450 [176805.967810] driver_probe_device+0x49/0x120 [176805.968431] __driver_attach+0x1fb/0x490 [176805.968976] bus_for_each_dev+0xed/0x170 [176805.969560] bus_add_driver+0x21a/0x570 [176805.970124] driver_register+0x133/0x460 [176805.970684] 0xffffffffa0678065 [176805.971180] do_one_initcall+0x92/0x2b0 [176805.971744] do_init_module+0x22d/0x720 [176805.972318] load_module+0x58c3/0x63b0 [176805.972847] init_module_from_file+0xd2/0x130 [176805.973441] __x64_sys_finit_module+0x389/0x7c0 [176805.974045] do_syscall_64+0x3d/0x90 [176805.974556] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [176805.975566] Freed by task 27280: [176805.976077] kasan_save_stack+0x1e/0x40 [176805.976655] kasan_set_track+0x21/0x30 [176805.977221] kasan_save_free_info+0x27/0x40 [176805.977834] ____kasan_slab_free+0x11a/0x1b0 [176805.978505] __kmem_cache_free+0x163/0x2d0 [176805.979113] esw_offloads_cleanup_reps+0xb8/0x120 [mlx5_core] [176805.979963] mlx5_eswitch_cleanup+0x182/0x270 [mlx5_core] [176805.980763] mlx5_cleanup_once+0x9a/0x1e0 [mlx5_core] [176805.981477] mlx5_uninit_one+0xa9/0x180 [mlx5_core] [176805.982196] remove_one+0x8f/0x210 [mlx5_core] [176805.982868] pci_device_remove+0x96/0x1c0 [176805.983461] device_release_driver_internal+0x361/0x520 [176805.984169] driver_detach+0xd7/0x1d0 [176805.984702] bus_remove_driver+0x11f/0x290 [176805.985261] pci_unregister_driver+0x23/0x1f0 [176805.985847] mlx5_cleanup+0xc/0x20 [mlx5_core] [176805.986483] __x64_sys_delete_module+0x2b3/0x450 [176805.987126] do_syscall_64+0x3d/0x90 [176805.987665] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [176805.988667] Last potentially related work creation: [176805.989305] kasan_save_stack+0x1e/0x40 [176805.989839] __kasan_record_aux_stack+0x9c/0xb0 [176805.990443] kvfree_call_rcu+0x84/0xa30 [176805.990973] clean_xps_maps+0x265/0x6e0 [176805.991547] netif_reset_xps_queues.part.0+0x3f/0x80 [176805.992226] unregister_netdevice_many_notify+0xfcf/0x1500 [176805.992966] unregister_netdevice_queue+0x1e0/0x280 [176805.993638] unregister_netdev+0x18/0x20 [176805.994205] mlx5e_remove+0xba/0x1e0 [mlx5_core] [176805.994872] auxiliary_bus_remove+0x52/0x70 [176805.995490] device_release_driver_internal+0x361/0x520 [176805.996196] bus_remove_device+0x1e1/0x3d0 [176805.996767] device_del+0x390/0x980 [176805.997270] mlx5_rescan_drivers_locked.part.0+0x130/0x540 [mlx5_core] [176805.998195] mlx5_unregister_device+0x77/0xc0 [mlx5_core] [176805.998989] mlx5_uninit_one+0x41/0x180 [mlx5_core] [176805.999719] remove_one+0x8f/0x210 [mlx5_core] [176806.000387] pci_device_remove+0x96/0x1c0 [176806.000938] device_release_driver_internal+0x361/0x520 [176806.001612] unbind_store+0xd8/0xf0 [176806.002108] kernfs_fop_write_iter+0x2c0/0x440 [176806.002748] vfs_write+0x725/0xba0 [176806.003294] ksys_write+0xed/0x1c0 [176806.003823] do_syscall_64+0x3d/0x90 [176806.004357] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [176806.005317] The buggy address belongs to the object at ffff888155090a80 which belongs to the cache kmalloc-64 of size 64 [176806.006774] The buggy address is located 32 bytes inside of freed 64-byte region [ffff888155090a80, ffff888155090ac0) [176806.008773] The buggy address belongs to the physical page: [176806.009480] page:00000000a407e0e6 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x155090 [176806.010633] flags: 0x200000000000800(slab|node=0|zone=2) [176806.011352] page_type: 0xffffffff() [176806.011905] raw: 0200000000000800 ffff888100042640 ffffea000422b1c0 dead000000000004 [176806.012949] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [176806.013933] page dumped because: kasan: bad access detected [176806.014935] Memory state around the buggy address: [176806.015601] ffff888155090980: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [176806.016568] ffff888155090a00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [176806.017497] >ffff888155090a80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [176806.018438] ^ [176806.019007] ffff888155090b00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [176806.020001] ffff888155090b80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [176806.020996] ================================================================== Fixes: a508728a4c8b ("net/mlx5e: VF tunnel RX traffic offloading") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5: Fix fw tracer first block checkMoshe Shemesh
While handling new traces, to verify it is not the first block being written, last_timestamp is checked. But instead of checking it is non zero it is verified to be zero. Fix to verify last_timestamp is not zero. Fixes: c71ad41ccb0c ("net/mlx5: FW tracer, events handling") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Feras Daoud <ferasda@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: XDP, Drop fragmented packets larger than MTU sizeCarolina Jubran
XDP transmits fragmented packets that are larger than MTU size instead of dropping those packets. The drop check that checks whether a packet is larger than MTU is comparing MTU size against the linear part length only. Adjust the drop check to compare MTU size against both linear and non-linear part lengths to avoid transmitting fragmented packets larger than MTU size. Fixes: 39a1665d16a2 ("net/mlx5e: Implement sending multi buffer XDP frames") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Decrease num_block_tc when unblock tc offloadChris Mi
The cited commit increases num_block_tc when unblock tc offload. Actually should decrease it. Fixes: c8e350e62fc5 ("net/mlx5e: Make TC and IPsec offloads mutually exclusive on a netdev") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Fix overrun reported by coverityJianbo Liu
Coverity Scan reports the following issue. But it's impossible that mlx5_get_dev_index returns 7 for PF, even if the index is calculated from PCI FUNC ID. So add the checking to make coverity slience. CID 610894 (#2 of 2): Out-of-bounds write (OVERRUN) Overrunning array esw->fdb_table.offloads.peer_miss_rules of 4 8-byte elements at element index 7 (byte offset 63) using index mlx5_get_dev_index(peer_dev) (which evaluates to 7). Fixes: 9bee385a6e39 ("net/mlx5: E-switch, refactor FDB miss rule add/remove") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: fix a potential double-free in fs_udp_create_groupsDinghao Liu
When kcalloc() for ft->g succeeds but kvzalloc() for in fails, fs_udp_create_groups() will free ft->g. However, its caller fs_udp_create_table() will free ft->g again through calling mlx5e_destroy_flow_table(), which will lead to a double-free. Fix this by setting ft->g to NULL in fs_udp_create_groups(). Fixes: 1c80bd684388 ("net/mlx5e: Introduce Flow Steering UDP API") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Fix a race in command alloc flowShifeng Li
Fix a cmd->ent use after free due to a race on command entry. Such race occurs when one of the commands releases its last refcount and frees its index and entry while another process running command flush flow takes refcount to this command entry. The process which handles commands flush may see this command as needed to be flushed if the other process allocated a ent->idx but didn't set ent to cmd->ent_arr in cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into the spin lock. [70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core] [70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361 [70013.081968] [70013.082028] Workqueue: events aer_isr [70013.082053] Call Trace: [70013.082067] dump_stack+0x8b/0xbb [70013.082086] print_address_description+0x6a/0x270 [70013.082102] kasan_report+0x179/0x2c0 [70013.082173] mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core] [70013.082267] mlx5_cmd_flush+0x80/0x180 [mlx5_core] [70013.082304] mlx5_enter_error_state+0x106/0x1d0 [mlx5_core] [70013.082338] mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core] [70013.082377] remove_one+0x200/0x2b0 [mlx5_core] [70013.082409] pci_device_remove+0xf3/0x280 [70013.082439] device_release_driver_internal+0x1c3/0x470 [70013.082453] pci_stop_bus_device+0x109/0x160 [70013.082468] pci_stop_and_remove_bus_device+0xe/0x20 [70013.082485] pcie_do_fatal_recovery+0x167/0x550 [70013.082493] aer_isr+0x7d2/0x960 [70013.082543] process_one_work+0x65f/0x12d0 [70013.082556] worker_thread+0x87/0xb50 [70013.082571] kthread+0x2e9/0x3a0 [70013.082592] ret_from_fork+0x1f/0x40 The logical relationship of this error is as follows: aer_recover_work | ent->work -------------------------------------------+------------------------------ aer_recover_work_func | |- pcie_do_recovery | |- report_error_detected | |- mlx5_pci_err_detected |cmd_work_handler |- mlx5_enter_error_state | |- cmd_alloc_index |- enter_error_state | |- lock cmd->alloc_lock |- mlx5_cmd_flush | |- clear_bit |- mlx5_cmd_trigger_completions| |- unlock cmd->alloc_lock |- lock cmd->alloc_lock | |- vector = ~dev->cmd.vars.bitmask |- for_each_set_bit | |- cmd_ent_get(cmd->ent_arr[i]) (UAF) |- unlock cmd->alloc_lock | |- cmd->ent_arr[ent->idx]=ent The cmd->ent_arr[ent->idx] assignment and the bit clearing are not protected by the cmd->alloc_lock in cmd_work_handler(). Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler") Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()Shifeng Li
Out_sz that the size of out buffer is calculated using query_nic_vport _context_in structure when driver query the MAC list. However query_nic _vport_context_in structure is smaller than query_nic_vport_context_out. When allowed_list_size is greater than 96, calling ether_addr_copy() will trigger an slab-out-of-bounds. [ 1170.055866] BUG: KASAN: slab-out-of-bounds in mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core] [ 1170.055869] Read of size 4 at addr ffff88bdbc57d912 by task kworker/u128:1/461 [ 1170.055870] [ 1170.055932] Workqueue: mlx5_esw_wq esw_vport_change_handler [mlx5_core] [ 1170.055936] Call Trace: [ 1170.055949] dump_stack+0x8b/0xbb [ 1170.055958] print_address_description+0x6a/0x270 [ 1170.055961] kasan_report+0x179/0x2c0 [ 1170.056061] mlx5_query_nic_vport_mac_list+0x481/0x4d0 [mlx5_core] [ 1170.056162] esw_update_vport_addr_list+0x2c5/0xcd0 [mlx5_core] [ 1170.056257] esw_vport_change_handle_locked+0xd08/0x1a20 [mlx5_core] [ 1170.056377] esw_vport_change_handler+0x6b/0x90 [mlx5_core] [ 1170.056381] process_one_work+0x65f/0x12d0 [ 1170.056383] worker_thread+0x87/0xb50 [ 1170.056390] kthread+0x2e9/0x3a0 [ 1170.056394] ret_from_fork+0x1f/0x40 Fixes: e16aea2744ab ("net/mlx5: Introduce access functions to modify/query vport mac lists") Cc: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13net/mlx5e: fix double free of encap_headerVlad Buslov
Cited commit introduced potential double free since encap_header can be destroyed twice in some cases - once by error cleanup sequence in mlx5e_tc_tun_{create|update}_header_ipv{4|6}(), once by generic mlx5e_encap_put() that user calls as a result of getting an error from tunnel create|update. At the same time the point where e->encap_header is assigned can't be delayed because the function can still return non-error code 0 as a result of checking for NUD_VALID flag, which will cause neighbor update to dereference NULL encap_header. Fix the issue by: - Nulling local encap_header variables in mlx5e_tc_tun_{create|update}_header_ipv{4|6}() to make kfree(encap_header) call in error cleanup sequence noop after that point. - Assigning reformat_params.data from e->encap_header instead of local variable encap_header that was set to NULL pointer by previous step. Also assign reformat_params.size from e->encap_size for uniformity and in order to make the code less error-prone in the future. Fixes: d589e785baf5 ("net/mlx5e: Allow concurrent creation of encap entries") Reported-by: Dust Li <dust.li@linux.alibaba.com> Reported-by: Cruz Zhao <cruzzhao@linux.alibaba.com> Reported-by: Tianchen Ding <dtcccc@linux.alibaba.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13Revert "net/mlx5e: fix double free of encap_header"Vlad Buslov
This reverts commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 6f9b1a073166 ("net/mlx5e: fix double free of encap_header") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13Revert "net/mlx5e: fix double free of encap_header in update funcs"Vlad Buslov
This reverts commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb. This patch is causing a null ptr issue, the proper fix is in the next patch. Fixes: 3a4aa3cb8356 ("net/mlx5e: fix double free of encap_header in update funcs") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-12-13mlx5: implement VLAN tag XDP hintLarysa Zaremba
Implement the newly added .xmo_rx_vlan_tag() hint function. Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20231205210847.28460-15-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13veth: Implement VLAN tag XDP hintLarysa Zaremba
In order to test VLAN tag hint in hardware-independent selftests, implement newly added hint in veth driver. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-13-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>