Age | Commit message (Collapse) | Author |
|
As opposed to Spectrum-1, in which time stamps arrive through a pair of
dedicated events into a queue and later are being matched to the
corresponding packets, in Spectrum-2 we are reading the time stamps
directly from the CQE. Software can get the time stamp in UTC format
using CQEv2.
Add a time stamp field to 'struct mlxsw_skb_cb'. In
mlxsw_pci_cqe_{rdq,sdq}_handle() extract the time stamp from the CQE into
the new time stamp field. Note that the time stamp in the CQE is
represented by 38 bits, which is a short representation of UTC time.
Software should create the full time stamp using the global UTC clock.
Read UTC clock from hardware only for PTP packets which were trapped to CPU
with PTP0 trap ID (event packets).
Use the time stamp from the SKB when packet is received or transmitted.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are
sent as control packets is not updated at egress port. To overcome this
limitation, PTP packets which require time stamp, should be sent as data
packets with the following details:
1. FID valid = 1
2. FID value above the maximum FID
3. rx_router_port = 1
>From Spectrum-4 and on, this limitation will be solved.
Extend the function which handles TX header, in case that the packet is
a PTP packet, add TX header with type=data and all the above mentioned
requirements. Add operation as part of 'struct mlxsw_sp_ptp_ops', to be
able to separate the handling of PTP packets between different ASICs. Use
the data packet solution only for Spectrum-2 and Spectrum-3. Therefore, add
a dedicated operation structure for Spectrum-4, as it will be same to
Spectrum-2 in PTP implementation, just will not have the limitation of
control packets.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement physical hardware clock operations. The main difference between
the existing operations of Spectrum-1 and the new operations of Spectrum-2
is the usage of UTC hardware clock instead of FRC.
Add support for init() and fini() functions for PTP clock in Spectrum-2.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Query UTC sec and nsec PCI offsets during the pci_init(), to be able to
read UTC time later.
Implement functions to read UTC seconds and nanoseconds from the offset
which was read as part of initialization.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Lay the groundwork for Spectrum-2 support. On Spectrum-2, the packets get
the time stamps from the CQE, which means that the time stamp is attached
to its packet.
Configure MTPTPT to set which message types should arrive under which
PTP trap. PTP0 will be used for event message types, which means that
the packets require time stamp. PTP1 will be used for other packets.
Note that in Spectrum-2, all packets contain time stamp by default. The two
types of traps (PTP0, PTP1) will be used to separate between PTP_EVENT
traps and PTP_GENERAL traps, so then the driver will fill the time stamp as
part of the SKB only for event message types.
Later the driver will enable the traps using 'MTPCPC.ptp_trap_en' bit.
Then, PTP packets start arriving through the PTP traps.
Currently, the structure 'mlxsw_sp2_ptp_state' contains only the common
structure, the next patches will extend it.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, Tx completions are reported using Completion Queue Element
version 1 (CQEv1). These elements do not contain the Tx time stamp,
which is fine as Spectrum-1 reads Tx time stamps via a dedicated FIFO
and Spectrum-2 does not currently support PTP.
In preparation for Spectrum-2 PTP support, use CQEv2 for Spectrum-2 and
newer ASICs, as this CQE format encodes the Tx time stamp.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
MTPTPT register is used to set which message types should arrive under
which PTP trap. Currently, PTP0 is used for event message types, which
means that the packets require time stamp. PTP1 is used for other packets.
This configuration will be same for Spectrum-2 and newer ASICs. In
preparation for Spectrum-2 PTP support, add helper functions to
configure PTP traps and use them for Spectrum-1. These functions will be
used later also for Spectrum-2.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to support AVIC on SNP-enabled system, The IOMMU driver needs to
check EFR2[SNPAVICSup] and enables the support by setting SNPAVICEn bit
in the IOMMU control register (MMIO offset 18h).
For detail, please see section "SEV-SNP Guest Virtual APIC Support" of the
AMD I/O Virtualization Technology (IOMMU) Specification.
(https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf)
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20220726134348.6438-3-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently, enabling AVIC requires individually detect and enable GAM and
GALOG features on each IOMMU, which is difficult to keep track on
multi-IOMMU system, where the features needs to be enabled system-wide.
In addition, these features do not need to be enabled in early stage.
It can be delayed until after amd_iommu_init_pci().
Therefore, consolidate logic for detecting and enabling IOMMU GAM and
GALOG features into a helper function, enable_iommus_vapic(), which uses
the check_feature_on_all_iommus() helper function to ensure system-wide
support of the features before enabling them, and postpone until after
amd_iommu_init_pci().
The new function also check and clean up feature enablement residue from
previous boot (e.g. in case of booting into kdump kernel), which triggers
a WARN_ON (shown below) introduced by the commit a8d4a37d1bb9 ("iommu/amd:
Restore GA log/tail pointer on host resume") in iommu_ga_log_enable().
[ 7.731955] ------------[ cut here ]------------
[ 7.736575] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:829 iommu_ga_log_enable.isra.0+0x16f/0x190
[ 7.746135] Modules linked in:
[ 7.749193] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W -------- --- 5.19.0-0.rc7.53.eln120.x86_64 #1
[ 7.759706] Hardware name: Dell Inc. PowerEdge R7525/04D5GJ, BIOS 2.1.6 03/09/2021
[ 7.767274] RIP: 0010:iommu_ga_log_enable.isra.0+0x16f/0x190
[ 7.772931] Code: 20 20 00 00 8b 00 f6 c4 01 74 da 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 13 48 83 c4 10 5b 5d e9 f5 00 72 00 0f 0b eb e1 <0f> 0b eb dd e8 f8 66 42 00 48 8b 15 f1 85 53 01 e9 29 ff ff ff 48
[ 7.791679] RSP: 0018:ffffc90000107d20 EFLAGS: 00010206
[ 7.796905] RAX: ffffc90000780000 RBX: 0000000000000100 RCX: ffffc90000780000
[ 7.804038] RDX: 0000000000000001 RSI: ffffc90000780000 RDI: ffff8880451f9800
[ 7.811170] RBP: ffff8880451f9800 R08: ffffffffffffffff R09: 0000000000000000
[ 7.818303] R10: 0000000000000000 R11: 0000000000000000 R12: 0008000000000000
[ 7.825435] R13: ffff8880462ea900 R14: 0000000000000021 R15: 0000000000000000
[ 7.832572] FS: 0000000000000000(0000) GS:ffff888054a00000(0000) knlGS:0000000000000000
[ 7.840657] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.846400] CR2: ffff888054dff000 CR3: 0000000053210000 CR4: 0000000000350eb0
[ 7.853533] Call Trace:
[ 7.855979] <TASK>
[ 7.858085] amd_iommu_enable_interrupts+0x180/0x270
[ 7.863051] ? iommu_setup+0x271/0x271
[ 7.866803] state_next+0x197/0x2c0
[ 7.870295] ? iommu_setup+0x271/0x271
[ 7.874049] iommu_go_to_state+0x24/0x2c
[ 7.877976] amd_iommu_init+0xf/0x29
[ 7.881554] pci_iommu_init+0xe/0x36
[ 7.885133] do_one_initcall+0x44/0x200
[ 7.888975] do_initcalls+0xc8/0xe1
[ 7.892466] kernel_init_freeable+0x14c/0x199
[ 7.896826] ? rest_init+0xd0/0xd0
[ 7.900231] kernel_init+0x16/0x130
[ 7.903723] ret_from_fork+0x22/0x30
[ 7.907306] </TASK>
[ 7.909497] ---[ end trace 0000000000000000 ]---
Fixes: commit a8d4a37d1bb9 ("iommu/amd: Restore GA log/tail pointer on host resume")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Will Deacon <will@kernel.org> (maintainer:IOMMU DRIVERS)
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20220726134348.6438-2-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
If CONFIG_ACPI_IORT=y and CONFIG_IOMMU_API is not set,
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-,
will be failed, like this:
drivers/acpi/arm64/iort.c: In function ‘iort_get_rmr_sids’:
drivers/acpi/arm64/iort.c:1406:2: error: implicit declaration of function ‘iort_iommu_rmr_get_resv_regions’; did you mean ‘iort_iommu_get_resv_regions’? [-Werror=implicit-function-declaration]
iort_iommu_rmr_get_resv_regions(iommu_fwnode, NULL, head);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
iort_iommu_get_resv_regions
cc1: some warnings being treated as errors
make[3]: *** [drivers/acpi/arm64/iort.o] Error 1
The function iort_iommu_rmr_get_resv_regions()
is declared under CONFIG_IOMMU_API,
and the callers of iort_get_rmr_sids() and iort_put_rmr_sids()
would select IOMMU_API.
To fix this error, move the definitions to #ifdef CONFIG_IOMMU_API.
Fixes: e302eea8f497 ("ACPI/IORT: Add a helper to retrieve RMR info directly")
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20220726033520.47865-1-renzhijie2@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Aside of urb->transfer_buffer_length and urb->context which might
change in the TX path, all the other URB parameters remains constant
during runtime. So, there is no reasons to call usb_fill_bulk_urb()
each time before submitting an URB.
Make sure to initialize all the fields of the URB at allocation
time. For the TX branch, replace the call usb_fill_bulk_urb() by an
assignment of urb->context. urb->urb->transfer_buffer_length is
already set by the caller functions, no need to set it again. For the
RX branch, because all parameters are unchanged, simply remove the
call to usb_fill_bulk_urb().
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20220729080902.25839-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Remove the success initializer from the ret variable in rtw_pwr_wakeup,
as we set it later anyway in the success path, and also set on failure.
This makes the function appear cleaner and more consistent.
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Philipp Hortmann <philipp.g.hortmann@gmail.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20220728231150.972-2-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The same function to read the switch id is used by drivers based on
qca8k family switch. Move them to common code to make them accessible
also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same port LAG functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same port VLAN functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Also drop exposing busy_wait and make it static.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same port mirror functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same port FDB/MDB function are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
Also drop bulk read/write functions and make them static
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same set age, MTU and port enable/disable function are used by
driver based on qca8k family switch.
Move them to common code to make them accessible also by other drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same bridge functions are used by drivers based on qca8k family
switch. Move them to common code to make them accessible also by other
drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same logic to disable/enable port, set eee and get ethtool stats is
used by drivers based on qca8k family switch.
Move it to common code to make it accessible also by other drivers.
While at it also drop unnecessary qca8k_priv cast for void pointers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same mib function is used by drivers based on qca8k family switch.
Move it to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same ATU function are used by drivers based on qca8k family switch.
Move the bulk read/write helper to common code to declare these shared
ATU functions in common code.
These helper will be dropped when regmap correctly support bulk
read/write.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same reg table and read/write/rmw function are used by drivers
based on qca8k family switch.
Move them to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The same MIB struct is used by drivers based on qca8k family switch. Move
it to common code to make it accessible also by other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some switch may not support mib autocast feature and require the legacy
way of reading the regs directly.
Make the mib autocast feature optional and permit to declare support for
it using match_data struct in a dedicated qca8k_info_ops struct.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Using of_device_get_match_data is expensive. Cache match data to speed
up access and rework user of match data to use the new cached value.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since we have a proper endianness converters for BE 48-bit data use
them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220726144906.5217-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Correct spelling on 'non-existent' in comment
Signed-off-by: Ruffalo Lavoisier <RuffaloLavoisier@gmail.com>
Link: https://lore.kernel.org/r/20220728032854.151180-1-RuffaloLavoisier@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix following includecheck warning:
./drivers/net/ethernet/mellanox/mlxsw/core_linecard_dev.c: linux/err.h is included more than once.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220727233801.23781-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let the core take the devlink instance lock around health callbacks and
remove the now redundant locking in the drivers.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change devlink instance locks in mlx5 driver to have devlink health
recovery callback locked, while keeping all driver paths which lead to
devl_ API functions called by the driver locked.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change devlink instance locks in mlx4 driver to have devlink reload
callback locked, while keeping all driver paths which leads to devl_ API
functions called by the mlx4 driver locked.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use devl_ API to call devl_port_register() and devl_port_unregister()
instead of devlink_port_register() and devlink_port_unregister(). Add
devlink instance lock in mlx4 driver paths to these functions.
This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use devl_ API to call devl_region_create() and devl_region_destroy()
instead of devlink_region_create() and devlink_region_destroy().
Add devlink instance lock in mlx4 driver paths to these functions.
This will be used by the downstream patch to invoke mlx4 devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change devlink instance locks in mlx5 driver to have devlink reload
callbacks locked, while keeping all driver paths which lead to devl_ API
functions called by the driver locked.
Add mlx5_load_one_devl_locked() and mlx5_unload_one_devl_locked() which
are used by the paths which are already locked such as devlink reload
callbacks.
This patch makes the driver use devl_ API also for traps register as
these functions are called from the driver paths parallel to reload that
requires locking now.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor fw reset code to have the unload driver part done on
mlx5_fw_reset_complete_reload(), so if it was called by the PF which
initiated the reload fw activate flow, the unload part will be handled
by the mlx5_devlink_reload_fw_activate() callback itself and not by the
reset event work.
This will be used by the downstream patch to invoke devlink reload
callbacks with devlink lock held.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add callbacks
=============
.selftest_check: returns true for flash selftest.
.selftest_run: runs a flash selftest.
Also, refactor NVM APIs so that they can be
used with devlink and ethtool both.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let the TLS TX recycle pool be more flexible in size, by continuously
and dynamically allocating and releasing HW resources in response to
changes in the connections rate and load.
Allocate and release pool entries in bulks (16). Use a workqueue to
release/allocate in the background. Allocate a new bulk when the pool
size goes lower than the low threshold (1K). Symmetric operation is done
when the pool size gets greater than the upper threshold (4K).
Every idle pool entry holds: 1 TIS, 1 DEK (HW resources), in addition to
~100 bytes in host memory.
Start with an empty pool to minimize memory and HW resources waste for
non-TLS users that have the device-offload TLS enabled.
Upon a new request, in case the pool is empty, do not wait for a whole bulk
allocation to complete. Instead, trigger an instant allocation of a single
resource to reduce latency.
Performance tests:
Before: 11,684 CPS
After: 16,556 CPS
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The transport interface send (TIS) object is responsible for performing
all transport related operations of the transmit side. The ConnectX HW
uses a TIS object to save and access the TLS crypto information and state
of an offloaded TX kTLS connection.
Before this patch, we used to create a new TIS per connection and destroy
it once it’s closed. Every create and destroy of a TIS is a FW command.
Same applies for the private TLS context, where we used to dynamically
allocate and free it per connection.
Resources recycling reduce the impact of the allocation/free operations
and helps speeding up the connection rate.
In this feature we maintain a pool of TX objects and use it to recycle
the resources instead of re-creating them per connection.
A cached TIS popped from the pool is updated to serve the new connection
via the fast-path HW interface, updating the tls static and progress
params. This is a very fast operation, significantly faster than FW
commands.
On recycling, a WQE fence is required after the context params change.
This guarantees that the data is sent after the context has been
successfully updated in hardware, and that the context modification
doesn't interfere with existing traffic.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let the caller of mlx5e_ktls_tx_handle_ooo() take care of updating the
stats, according to the returned value. As the switch/case blocks are
already there, this change saves unnecessary branches in the handler.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TLS TIS objects have a defined role in mapping and reaching the HW TLS
contexts. Some standard TIS attributes (like LAG port affinity) are
not relevant for them.
Use a dedicated TLS TIS create function instead of the generic
mlx5e_create_tis.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
delete extra space and tab in blank line, there is no functional change.
Signed-off-by: Xie Shaowen <studentxswpy@163.com>
Link: https://lore.kernel.org/r/20220727081253.3043941-1-studentxswpy@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 2aec377a2925 ("dm table: remove dm_table_get_num_targets()
wrapper") in linux-dm/for-next removed the function
dm_table_get_num_targets() which is used by verity-loadpin. Access
table->num_targets directly instead of using the defunct wrapper.
Fixes: b6c1c5745ccc ("dm: Add verity helpers for LoadPin")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220728085412.1.I242d21b378410eb6f9897a3160efb56e5608c59d@changeid
|
|
Pull drm fix from Dave Airlie:
"Quiet extra week, just a single fix for i915 workaround with execlist
backend.
i915:
- Further reset robustness improvements for execlists [Wa_22011802037]"
* tag 'drm-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm:
drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add an optional flag that ensures dm_bufio_client does not sleep
(primary focus is to service dm_bufio_get without sleeping). This
allows the dm-bufio cache to be queried from interrupt context.
To ensure that dm-bufio does not sleep, dm-bufio must use a spinlock
instead of a mutex. Additionally, to avoid deadlocks, special care
must be taken so that dm-bufio does not sleep while holding the
spinlock.
But again: the scope of this no_sleep is initially confined to
dm_bufio_get, so __alloc_buffer_wait_no_callback is _not_ changed to
avoid sleeping because __bufio_new avoids allocation for NF_GET.
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
Add a flags argument to dm_bufio_client_create and update all the
callers. This is in preparation to add the DM_BUFIO_NO_SLEEP flag.
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone")
introduced the optimization to _not_ perform bio_associate_blkg()'s
relatively costly work when DM core clones its bio. But in doing so it
exposed the possibility for DM's cloned bio to alter DM target
behavior (e.g. crash) if a target were to issue IO without first
calling bio_set_dev().
The DM raid target can trigger an MD crash due to its need to split
the DM bio that is passed to md_handle_request(). The split will
recurse to submit_bio_noacct() using a bio with an uninitialized
->bi_blkg. This NULL bio->bi_blkg causes blk_throtl_bio() to
dereference a NULL blkg_to_tg(bio->bi_blkg).
Fix this in DM core by adding a new 'needs_bio_set_dev' target flag that
will make alloc_tio() call bio_set_dev() on behalf of the target.
dm-raid is the only target that requires this flag. bio_set_dev()
initializes the DM cloned bio's ->bi_blkg, using bio_associate_blkg,
before passing the bio to md_handle_request().
Long-term fix would be to audit and refactor MD code to rely on DM to
split its bio, using dm_accept_partial_bio(), but there are MD raid
personalities (e.g. raid1 and raid10) whose implementation are tightly
coupled to handling the bio splitting inline.
Fixes: ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
There is a KASAN warning in raid_resume when running the lvm test
lvconvert-raid.sh. The reason for the warning is that mddev->raid_disks
is greater than rs->raid_disks, so the loop touches one entry beyond
the allocated length.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
There is this warning when using a kernel with the address sanitizer
and running this testsuite:
https://gitlab.com/cki-project/kernel-tests/-/tree/main/storage/swraid/scsi_raid
==================================================================
BUG: KASAN: slab-out-of-bounds in raid_status+0x1747/0x2820 [dm_raid]
Read of size 4 at addr ffff888079d2c7e8 by task lvcreate/13319
CPU: 0 PID: 13319 Comm: lvcreate Not tainted 5.18.0-0.rc3.<snip> #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x6a/0x9c
print_address_description.constprop.0+0x1f/0x1e0
print_report.cold+0x55/0x244
kasan_report+0xc9/0x100
raid_status+0x1747/0x2820 [dm_raid]
dm_ima_measure_on_table_load+0x4b8/0xca0 [dm_mod]
table_load+0x35c/0x630 [dm_mod]
ctl_ioctl+0x411/0x630 [dm_mod]
dm_ctl_ioctl+0xa/0x10 [dm_mod]
__x64_sys_ioctl+0x12a/0x1a0
do_syscall_64+0x5b/0x80
The warning is caused by reading conf->max_nr_stripes in raid_status. The
code in raid_status reads mddev->private, casts it to struct r5conf and
reads the entry max_nr_stripes.
However, if we have different raid type than 4/5/6, mddev->private
doesn't point to struct r5conf; it may point to struct r0conf, struct
r1conf, struct r10conf or struct mpconf. If we cast a pointer to one
of these structs to struct r5conf, we will be reading invalid memory
and KASAN warns about it.
Fix this bug by reading struct r5conf only if raid type is 4, 5 or 6.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|