Age | Commit message (Collapse) | Author |
|
The rk3568 usb2phy is a standalone device with a single muxed interrupt.
Add support for the registers to the usb2phy driver.
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Michael Riesch <michael.riesch@wolfvision.net>
Link: https://lore.kernel.org/r/20211215210252.120923-7-pgwipeout@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The rk3568 usb2phy has a single muxed interrupt that handles all
interrupts.
Allow the driver to plug in only a single interrupt as necessary.
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Michael Riesch <michael.riesch@wolfvision.net>
Link: https://lore.kernel.org/r/20211215210252.120923-6-pgwipeout@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
New Rockchip devices have the usb2 phy devices as standalone nodes
instead of children of the grf node.
Allow the driver to find the grf node from a phandle.
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Michael Riesch <michael.riesch@wolfvision.net>
Link: https://lore.kernel.org/r/20211215210252.120923-5-pgwipeout@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
New Rockchip devices have the usb phy nodes as standalone devices.
These nodes have register nodes with #address_cells = 2, but only use 32
bit addresses.
Adjust the driver to check if the returned address is "0", and adjust
the index in that case.
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Michael Riesch <michael.riesch@wolfvision.net>
Link: https://lore.kernel.org/r/20211215210252.120923-4-pgwipeout@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
This patch adds basic support for EtherType RX frame steering for
LLDP and PTP using the hardware offload capabilities.
Example steps for setting up RX frame steering for LLDP and PTP:
$ IFDEVNAME=eth0
$ tc qdisc add dev $IFDEVNAME ingress
$ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
For LLDP
$ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88cc \
flower hw_tc 5
OR
$ tc filter add dev $IFDEVNAME parent ffff: protocol LLDP \
flower hw_tc 5
For PTP
$ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88f7 \
flower hw_tc 6
Show tc ingress filter
$ tc filter show dev $IFDEVNAME ingress
v1->v2:
Thanks to Kurt's and Sebastian's suggestion.
- change from __be16 to u16 etype
- change ETHER_TYPE_FULL_MASK to use cpu_to_be16() macro
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch series extends the current supported bridge flags with the
following flags: BR_FLOOD, BR_BCAST_FLOOD and BR_LEARNING.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While the 'iopolicy' sysfs attribute can be set at runtime, most
storage arrays prefer to use the 'round-robin' iopolicy per default.
We can use udev rules to set this, but is getting rather unwieldy
for rebranded arrays as we would have to update the udev rules
anytime a new array shows up, leading to the same mess we currently
have in multipathd for configuring the RDAC arrays.
Hence this patch adds a module parameter 'iopolicy' to allow the
admin to switch the default, and to do away with the need for a
udev rule here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The variable 'ctrl' became useless since the code using it was dropped
from nvme_setup_cmd() in the commit 292ddf67bbd5 ("nvme: increment
request genctr on completion"). Fix it to get rid of this compilation
warning in the nvme-5.17 branch:
drivers/nvme/host/core.c: In function ‘nvme_setup_cmd’:
drivers/nvme/host/core.c:993:20: warning: unused variable ‘ctrl’ [-Wunused-variable]
struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
^~~~
Fixes: 292ddf67bbd5 ("nvme: increment request genctr on completion")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The nvme request generation counter is intended to catch duplicate
completions. Incrementing the counter on submission means duplicates can
only be caught if the request tag is reallocated and dispatched prior to
the driver observing the corrupted CQE. Incrementing on completion
removes this window, making it possible to detect duplicate completions
in consecutive entries.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Currently applications have a hard time figuring out which
nvme-over-fabrics arguments are supported for any given kernel;
the ioctl will return an error code on failure, and the application
has to guess whether this was due to an invalid argument or due
to a connection or controller error.
With this patch applications can read a list of supported
arguments by simply reading from /dev/nvme-fabrics, allowing
them to validate the connection string.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Constify 'struct spi_nor_fixups' in order to respect flash_info
structure declaration.
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Pratyush Yadav <p.yadav@ti.com>
Link: https://lore.kernel.org/r/20211106102915.153552-1-tudor.ambarus@microchip.com
|
|
Remove the references to the old spi-nor.c file.
The old drivers/mtd/spi-nor/spi-nor.c file
is not more present and now some of its code is contained in:
drivers/mtd/spi-nor/core.c
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
[tudor.ambarus@microchip.com:
- remove change in Documentation/driver-api/mtd/spi-nor.rst.
The documentation has to be rewritten entirely.
- update commit message]
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/r/20210126092516.1431913-1-f.suligoi@asem.it
|
|
Update the driver version to newer version format i.e. 8.0.0.61.0.
Link: https://lore.kernel.org/r/20211220141159.16117-26-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Set reply queue depth of 1K for B0 and 4K for A0.
While freeing the segmented request queues use the actual queue depth that
is used while creating them.
Link: https://lore.kernel.org/r/20211220141159.16117-25-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Enhance driver to consider MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a success
for TMs issued by it and check the pending I/Os to decide the success or
failure of the task management requests instead of just considering the
MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a failure of the task management
request.
Link: https://lore.kernel.org/r/20211220141159.16117-24-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Remove locally defined TM response codes and use codes from MPI3 headers.
Link: https://lore.kernel.org/r/20211220141159.16117-23-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add support for the io_uring interface in I/O-polled mode.
This feature is disabled in the driver by default. To enable the feature, a
module parameter "poll_queues" has to be set with the desired number of
polling queues.
When the feature is enabled, the driver reserves a certain number of
operational queue pairs for the poll_queues either from the available queue
pairs or creates additional queue pairs based on the operational queue
availability.
The Polling queues will have corresponding IRQ and ISR functions as similar
to default queues. However, the IRQ line is disabled by the driver for
poll_queues.
Link: https://lore.kernel.org/r/20211220141159.16117-22-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Print cable management & temperature threshold event data.
Use vendor id & device id macro definitions from MPI3 headers.
Link: https://lore.kernel.org/r/20211220141159.16117-21-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The IOC sends a Prepare for Reset Event to the host to prepare for a Soft
Reset. This event data has two reason codes:
1. Start - The host is expected to gracefully quiesce all I/O within
approximately 1 second.
2. Abort - The IOC is requesting to abort a previous Prepare for Reset
Event request. Normal I/O may be resumed.
Link: https://lore.kernel.org/r/20211220141159.16117-20-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add Event acknowledgment logic.
Link: https://lore.kernel.org/r/20211220141159.16117-19-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Enhance driver to gracefully handle discrepancies in certain key data sizes
between firmware update operations as mentioned below:
- The driver displays an error message and marks the controller as
unrecoverable if the firmware reports ReplyFrameSize that is greater
than the current ReplyFrameSize.
- If the firmware reports ReplyFrameSize greater than the current
ReplyFrameSize then the driver uses the current ReplyFrameSize while
copying the reply messages.
- The driver displays an error message and marks the controller as
unrecoverable if the firmware reports MaxOperationalReplyQueues less
than the currently allocated operational reply queues count.
- If the firmware reports MaxOperationalReplyQueues that is greater than
the currently allocated operational reply queue count then the driver
ignores the new increased value and uses the previously allocated number
of operational queues only.
- If the firmware reports MaxDevHandle greater than the previously used
MaxDevHandle value after a reset then the driver re-allocates the
'device remove pending bitmap' buffer with the newer size using
krealloc().
Link: https://lore.kernel.org/r/20211220141159.16117-18-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Detect asynchronous reset that occurred in the firmware by polling for
reset history bit of IOC status register is set and if that bit is set,
then the driver waits for the controller to become ready and then
re-initializes the controller.
Also reduce the time driver is waiting for the controller to acknowledge
the reset action after issuing a specific reset action to the
controller. The wait time is reduced from 510 seconds to 30 seconds. If the
controller didn't acknowledge a specific reset action within the time
interval then the driver marks the controller as unrecoverable instead of
retrying two more times prior to giving up.
Link: https://lore.kernel.org/r/20211220141159.16117-17-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add IOC reinitialization function.
Link: https://lore.kernel.org/r/20211220141159.16117-16-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently the driver marks the controller as unrecoverable if there is an
asynchronous reset or fault during the initialization, reinitialization
post reset, and OS resume.
Enhance driver to retry the initialization, re-initialization, and resume
sequences for a maximum of 3 times if the controller became faulty or
asynchronously reset due to a firmware activation during the initialization
sequence.
Link: https://lore.kernel.org/r/20211220141159.16117-15-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Move the IOC initialization's bring up logic to mpi3mr_bring_ioc_ready()
routine.
Link: https://lore.kernel.org/r/20211220141159.16117-14-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Separate out reply and sense buffer allocation and initialization into two
routines and call only initialization routine while issuing the IOC Init
request message.
Also move out the event enable logic to a separate function.
Link: https://lore.kernel.org/r/20211220141159.16117-13-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Save snapdump and fault the controller with the given reason code if it is
already not in the fault or not in asynchronous reset. This ensures that
soft reset is issued from the watchdog thread. This will also be used to
handle initialization time faults/resets/timeout as in those cases
immediate soft reset invocation is not required.
Link: https://lore.kernel.org/r/20211220141159.16117-12-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Display IOC firmware package version by reading component image upload
data.
Link: https://lore.kernel.org/r/20211220141159.16117-11-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following special handling is needed for UNMAP commands issued to NVMe
drives:
- On B0 boards, if the parameter list length is greater than 24 and not a
16-byte multiple, then truncate the parameter list length to a 16-byte
multiple.
- On A0 boards, if the parameter list length is greater than block
descriptor data length + 8, then truncate the parameter list length to
block descriptor data length + 8 value.
Link: https://lore.kernel.org/r/20211220141159.16117-10-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
- Increase internal command timeout to 60 seconds.
- Enable 16 device removal handshake processing in parallel in the device
removal handshake infrastructure.
Link: https://lore.kernel.org/r/20211220141159.16117-9-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add validation for various access statuses prior to exposing attached
target device to the operating system.
Link: https://lore.kernel.org/r/20211220141159.16117-8-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The SAS4 Controller firmware exposes the SES devices in Managed PCIe Switch
as a PCIe Device Type SCSI Device
(MPI3_DEVICE0_PCIE_DEVICE_INFO_TYPE_SCSI_DEVICE).
Driver is enhanced to handle this device type by:
- Exposing the device to the upper layers and
- Not updating any hardware sectors & virtual boundary settings as these
settings are needed only for NVMe devices.
Link: https://lore.kernel.org/r/20211220141159.16117-7-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Continued updating MPI3 headers.
Link: https://lore.kernel.org/r/20211220141159.16117-6-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update MPI3 headers.
Link: https://lore.kernel.org/r/20211220141159.16117-5-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Don't issue the soft reset if internal commands are flushed out with reset
status. Soft reset needs to be issued only if commands are really timed
out.
Link: https://lore.kernel.org/r/20211220141159.16117-4-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Use spin_lock_irqsave() instead of spin_lock() while acquiring
reply_free_queue_lock & sbq_lock locks.
Link: https://lore.kernel.org/r/20211220141159.16117-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add debug print functions which will print messages based on logging_level
bits enabled.
Link: https://lore.kernel.org/r/20211220141159.16117-2-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver doesn't express DMA addressing limitation under 32-bits anywhere
else, so remove the spurious GFP_DMA allocation.
Link: https://lore.kernel.org/r/20211222092247.928711-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver doesn't express DMA addressing limitation under 32-bits anywhere
else, so remove the spurious GFP_DMA allocation.
Link: https://lore.kernel.org/r/20211222092048.925829-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The myrs devices supports 64-bit addressing, so remove the spurious GFP_DMA
allocations.
Link: https://lore.kernel.org/r/20211222091935.925624-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver doesn't express DMA addressing limitation under 32-bits anywhere
else, so remove the spurious GFP_DMA allocation.
Link: https://lore.kernel.org/r/20211222091801.924745-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver doesn't express DMA addressing limitation under 32-bits anywhere
else, so remove the spurious GFP_DMA allocation.
Link: https://lore.kernel.org/r/20211222091630.922788-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The allocated buffers are used as a command payload, for which the block
layer and/or DMA API do the proper bounce buffering if needed.
Link: https://lore.kernel.org/r/20211222090842.920724-1-hch@lst.de
Reported-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The allocated buffers are used as a command payload, for which the block
layer and/or DMA API do the proper bounce buffering if needed.
Link: https://lore.kernel.org/r/20211222090311.916624-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.
All but the new error handling paths added by the commits given in the
Fixes tag below.
Fix these error handling paths and branch to 'err_out'.
Fixes: 166f431ec6be ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
(cherry picked from commit 31108d142f3632970f6f3e0224bd1c6781c9f87d)
|
|
When there is ct or sample action, the ct or sample rule will be deleted
and return. But if there is an extra mirror action, the forward rule can't
be deleted because of the return.
Fix it by removing the return.
Fixes: 69e2916ebce4 ("net/mlx5: CT: Add support for mirroring")
Fixes: f94d6389f6a8 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
There are two ICOSQs per channel: one is needed for RX, and the other
for async operations (XSK TX, kTLS offload). Currently, the recovery
flow for both is the same, and async ICOSQ is mistakenly treated like
the regular ICOSQ.
This patch prevents running the regular ICOSQ recovery on async ICOSQ.
The purpose of async ICOSQ is to handle XSK wakeup requests and post
kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs,
so the regular recovery sequence is not applicable here.
Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing
recovery for the ICOSQ, don't forget to deactivate XSKRQ.
XSK can be opened and closed while channels are active, so a new mutex
prevents the ICOSQ recovery from running at the same time. The ICOSQ
recovery deactivates and reactivates XSKRQ, so any parallel change in
XSK state would break consistency. As the regular RQ is running, it's
not enough to just flush the recovery work, because it can be
rescheduled.
Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.
Fixes: 28e7606fa8f1 ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.
mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
[mlx5_core]
Call Trace:
mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
devlink_health_do_dump.part.91+0x71/0xd0
devlink_health_report+0x157/0x1b0
mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
[mlx5_core]
? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
? update_load_avg+0x19b/0x550
? set_next_entity+0x72/0x80
? pick_next_task_fair+0x227/0x340
? finish_task_switch+0xa2/0x280
mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
process_one_work+0x1de/0x3a0
worker_thread+0x2d/0x3c0
? process_one_work+0x3a0/0x3a0
kthread+0x115/0x130
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x30
--[ end trace 51ccabea504edaff ]---
RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
end Kernel panic - not syncing: Fatal exception
To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.
Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|