Age | Commit message (Collapse) | Author |
|
The Qcom SD controller defines the usage of 0xF in data
timeout counter register (0x2E) which is actually a reserved
bit as per specification. This would result in maximum of 21.26 secs
timeout value.
Some SDcard taking more time than 2.67secs (timeout value corresponding
to 0xE) and with that observed data timeout errors.
So increasing the timeout value to max possible timeout.
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Sarthak Garg <sartgarg@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1628232901-30897-3-git-send-email-sartgarg@codeaurora.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Introduce max_timeout_count variable in the sdhci_host structure
and use in timeout calculation. By default its set to 0xE
(max timeout register value as per SDHC spec). But at the same time
vendors drivers can update it if they support different max timeout
register value than 0xE.
Signed-off-by: Sarthak Garg <sartgarg@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1628232901-30897-2-git-send-email-sartgarg@codeaurora.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
For unexplained reasons, the prescaler register for this device needs to
be cleared (set to 1) while performing a data read or else the command
will hang. This does not appear to affect the real clock rate sent out
on the bus, so I assume it's purely to work around a hardware bug.
During normal operation, the prescaler is already set to 1, so nothing
needs to be done. However, in "initial mode" (which is used for sub-MHz
clock speeds, like the core sets while enumerating cards), it's set to
128 and so we need to reset it during data reads. We currently fail to
do this for long reads.
This has no functional affect on the driver's operation currently
written, as the MMC core always sets a clock above 1MHz before
attempting any long reads. However, the core could conceivably set any
clock speed at any time and the driver should still work, so I think
this fix is worthwhile.
I personally encountered this issue while performing data recovery on an
external chip. My connections had poor signal integrity, so I modified
the core code to reduce the clock speed. Without this change, I saw the
card enumerate but was unable to actually read any data.
Writes don't seem to work in the situation described above even with
this change (and even if the workaround is extended to encompass data
write commands). I was not able to find a way to get them working.
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Link: https://lore.kernel.org/r/2fef280d8409ab0100c26c6ac7050227defd098d.1627818365.git.tommyhebb@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Print out the contents of the offending tuples when we do print them.
This can make it easier to debug, since these tuples are not exposed to
userspace anywhere else. We are limited to 64 bytes, so keep printing
out the full length in case the tuple is truncated.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Link: https://lore.kernel.org/r/20210726163654.1110969-2-sean.anderson@seco.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
CIS tuples in the range 0x80-0x8F are reserved for vendors. Some devices
have tuples in this range which get warned about every boot. Since this
is normal behavior, don't print these tuples unless debug is enabled.
Unfortunately, we cannot use a variable for the format string since it
gets pasted by pr_*_ratelimited.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Link: https://lore.kernel.org/r/20210726163654.1110969-1-sean.anderson@seco.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
There is a spelling mistake in a pr_err message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210728103254.171546-1-colin.king@canonical.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Skip printing a retune error when we scan for a removed card because we
then expect a failed command.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20210630041658.7574-1-wsa+renesas@sang-engineering.com
[Ulf: Rebased patch]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Make 'struct mmc_request' contain a pointer to the request's
'struct bio_crypt_ctx' directly, instead of extracting a 32-bit DUN from
it which is a cqhci-crypto specific detail.
This keeps the cqhci crypto specific details in the cqhci module, and it
makes mmc_core and mmc_block ready for MMC crypto hardware that accepts
the DUN and/or key in a way that is more flexible than that which will
be specified by the eMMC v5.2 standard. Exynos SoCs are an example of
such hardware, as their inline encryption hardware takes keys directly
(it has no concept of keyslots) and supports 128-bit DUNs.
Note that the 32-bit DUN length specified by the standard is very
restrictive, so it is likely that more hardware will support longer DUNs
despite it not following the standard. Thus, limiting the scope of the
32-bit DUN assumption to the place that actually needs it is warranted.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20210721154738.3966463-1-ebiggers@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
After the i.MX conversion to a DT-only platform, the mmc-esdhc-imx.h
header file is no longer used outside the driver, so move its content
to the sdhci-esdhc-imx driver and remove the header.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210719193413.3792615-1-festevam@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When mmc_blk_card_busy() calls card_busy_detect() to poll for the card's
state with CMD13, this is done without any delays in between the commands
being sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common __mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-4-ulf.hansson@linaro.org
|
|
When __mmc_blk_ioctl_cmd() calls card_busy_detect() to verify that the
card's states moves back into transfer state, the polling with CMD13 is
done without any delays in between the commands being sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-3-ulf.hansson@linaro.org
|
|
When mmc_blk_fix_state() sends a CMD12 to try to move the card into the
transfer state, it calls card_busy_detect() to poll for the card's state
with CMD13. This is done without any delays in between the commands being
sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-2-ulf.hansson@linaro.org
|
|
This driver has had problems when handling data errors. Add fault
injection support so that the abort handling can be easily triggered and
regression-tested. A hrtimer is used to indicate a data CRC error at
various points during the data transfer.
Note that for the recent problem with hangs in the case of some data CRC
errors, a udelay(10) inserted at the start of send_stop_abort() greatly
helped in triggering the error, but I've not included this as part of
the fault injection support since it seemed too specific.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Link: https://lore.kernel.org/r/20210701080534.23138-1-vincent.whitchurch@axis.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Infinite loops are hard to read and understand because of
hidden main loop condition. Simplify such one in mmc_spi_skip().
Using schedule() to schedule (and be friendly to others)
is discouraged and cond_resched() should be used instead.
Hence, replace schedule() with cond_resched() at the same
time.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210623101731.87885-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
If we find a reset handle when probing the MMCI block,
make sure the reset is de-asserted. It could happen that
a hardware has reset asserted at boot.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Yann Gautier <yann.gautier@foss.st.com>
Cc: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Yann Gautier <yann.gautier@foss.st.com>
Link: https://lore.kernel.org/r/20210630102408.3543024-1-linus.walleij@linaro.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
dmaengine_terminate_all() is deprecated in favor of explicitly saying if
it should be sync or async. Here, we want dmaengine_terminate_sync()
because there is no other synchronization code in the driver to handle
an async case.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20210623095734.3046-4-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
dmaengine_terminate_all() is deprecated in favor of explicitly saying if
it should be sync or async. Here, we want dmaengine_terminate_sync()
because there is no other synchronization code in the driver to handle
an async case.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20210623095734.3046-3-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
dmaengine_terminate_all() is deprecated in favor of explicitly saying if
it should be sync or async. Here, we want dmaengine_terminate_sync()
because there is no other synchronization code in the driver to handle
an async case.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20210623095734.3046-2-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
'of_property_read_variable_u32_array' function returns number
of elements read on success. This patch updates the condition
check in the driver to overwrite the tap values from DT if exist.
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-8-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Modify the data type of the clk_phase array to u32 to make it compatible
with the argument requirement of "of_property_read_variable_u32_array".
Addresses-coverity: ("incompatible_param")
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-7-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The division macro DIV_ROUND_CLOSEST takes int values as the argument.
However the code here uses unsigned int values for this, which is
causing the values comparison with 0 as always true. We can use
DIV_ROUND_CLOSEST_ULL instead for the same.
Addresses-coverity: ("result_independent_of_operands")
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-6-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
At a couple of places, the return values of the non-void functions were
not getting checked. This was reported by the coverity tool. Modify the
code to check the return values of the same.
Addresses-Coverity: ("check_return")
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-5-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
ZynqMP platform does not perform auto tuning in DDR50 mode. Skip the
same while the card is operating in DDR50 mode.
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-4-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Arasan controller supports AUTO CMD12, this patch adds
"SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12" quirk to enable auto cmd12
feature.
By using auto cmd12 we can also avoid following error message
"Got data interrupt even though no data operation in progress"
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-3-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
SD standard speed timing was met only at 19MHz and not 25 MHz, that's
why changing driver to 19MHz. The reason for this is when a level shifter
is used on the board, timing was met for standard speed only at 19MHz.
Since this level shifter is commonly required for high speed modes,
the driver is modified to use standard speed of 19Mhz.
Signed-off-by: Manish Narani <manish.narani@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1623753837-21035-2-git-send-email-manish.narani@xilinx.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We have this in two places, so let's have a dedicated function. It is
also more readable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20210624151616.38770-4-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
I wanted to use it in a wrong way, so document the intended way.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20210624151616.38770-3-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
PCI devices expose the associated MSI interrupts via sysfs, but platform
devices which utilize MSI interrupts do not. This information is important
for user space tools to optimize affinity settings.
Utilize the generic MSI sysfs facility to expose this information for
platform MSI.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210813035628.6844-3-21cnbao@gmail.com
|
|
Move PCI's MSI sysfs code to the irq core so that other busses such as
platform can reuse it.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210813035628.6844-2-21cnbao@gmail.com
|
|
Add FORCE so that if_changed can detect the command line change.
scsi_devinfo_tbl.c must be added to 'targets' too.
Link: https://lore.kernel.org/r/20210819012339.709409-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Use scsi_cmd_to_rq(cmd)->tag instead of scsi_cmnd.tag as preference.
Link: https://lore.kernel.org/r/1629294801-32102-2-git-send-email-john.garry@huawei.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Device reset and target reset will be using different calling sequences, so
open-code __qla2xxx_eh_generic_reset() in qla2xxx_eh_device_reset(), and
remove the now obsolete function __qla2xxx_eh_generic_reset(). No
functional changes.
Link: https://lore.kernel.org/r/20210819091913.94436-4-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Device reset and target reset will be using different calling sequences, so
open-code __qla2xxx_eh_generic_reset() in qla2xxx_eh_target_reset(). No
functional changes.
Link: https://lore.kernel.org/r/20210819091913.94436-3-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When calling bus reset the driver will be doing a full SAN resync, so there
is no need to wait for any pending RSCNs; they'll be re-issued during
resync anyway.
Link: https://lore.kernel.org/r/20210819091913.94436-2-hare@suse.de
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Link: https://lore.kernel.org/r/20210817051315.2477-13-njavali@marvell.com
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
drivers/scsi/qla2xxx/qla_edif.c:213:25-29: Unneeded variable: "rval". Return "0"
on line 264
Remove unneeded variable used to store return value.
Generated by: scripts/coccinelle/misc/returnvar.cocci
Link: https://lore.kernel.org/r/20210817051315.2477-12-njavali@marvell.com
Fixes: 7ebb336e45ef ("scsi: qla2xxx: edif: Add start + stop bsgs")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When Target port transitions personality from one to another (NVMe <-->
FCP), there could be some overlap of the two where one layer is going down
while the other layer is coming up. This overlap can cause temporary I/O
error. Detect those errors/transitions and recover from them. Triggers
session tear down and allow relogin to re-drive the connection under the
following conditions:
- NVMe command error
- On PRLO + N2N (rida format 2)
Link: https://lore.kernel.org/r/20210817051315.2477-11-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For target port that register itself as both FCP + NVMe, initiator driver
will try to login one mode at a time. If the last mode did not succeed,
then driver will try the other mode.
When error is encountered, current code only flip to other mode one time
(NVMe->FCP) and remain on the last mode. Driver wrongly assumed target
port does not support PRLI NVMe, instead it was not ready to receive PRLI.
This patch will alternate back and forth on every PRLI failure until login
retry count has depleted or it is succeeded.
Link: https://lore.kernel.org/r/20210817051315.2477-10-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The abort callback gets called only when it gets posted to firmware. The
refcounting is done properly in the callback. On internal errors, the
callback is not invoked leading to a hung I/O. Fix this by having separate
error code when command gets returned from firmware.
Link: https://lore.kernel.org/r/20210817051315.2477-9-njavali@marvell.com
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently driver saves the personality type (FCP|NVMe) at the start of
first discovery of the remote device. If the remote device personality do
change over time, then qla driver needs to present that to user to decide.
Link: https://lore.kernel.org/r/20210817051315.2477-8-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For initiator mode, always do secure login when authentication app started.
Also remove redundant flags to indicate secure connection.
Link: https://lore.kernel.org/r/20210817051315.2477-7-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
For EDIF + N2N to work, firmware 9.8 or later is required. The driver will
pause after PLOGI to allow app to authenticate. Once authentication
completes, app will tell driver to do PRLI.
Link: https://lore.kernel.org/r/20210817051315.2477-6-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following hung task call trace was seen:
[ 1230.183294] INFO: task qla2xxx_wq:523 blocked for more than 120 seconds.
[ 1230.197749] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1230.205585] qla2xxx_wq D 0 523 2 0x80004000
[ 1230.205636] Workqueue: qla2xxx_wq qlt_free_session_done [qla2xxx]
[ 1230.205639] Call Trace:
[ 1230.208100] __schedule+0x2c4/0x700
[ 1230.211607] schedule+0x38/0xa0
[ 1230.214769] schedule_timeout+0x246/0x2f0
[ 1230.222651] wait_for_completion+0x97/0x100
[ 1230.226921] qlt_free_session_done+0x6a0/0x6f0 [qla2xxx]
[ 1230.232254] process_one_work+0x1a7/0x360
...when device side port resets were done.
Abort threads were getting out without processing due to the "deleted"
flag check. The delete thread, meanwhile, could not proceed with a
logout (that would have cleared out pending requests) as the logout IOCB
work was not progressing. It appears like the hung qlt_free_session_done()
thread is causing the ha->wq works on hold. The qlt_free_session_done()
was hung waiting for nvme_fc_unregister_remoteport() + localport_delete cb
to be complete, which would only happen when all I/Os are released.
Fix this by allowing abort to progress until device delete is completely
done. This should make the qlt_free_session_done() proceed without hang and
thus clear up the deadlock.
Link: https://lore.kernel.org/r/20210817051315.2477-5-njavali@marvell.com
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
edif_enabled is prematurely turned on if hardware is capable of handling
the feature. However, firmware also needs to support EDIF before enabling
this bit.
Link: https://lore.kernel.org/r/20210817051315.2477-4-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Reject inflight AUTH ELS if driver is going through session recovery.
Link: https://lore.kernel.org/r/20210817051315.2477-3-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When firmware indicates session has been torn down via UPDATE SA IOCB or
ELS Passthrough IOCB, the driver needs to also tear down the session.
Link: https://lore.kernel.org/r/20210817051315.2477-2-njavali@marvell.com
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling. The actual cleanup in case of error is
already handled by the caller of null_gendisk_register().
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210818144542.19305-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pass in a request_queue and assign disk->queue in __blk_alloc_disk to
ensure struct gendisk always has a valid ->queue pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210816131910.615153-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This was a leftover from the legacy alloc_disk interface. Switch
the scsi ULPs and dasd to set ->minors directly like all other
drivers and remove the argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Haberland <sth@linux.ibm.com> [dasd]
Link: https://lore.kernel.org/r/20210816131910.615153-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|