summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-01-13scsi: qla2xxx: Move some messages from debug to normal log levelSaurav Kashyap
This change will aid in debugging issues arising because of dropped frame, DIF errors, queue full etc where debug level is not set. Link: https://lore.kernel.org/r/20210111093134.1206-4-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: qla2xxx: Add error counters to debugfs nodeSaurav Kashyap
Display error counters via debugfs node. Link: https://lore.kernel.org/r/20210111093134.1206-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: qla2xxx: Implementation to get and manage host, target stats and ↵Saurav Kashyap
initiator port This statistics will help in debugging process and checking specific error counts. It also provides a capability to isolate the port or bring it out of isolation. Link: https://lore.kernel.org/r/20210111093134.1206-2-njavali@marvell.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: qedf: Simplify bool comparisonYANG LI
Fix the following coccicheck warning: ./drivers/scsi/qedf/qedf_main.c:3716:5-31: WARNING: Comparison to bool Link: https://lore.kernel.org/r/1610357368-62866-1-git-send-email-abaci-bugfix@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: aha1542: Fix multi-line comment styleSergey Shtylyov
Some comments in this driver don't comply with the preferred multi-line comment style, as reported by 'scripts/checkpatch.pl': WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line Fix those comments, along with the (unreported for some reason?) starts of the multi-line comments not being /* on their own line... Link: https://lore.kernel.org/r/08c231e5-d86f-9d0b-19ac-ad46fa0c0b58@omprussia.ru Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: aha1542: Kill trailing whitespaceSergey Shtylyov
Some source lines (mostly the comments) in this driver end with spaces, as reported by 'scripts/checkpatch.pl'. Trim these lines. Link: https://lore.kernel.org/r/59829052-4932-4ea3-b504-857bbb19e6a0@omprussia.ru Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: aha1542: Clarify 'struct ccb' commentsSergey Shtylyov
This driver's original authors did pretty bad job of documenting the Command Control Block (CCB) structure -- especially its 2nd byte, where the bit numbers were completely left out. Sync up the 'struct ccb' comments to the Adaptec AHA-154xA manual. Link: https://lore.kernel.org/r/17a7be14-a9d2-9822-bb3e-1d7385f486b0@omprussia.ru Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: ufs: A tad optimization in query upiu traceAvri Altman
Remove a redundant if clause in ufshcd_add_query_upiu_trace. Link: https://lore.kernel.org/r/20210110084618.189371-1-avri.altman@wdc.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: target: file: Don't zero iter before iov_iter_bvecPavel Begunkov
iov_iter_bvec() initialises iterators well, no need to pre-zero it beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it out and generate extra code for that (confirmed with assembly). Link: https://lore.kernel.org/r/34cd22d6cec046e3adf402accb1453cc255b9042.1610207523.git.asml.silence@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Log SATA IOMB completion status on failureVishakha Channapattan
Added a log message in SATA completion path to capture the status of failed command. If the status does not match any expected status, another message will be logged. On IO failure with known status, the log message will be: [ 1712.951735] pm80xx0:: mpi_sata_completion 2269: IO failed device_id 16385 status 0x1 tag XX If the firmware returns unexpected status, a message of the following format will be logged: [ 1712.951735] pm80xx0:: mpi_sata_completion XXXX: Unknown status device_id XXXXX status 0xX tag XX Link: https://lore.kernel.org/r/20210109123849.17098-8-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Simultaneous poll for all FW readinessBhavesh Jashnani
In check_fw_ready() we first wait for ILA to come up and then we wait for RAAE to come up and IOPs and so on. This is a sequential check. Because of this, ILA image seems to be not ready in the allocated time and so the driver marks it as "not ready" and then moves on to other FW images. ILA does become ready eventually, but is not checked again. The driver concludes that FW is not ready when it actually is. Instead of sequentially polling each image, we keep polling for all images to be ready. The timeout for the polling has been set to the sum of what was used for each individual image. Link: https://lore.kernel.org/r/20210109123849.17098-7-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Bhavesh Jashnani <bjashnani@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Fix driver fatal dump failureViswas G
The function pm80xx_get_fatal_dump() has two issues that result in the fatal dump not being able to complete successfully. 1. Trying to collect fatal_logs from the application fails because we are not shifting the MEMBASE-II register properly. Once we read 64K region of data we have to shift the MEMBASE-II register and read the next chunk. Only then would we be able to get complete data. 2. If a timeout occurs, our application will get stuck. Link: https://lore.kernel.org/r/20210109123849.17098-6-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Fix missing tag_free in NVMD DATA reqakshatzen
Tag was not freed in NVMD get/set data request failure scenario. This caused a tag leak each time a request failed. Link: https://lore.kernel.org/r/20210109123849.17098-5-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: akshatzen <akshatzen@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Check main config table addressakshatzen
The driver initializes main configuration, general status, inbound queue and outbound queue table addresses based on a value read from MSGU_SCRATCH_PAD_0 register. We should validate these addresses before dereferencing them. Adds two validations: 1. Check if main configuration table offset lies within the pcibar mapped 2. Check if first dword of main configuration table reads "PMCS" There are two calls to init_pci_device_addresses() done during pm8001_pci_probe() in this sequence: 1. First inside chip_soft_rst, where if init_pci_device_addresses fails we will go ahead assuming MPI state is not ready and reset the device as long as bootloader is okay. This gives chance to second call of init_pci_device_addresses to set up the addresses after reset. 2. The second call is via pm80xx_chip_init, after soft reset is done and firmware is checked to be ready. Once that is done we are safe to go ahead and initialize default table values and use them. Tests: 1. Enabled debugging logs and observed no issues during initialization, with a controller with no issues: pm80xx0:: pm8001_setup_msix 1034: pci_alloc_irq_vectors request ret:64 no of intr 64 pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000 pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0 pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50 pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4 pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128 pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928 pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408 pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608 pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval) pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval) pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval) pm80xx0:: pm80xx_chip_soft_rst 1446: reset register before write : 0x0 pm80xx0:: pm80xx_chip_soft_rst 1478: reset register after write 0x40 pm80xx0:: pm80xx_chip_soft_rst 1544: SPCv soft reset Complete pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000 value 0x40002000 pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0 pm80xx0:: init_pci_device_addresses 952: VALID main config signature 0x53434d50 pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4 pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128 pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928 pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408 pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608 pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval) general status (ptrval) pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd (ptrval) pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt (ptrval) pm80xx0:: pm80xx_chip_init 1329: MPI initialize successful! 2. Tested controller with firmware known to have initialization issue and observed no crashes with this fix: pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38 pm80xx 0000:01:00.0: Removing from 1:1 domain pm80xx 0000:01:00.0: Requesting non-1:1 mappings pm80xx0:: init_pci_device_addresses 948: BAD main config signature 0x0 pm80xx0:: mpi_uninit_check 1365: Failed to init pci addresses pm80xx0:: pm80xx_chip_soft_rst 1435: MPI state is not ready scratch:0:8:62a01000:0 pm80xx0:: pm80xx_chip_soft_rst 1518: Firmware is not ready! pm80xx0:: pm80xx_chip_soft_rst 1532: iButton Feature is not Available!!! pm80xx0:: pm80xx_chip_init 1301: Firmware is not ready! pm80xx0:: pm8001_pci_probe 1215: chip_init failed [ret: -16] pm80xx: probe of 0000:01:00.0 failed with error -16 pm80xx 0000:07:00.0: pm80xx: driver version 0.1.38 pm80xx 0000:07:00.0: Removing from 1:1 domain pm80xx 0000:07:00.0: Requesting non-1:1 mappings scsi host6: pm80xx pm80xx1:: pm8001_setup_sgpio 5568: failed sgpio_req timeout pm80xx1:: mpi_phy_start_resp 3447: phy start resp status:0x0, phyid:0x0 pm80xx 0000:08:00.0: pm80xx: driver version 0.1.38 pm80xx 0000:08:00.0: Removing from 1:1 domain pm80xx 0000:08:00.0: Requesting non-1:1 mappings 3. Without this fix we observe crash on the same controller: pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38 pm80xx 0000:01:00.0: Removing from 1:1 domain pm80xx 0000:01:00.0: Requesting non-1:1 mappings [<ffffffffc0451b3b>] pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx] [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx] RIP: 0010:pm80xx_chip_soft_rst+0x71/0x4c0 [pm80xx] [<ffffffffc0451b3b>] ? pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx] [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx] pm80xx0:: mpi_uninit_check 1339: TIMEOUT:IBDB value/=2 pm80xx0:: pm80xx_chip_soft_rst 1387: MPI state is not ready scratch:0:8:62a01000:0 pm80xx0:: pm80xx_chip_soft_rst 1470: Firmware is not ready! pm80xx0:: pm80xx_chip_soft_rst 1484: iButton Feature is not Available!!! pm80xx0:: pm80xx_chip_init 1266: Firmware is not ready! pm80xx0:: pm8001_pci_probe 1207: chip_init failed [ret: -16] pm80xx: probe of 0000:01:00.0 failed with error -16 Link: https://lore.kernel.org/r/20210109123849.17098-4-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: akshatzen <akshatzen@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Check for fatal errorakshatzen
When the controller runs into a fatal error, commands get stuck due to no response. If the controller is in fatal error state, abort requests issued to the controller get stuck too. Check the controller state for fatal error conditions. Link: https://lore.kernel.org/r/20210109123849.17098-3-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: akshatzen <akshatzen@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: pm80xx: Do not busy wait in MPI init checkakshatzen
We do not need to busy wait during mpi_init_check() since it is not being invoked in atomic context. mpi_init_check() is being called from pm8001_pci_resume(), pm8001_pci_probe(). Hence we are replacing udelay with msleep. Link: https://lore.kernel.org/r/20210109123849.17098-2-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: akshatzen <akshatzen@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: ufs-qcom: Fix ufs RST_n spec violationZiqi Chen
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs device, RST_n signal should be between VSS(Ground) and VCCQ/VCCQ2. Link: https://lore.kernel.org/r/1610103385-45755-3-git-send-email-ziqichen@codeaurora.org Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: ufs: core: Fix ufs clk specs violationZiqi Chen
According to the spec (JESD220E chapter 7.2), while powering off/on the ufs device, REF_CLK signal should be between VSS(Ground) and VCCQ/VCCQ2. Link: https://lore.kernel.org/r/1610103385-45755-2-git-send-email-ziqichen@codeaurora.org Reviewed-by: Can Guo <cang@codeaurora.org> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Ziqi Chen <ziqichen@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: isci: Remove the unneeded variable "status"YANG LI
The variable 'status' is being initialized with SCI_SUCCESS and never updated later with a new value. The initialization is redundant and can be removed. Link: https://lore.kernel.org/r/1609311860-102820-1-git-send-email-abaci-bugfix@linux.alibaba.com Reported-by: Abaci <abaci@linux.alibaba.com> Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: fnic: Fix memleak in vnic_dev_init_devcmd2Dinghao Liu
When ioread32() returns 0xFFFFFFFF, we should execute cleanup functions like other error handling paths before returning. Link: https://lore.kernel.org/r/20201225083520.22015-1-dinghao.liu@zju.edu.cn Acked-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: libfc: Avoid invoking response handler twice if ep is already completedJaved Hasan
A race condition exists between the response handler getting called because of exchange_mgr_reset() (which clears out all the active XIDs) and the response we get via an interrupt. Sequence of events: rport ba0200: Port timeout, state PLOGI rport ba0200: Port entered PLOGI state from PLOGI state xid 1052: Exchange timer armed : 20000 msecs  xid timer armed here rport ba0200: Received LOGO request while in state PLOGI rport ba0200: Delete port rport ba0200: work event 3 rport ba0200: lld callback ev 3 bnx2fc: rport_event_hdlr: event = 3, port_id = 0xba0200 bnx2fc: ba0200 - rport not created Yet!! /* Here we reset any outstanding exchanges before freeing rport using the exch_mgr_reset() */ xid 1052: Exchange timer canceled /* Here we got two responses for one xid */ xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: invoking resp(), esb 20000000 state 3 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 xid 1052: fc_rport_plogi_resp() : ep->resp_active 2 Skip the response if the exchange is already completed. Link: https://lore.kernel.org/r/20201215194731.2326-1-jhasan@marvell.com Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: scsi_transport_srp: Don't block target in failfast stateMartin Wilck
If the port is in SRP_RPORT_FAIL_FAST state when srp_reconnect_rport() is entered, a transition to SDEV_BLOCK would be illegal, and a kernel WARNING would be triggered. Skip scsi_target_block() in this case. Link: https://lore.kernel.org/r/20210111142541.21534-1-mwilck@suse.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: docs: ABI: sysfs-driver-ufs: Rectify table formattingLukas Bulwahn
Commit 0b2894cd0fdf ("scsi: docs: ABI: sysfs-driver-ufs: Add DeepSleep power mode") adds new entries in tables of sysfs-driver-ufs ABI documentation, but formatted the table incorrectly. Hence, make htmldocs warns: ./Documentation/ABI/testing/sysfs-driver-ufs:{915,956}: WARNING: Malformed table. Text in column margin in table line 15. Rectify table formatting for DeepSleep power mode. Link: https://lore.kernel.org/r/20210111102212.19377-1-lukas.bulwahn@gmail.com Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-12scsi: ufs: ufs-debugfs: Add error countersAdrian Hunter
People testing have a need to know how many errors might be occurring over time. Add error counters and expose them via debugfs. A module initcall is used to create a debugfs root directory for ufshcd-related items. In the case that modules are built-in, then initialization is done in link order, so move ufshcd-core to the top of the Makefile. Link: https://lore.kernel.org/r/20210107072538.21782-1-adrian.hunter@intel.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: storvsc: Validate length of incoming packet in ↵Andrea Parri (Microsoft)
storvsc_on_channel_callback() Check that the packet is of the expected size at least, don't copy data past the packet. Link: https://lore.kernel.org/r/20201217203321.4539-4-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Reported-by: Saruhan Karademir <skarade@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: storvsc: Resolve data race in storvsc_probe()Andrea Parri (Microsoft)
vmscsi_size_delta can be written concurrently by multiple instances of storvsc_probe(), corresponding to multiple synthetic IDE/SCSI devices; cf. storvsc_drv's probe_type == PROBE_PREFER_ASYNCHRONOUS. Change the global variable vmscsi_size_delta to per-synthetic-IDE/SCSI-device. Link: https://lore.kernel.org/r/20201217203321.4539-3-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Suggested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: storvsc: Fix max_outstanding_req_per_channel for Win8 and newerAndrea Parri (Microsoft)
Current code overestimates the value of max_outstanding_req_per_channel for Win8 and newer hosts, since vmscsi_size_delta is set to the initial value of sizeof(vmscsi_win8_extension) rather than zero. This may lead to wrong decisions when using ring_avail_percent_lowater equals to zero. The estimate of max_outstanding_req_per_channel is 'exact' for Win7 and older hosts. A better choice, keeping the algorithm for the estimation simple, is to err the other way around, i.e., to underestimate for Win7 and older but to use the exact value for Win8 and newer. Link: https://lore.kernel.org/r/20201217203321.4539-2-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Suggested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Update lpfc version to 12.8.0.7James Smart
Update lpfc version to 12.8.0.7 Link: https://lore.kernel.org/r/20210104180240.46824-16-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Enhancements to LOG_TRACE_EVENT for better readabilityJames Smart
While testing recent discovery node rework, several items were seen that could be done better with respect to the new trace event logic. 1) in the following msg: kernel: lpfc 0000:44:00.0: start 35 end 35 cnt 0 If cnt is zero in the 1st message, there is no reason to display the 1st message, which is just giving start/end positioning. Fix by not displaying message if cnt is 0. 2) If the driver is loaded with module log verbosity off, and later a single NPIV host instance verbosity is enabled via sysfs, it enables messages on all instances. This is due to the trace log verbosity checks (lpfc_dmp_dbg) looking at the phba only. It should look at the phba and the vport. Fix by enabling a check on both phba and vport. 3) in the following messages: 2904 Firmware Dump Image Present on Adapter 2887 Reset Needed: Attempting Port Recovery... These messages are not necessary for the trace event log, which is primarily for discovery. Fix by changing log level on these 2 messages to LOG_SLI. Link: https://lore.kernel.org/r/20210104180240.46824-15-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Implement health checking when aborting I/OJames Smart
Several errors have occurred where the adapter stops or fails but does not raise the register values for the driver to detect failure. Thus driver is unaware of the failure. The failure typically results in I/O timeouts, the I/O timeout handler failing (after several seconds), and the error handler escalating recovery policy and resulting in more errors. Eventually, the driver is in a position where things have spiraled and it can't do recovery because other recovery ops are still outstanding and it becomes unusable. Resolve the situation by having the I/O timeout handler (actually a els, SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O, perform a mailbox command and look for a response from the hardware. If the mailbox command fails, it will mark the adapter offline and then invoke the adapter reset handler to clean up. The new I/O timeout test will be limited to a test every 5s. If there are multiple I/O timeouts concurrently, only the 1st I/O timeout will generate the mailbox command. Further testing will only occur once a timeout occurs after a 5s delay from the last mailbox command has expired. Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix crash when nvmet transport calls host_releaseJames Smart
When lpfc is running in NVMET mode and supports the NVME-1 addendum changes, a LIP on a bound NVME Initiator or lipping the lpfc NVMET's link resulted in an Oops in lpfc_nvmet_host_release. The fix requires lpfc NVMET to maintain an additional reference on any node structure that acts as the hosthandle for the NVMET transport. This reference get is a one-time addition, is taken prior to the upcall of an unsolicited LS_REQ, and is released when the NVMET transport releases the hosthandle during the host_release downcall. Link: https://lore.kernel.org/r/20210104180240.46824-13-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix vport create loggingJames Smart
When with testing with large numbers of npiv vports and link bounces, the driver is flooding the messages file, even with log_verbose = 0. The new LOG_TRACE_EVENT messages are still generating events to the messages files. Fix by converting the vport create msg from LOG_TRACE_EVENT to LOG_VPORT. Link: https://lore.kernel.org/r/20210104180240.46824-12-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix NVMe recovery after mailbox timeoutJames Smart
If a mailbox command times out, the SLI port is deemed in error and the port is reset. The HBA cleanup is not returning I/Os to the NVMe layer before the port is unregistered. This is due to the HBA being marked offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler rather than an general adapter reset routine. The mailbox timeout handler mailbox handler only cleaned up SCSI I/Os. Fix by reworking the mailbox handler to: - After handling the mailbox error, detect the board is already in failure (may be due to another error), and leave cleanup to the other handler. - If the mailbox command timeout is initial detector of the port error, continue with the board cleanup and marking the adapter offline (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic reset adapter routine that is subsequently invoked, will clean up the I/Os. - Have the reset adapter routine flush all NVMe and SCSI I/Os if the adapter has been marked failed (!SLI_ACTIVE). - Rework the NVMe I/O terminate routine to take a status code to fail the I/O with and update so that cleaned up I/O calls the wqe completion routine. Currently it is bypassing the wqe cleanup and calling the NVMe I/O completion directly. The wqe completion routine will take care of data structure and node cleanup then call the NVMe I/O completion handler. Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix target reset failingJames Smart
Target reset is failed by the target as an invalid command. The Target Reset TMF has been obsoleted in T10 for a while, but continues to be used. On (newer) devices, the TMF is rejected causing the reset handler to escalate to adapter resets. Fix by having Target Reset TMF rejections be translated into a LOGO and re-PLOGI with the target device. This provides the same semantic action (although, if the device also supports nvme traffic, it will terminate nvme traffic as well - but it's still recoverable). Link: https://lore.kernel.org/r/20210104180240.46824-10-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix error log messages being logged following SCSI task mgntJames Smart
A successful task mgmt command is logging errors, making it look like problems were encountered. This is due to log messages for the device/target and bus reset handlers having the LOG_TRACE_EVENT flag set. Fix by adjusting the event flag such that the call to the logging routine only receives a LOG_TRACE_EVENT if a prior call actually failed. Link: https://lore.kernel.org/r/20210104180240.46824-9-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Prevent duplicate requests to unregister with cpuhp frameworkJames Smart
In the lpfc offline routine, called for various reasons such as sysfs attribute, driver unload, or port error, the driver is calling __lpfc_cpuhp_remove() to destroy the hot plug data. If the offline routine is called while the driver is in the process of being unloaded, a request using lpfc_cpuhp_remove() is also made from lpfc_sli4_hba_unset(). The cpuhp elements are no longer valid when the second removal request is made. Fix by only calling the cpuhp removal once when the adapter is in the process of unloading. Link: https://lore.kernel.org/r/20210104180240.46824-8-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix FW reset action if I/Os are outstandingJames Smart
If the port is configured for NVME and has any outstanding IOs when a FW reset is requesteed, outstanding I/Os are not properly cleaned up. This causes the fw download request to fail. Fix by clearing the LPFC_SLI_ACTIVE flag to signify the I/O must be manually flushed by the driver on port reset. Link: https://lore.kernel.org/r/20210104180240.46824-7-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Use the nvme-fc transport supplied timeout for LS requestsJames Smart
When lpfc generates a GEN_REQUEST wqe for the nvme LS (such as Create Association), the timeout is set to R_A_TOV without regard to the timeout value supplied by the nvme-fc transport. The driver should be setting the timeout to the value passed into the routine. Additionally the caller should be setting the timeout value to the value in the ls request set by the nvme transport. Instead, it unconditionally is setting it to a driver defined value. So the driver actually overrode the value twice. Fix by using the timeout provided to the routine, and for the caller, set the timeout to the ls request timeout value. Link: https://lore.kernel.org/r/20210104180240.46824-6-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix crash when a fabric node is released prematurelyJames Smart
The driver's management of the fabric controller (aka pseudo-scsi initiator) node in SLI3 mode is causing this crash. The crash occurs because of a node reference imbalance that frees the fabric controller node while devloss is outstanding from the SCSI transport. This is triggered by an odd behavior where the switch reacts to a rejected RDP request with a PLOGI and nothing else, not even a LOGO. The driver ACKS the PLOGI and after successfully registering the RPI, incorrectly registers the fabric controller node because it has the NLP_FC4_FCP flag still set from the fabric controller PRLI. If a LIP is issued, the driver attempts to cleanup on Link Up and ends up executing too many puts. Fix by detecting the fabric node type and clearing out the nodes internal flags that triggered a SCSI transport registration and subsequence dev_loss event. The driver cannot count on any persistence from fabric controller nodes. Link: https://lore.kernel.org/r/20210104180240.46824-5-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Refresh ndlp when a new PRLI is received in the PRLI issue stateJames Smart
Testing with target ports coming and going, the driver eventually reached a state where it no longer discovered the target. When the driver has issued a PRLI and receives a PRLI from the target, it is not properly updating the node's initiator/target role flags. Thus, when a subsequent RSCN is received for a target loss, the driver mis-identifies the target as an initiator and does not initiate LUN scanning. Fix by always refreshing the ndlp with the latest PRLI state information whenever a PRLI is processed. Also clear the ndlp flags when processing a PLOGI so that there is no carry over through a re-login. Link: https://lore.kernel.org/r/20210104180240.46824-4-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3James Smart
A very long time ago, there was a feature: auto sli mode. It gave the user the ability to auto select the SLI mode (SLI2 or SLI3) to run the port in, or even force SLI2 mode if configured. Because of the convoluted logic, the CONFIG_PORT mbox command ends up being called 2 or 3 times. It should have been called only once. Additionally, the driver no longer supports SLI-2, so only SLI-3 mode should be allowed. The following changes were made: - Force module parameter to SLI3 only. - Rip out redundant CONFIG_PORT mbox commands. - Force CONFIG_PORT mbox command to be in beginning of enable ISR routine. - Added changes for offline to online behavior Link: https://lore.kernel.org/r/20210104180240.46824-3-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: lpfc: Fix PLOGI S_ID of 0 on pt2pt configJames Smart
Under some pt2pt situations, the other end of the link may issue a LOGO after successfully completing PLOGI and assigning addresses to the port. Thus the driver may attempt a new PLOGI to re-create the login, but the LOGO handling cleared the address back to 0. Once this happens, the other end, which may be address 0, gets all confused and this cannot be resolved without an administrative action to bounce the link. Fix by assuming that address assignment only occurs on the 1st PLOGI after link up, and regardless of login state, the address assignment sticks. The FC standards aren't particularly clear in this situation (it only describes initial PLOGI), but there is nothing that contradicts this and behaviors on the devices tested appears to conform to the understanding. Thus, don't reset the port address to 0 as part of LOGO handling. Port addresses will only reset on link down. Link: https://lore.kernel.org/r/20210104180240.46824-2-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: hisi_sas: Remove auto_affine_msi_experimental module_paramJohn Garry
Now that the driver always uses managed interrupts, delete auto_affine_msi_experimental module param. Link: https://lore.kernel.org/r/1609763622-34119-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ufs: Fix tm request when non-fatal error happensJaegeuk Kim
When non-fatal error like line-reset happens, ufshcd_err_handler() starts to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task management request, we hit two warnings: WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c After fixing the above warnings we hit another tm_cmd timeout which may be caused by unstable controller state: __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It turned out ufshcd_print_trs() printed too many messages on console which requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too verbose messages. This is actually not an error case. Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()Jaegeuk Kim
When gate_work/ungate_work experience an error during hibern8_enter or exit we can livelock: ufshcd_err_handler() ufshcd_scsi_block_requests() ufshcd_reset_and_restore() ufshcd_clear_ua_wluns() -> stuck ufshcd_scsi_unblock_requests() In order to avoid this, ufshcd_clear_ua_wluns() can be called per recovery flows such as suspend/resume, link_recovery, and error_handler. Link: https://lore.kernel.org/r/20210107185316.788815-2-jaegeuk@kernel.org Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ufs: Replace sprintf and snprintf with sysfs_emitBean Huo
sprintf and snprintf may cause output defect in sysfs content, it is better to use new added sysfs_emit function which knows the size of the temporary buffer. Link: https://lore.kernel.org/r/20210106211541.23039-1-huobean@gmail.com Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ufs: Fix all Kconfig help text indentationRandy Dunlap
Use consistent and expected indentation for all Kconfig text. Link: https://lore.kernel.org/r/20210106205554.18082-1-rdunlap@infradead.org Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Avri Altman <avri.altman@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ibmvfc: Relax locking around ibmvfc_queuecommand()Tyrel Datwyler
The driver's queuecommand routine is still wrapped to hold the host lock for the duration of the call. This will become problematic when moving to multiple queues due to the lock contention preventing asynchronous submissions to mulitple queues. There is no real legitimate reason to hold the host lock, and previous patches have insured proper protection of moving ibmvfc_event objects between free and sent lists. Link: https://lore.kernel.org/r/20210106201835.1053593-6-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ibmvfc: Complete commands outside the host/queue lockTyrel Datwyler
Drain the command queue and place all commands on a completion list. Perform command completion on that list outside the host/queue locks. Further, move purged command compeletions outside the host_lock as well. Link: https://lore.kernel.org/r/20210106201835.1053593-5-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07scsi: ibmvfc: Define per-queue state/list locksTyrel Datwyler
Define per-queue locks for protecting queue state and event pool sent/free lists. The evt list lock is initially redundant but it allows the driver to be modified in the follow-up patches to relax the queue locking around submissions and completions. Link: https://lore.kernel.org/r/20210106201835.1053593-4-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>