summaryrefslogtreecommitdiff
path: root/drivers/scsi/qla2xxx/qla_nvme.c
AgeCommit message (Collapse)Author
2021-03-29scsi: qla2xxx: Fix crash in PCIe error handlingQuinn Tran
BUG: unable to handle kernel NULL pointer dereference at (null) IP: qla2x00_abort_isp+0x21/0x6b0 [qla2xxx] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 1715 Comm: kworker/0:2 Tainted: GOE 4.12.14-122.37-default #1 SLE12-SP5 Hardware name: HPE Superdome Flex/Superdome Flex, BIOS Bundle:3.30.100 SFW:IP147.007.004.017.000.2009211957 09/21/2020 Workqueue: events aer_recover_work_func task: ffff9e399c14ca80 task.stack: ffffc1c58e4ac000 RIP: 0010:qla2x00_abort_isp+0x21/0x6b0 [qla2xxx] RSP: 0018:ffffc1c58e4afd50 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff9e419cdef480 RCX: 0000000000000000 RDX: ffff9e399c14ca80 RSI: 0000000000000246 RDI: ffff9e419bbc27b8 RBP: ffff9e419bbc27b8 R08: 0000000000000004 R09: 00000000a0440000 R10: 0000000000000000 R11: ffff9e399416d1a0 R12: ffff9e419cdef000 R13: ffff9e3a7cfae800 R14: ffff9e3a7cfae800 R15: 00000000000000c0 FS: 0000000000000000(0000) GS:ffff9e39a0000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000006cd00a005 CR4: 00000000007606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: qla2xxx_pci_slot_reset+0x141/0x160 [qla2xxx] report_slot_reset+0x41/0x80 ? merge_result.part.4+0x30/0x30 pci_walk_bus+0x70/0x90 pcie_do_recovery+0x1db/0x2e0 aer_recover_work_func+0xc2/0xf0 process_one_work+0x14c/0x390 Disable board_disable logic where driver resources are freed while OS is in the process of recovering the adapter. Link: https://lore.kernel.org/r/20210329085229.4367-9-njavali@marvell.com Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-29scsi: qla2xxx: Simplify the calculation of variablesJiapeng Zhong
Fix the following coccicheck warnings: ./drivers/scsi/qla2xxx/qla_nvme.c:288:24-26: WARNING !A || A && B is equivalent to !A || B. Link: https://lore.kernel.org/r/1611650554-33019-1-git-send-email-abaci-bugfix@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-13scsi: qla2xxx: Wait for ABTS response on I/O timeouts for NVMeBikash Hazarika
FW needs to wait for an ABTS response before completing the I/O. Link: https://lore.kernel.org/r/20210111093134.1206-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-09scsi: qla2xxx: Handle aborts correctly for port undergoing deletionSaurav Kashyap
Call trace observed while shutting down the adapter ports (LINK DOWN). Handle aborts correctly. localhost kernel: INFO: task nvme:44209 blocked for more than 120 seconds. localhost kernel: "echo 0 >/proc/sys/kernel/hung_task_timeout_secs" disables this message. localhost kernel: nvme D ffff88b45fb5acc0 0 44209 1 0x00000080 localhost kernel: Call Trace: localhost kernel: [<ffffffffbd187169>] schedule+0x29/0x70 localhost kernel: [<ffffffffbd184c51>] schedule_timeout+0x221/0x2d0 localhost kernel: [<ffffffffbcad7229>] ? ttwu_do_wakeup+0x19/0xe0 localhost kernel: [<ffffffffbcad735f>] ? ttwu_do_activate+0x6f/0x80 localhost kernel: [<ffffffffbcada830>] ? try_to_wake_up+0x190/0x390 localhost kernel: [<ffffffffbd18751d>] wait_for_completion+0xfd/0x140 localhost kernel: [<ffffffffbcadaaf0>] ? wake_up_state+0x20/0x20 localhost kernel: [<ffffffffbcabe3da>] flush_work+0x10a/0x1b0 localhost kernel: [<ffffffffbcabb0f0>] ? move_linked_works+0x90/0x90 localhost kernel: [<ffffffffbcabe6cf>] flush_delayed_work+0x3f/0x50 localhost kernel: [<ffffffffc0452767>] nvme_fc_init_ctrl+0x657/0x6a0 [nvme_fc] localhost kernel: [<ffffffffc045293a>] nvme_fc_create_ctrl+0x18a/0x210 [nvme_fc] localhost kernel: [<ffffffffc028962f>] nvmf_dev_write+0x98f/0xb35 [nvme_fabrics] localhost kernel: [<ffffffffbcd08927>] ? security_file_permission+0x27/0xa0 localhost kernel: [<ffffffffbcc4db50>] vfs_write+0xc0/0x1f0 localhost kernel: [<ffffffffbcc4e92f>] SyS_write+0x7f/0xf0 localhost kernel: [<ffffffffbd193f92>] system_call_fastpath+0x25/0x2a Link: https://lore.kernel.org/r/20201202132312.19966-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-09scsi: qla2xxx: Don't check for fw_started while posting NVMe commandSaurav Kashyap
NVMe commands can come only after successful addition of rport and NVMe connect, and rport is only registered after FW started bit is set. Remove the redundant check. Link: https://lore.kernel.org/r/20201202132312.19966-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-09scsi: qla2xxx: Return EBUSY on fcport deletionDaniel Wagner
When the fcport is about to be deleted we should return EBUSY instead of ENODEV. Only for EBUSY will the request be requeued in a multipath setup. Also return EBUSY when the firmware has not yet started to avoid dropping the request. Link: https://lore.kernel.org/r/20201014073048.36219-1-dwagner@suse.de Link: https://lore.kernel.org/r/20201202132312.19966-2-njavali@marvell.com Reviewed-by: Arun Easi <aeasi@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-23Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull more SCSI updates from James Bottomley: "The set of core changes here is Christoph's submission path cleanups. These introduced a couple of regressions when first proposed so they got held over from the initial merge window pull request to give more testing time, which they've now had and Syzbot has confirmed the regression it detected is fixed. The other main changes are two driver updates (arcmsr, pm80xx) and assorted minor clean ups" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (38 commits) scsi: qla2xxx: Fix return of uninitialized value in rval scsi: core: Set sc_data_direction to DMA_NONE for no-transfer commands scsi: sr: Initialize ->cmd_len scsi: arcmsr: Update driver version to v1.50.00.02-20200819 scsi: arcmsr: Add support for ARC-1886 series RAID controllers scsi: arcmsr: Fix device hot-plug monitoring timer stop scsi: arcmsr: Remove unnecessary syntax scsi: pm80xx: Driver version update scsi: pm80xx: Increase the number of outstanding I/O supported to 1024 scsi: pm80xx: Remove DMA memory allocation for ccb and device structures scsi: pm80xx: Increase number of supported queues scsi: sym53c8xx_2: Fix sizeof() mismatch scsi: isci: Fix a typo in a comment scsi: qla4xxx: Fix inconsistent format argument type scsi: myrb: Fix inconsistent format argument types scsi: myrb: Remove redundant assignment to variable timeout scsi: bfa: Fix error return in bfad_pci_init() scsi: fcoe: Simplify the return expression of fcoe_sysfs_setup() scsi: snic: Simplify the return expression of svnic_cq_alloc() scsi: fnic: Simplify the return expression of vnic_wq_copy_alloc() ...
2020-10-14Merge tag 'spdx-5.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX updates from Greg KH: "Here are some SPDX-specific changes for 5.10-rc1. They include: - driver fixes to make spdxcheck.pl work properly - add GFDL licenses as "deprecated" but required due to some of our documentation using them - add Zlib license as "deprecated" but required because we have code with this license in the tree. - convert some drivers to have SPDX identifiers that previously didn't have them. All have been in linux-next for a very long time with no reported issues" * tag 'spdx-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: scripts/spdxcheck.py: handle license identifiers in XML comments net/mlx5: IPsec: make spdxcheck.py happy LICENSES/deprecated: add Zlib license text LICENSE: add GFDL deprecated licenses net/qla3xxx: Convert to SPDX license identifiers net/qlge: Convert to SPDX license identifiers net/qlcnic: Convert to SPDX license identifiers scsi/qla2xxx: Convert to SPDX license identifiers scsi/qla4xxx: Convert to SPDX license identifiers
2020-10-08scsi: qla2xxx: Fix return of uninitialized value in rvalColin Ian King
A previous change removed the initialization of rval and there is now an error where an uninitialized rval is being returned on an error return path. Fix this by returning -ENODEV. Link: https://lore.kernel.org/r/20201008183239.200358-1-colin.king@canonical.com Fixes: b994718760fa ("scsi: qla2xxx: Use constant when it is known") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Acked-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Addresses-Coverity: ("Uninitialized scalar variable")
2020-10-07scsi: qla2xxx: Use constant when it is knownPavel Machek (CIP)
Directly return constant when it is known to make code easier to understand. Link: https://lore.kernel.org/r/20200921112340.GA19336@duo.ucw.cz Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Pavel Machek (CIP) <pavel@denx.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-22scsi: qla2xxx: Add SLER and PI control supportSaurav Kashyap
BIT_13 of extended FW attribute informs about NVMe-2 support. Set BIT_15 of special feature control block for enabling SLER in FW. Set bit 8 (SLER supported) to 1 for the service parameter information when sending NVMe PRLI request. Set BIT_14 of special feature control block for enabling PI Control in FW. Driver should set bit 9 (PI Control supported) to 1 for the service parameter information when sending NVMe PRLI request. Set BIT_13 for NVMe Async events. Link: https://lore.kernel.org/r/20200904045128.23631-13-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-22scsi: qla2xxx: Fix I/O errors during LIP reset testsArun Easi
In .fcp_io(), returning ENODEV as soon as remote port delete has started can cause I/O errors. Fix this by returning EBUSY until the remote port delete finishes. Link: https://lore.kernel.org/r/20200904045128.23631-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-22scsi: qla2xxx: Performance tweakQuinn Tran
Move statistics fields from vha struct to qpair to reduce memory thrashing. Link: https://lore.kernel.org/r/20200904045128.23631-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-22scsi: qla2xxx: Fix I/O failures during remote port toggle testingArun Easi
Driver was using a lower value for dev_loss_tmo making it more prone to I/O failures during remote port toggle testing. Set dev_loss_tmo to zero during remote port registration to allow nvme-fc default dev_loss_tmo to be used, which is higher than what driver was using. Link: https://lore.kernel.org/r/20200904045128.23631-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-16scsi/qla2xxx: Convert to SPDX license identifiersThomas Gleixner
All files in this driver directory contain the following notice: See LICENSE.qla2xxx for copyright and licensing details. LICENSE.qla2xxx can be found in Documentation/scsi/. The file contains: - A copyright notice This copyright notice is redundant as all files contain the same copyright notice already - A license notice You may modify and redistribute the device driver code under the GNU General Public License (a copy of which is attached hereto as Exhibit A) published by the Free Software Foundation (version 2). This can be replaced with the corresponding SPDX license identifier (GPL-2.0-only) in the source files which reference this license file. - The full GPLv2 license text A redundant copy of LICENSES/preferred/GPL-2.0 Remove the notices and add the SPDX license identifier GPL-2.0-only to the source files. Finally remove the now redundant LICENSE.qla2xxx file. Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Acked-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-15Merge branch '5.9/scsi-fixes' into 5.10/scsi-ufsMartin K. Petersen
Resolve UFS discrepancies between fixes and queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-24scsi: qla2xxx: Fix wrong return value in qla_nvme_register_hba()Tianjia Zhang
On an error exit path, a negative error code should be returned instead of a positive return value. Link: https://lore.kernel.org/r/20200802111530.5020-1-tianjia.zhang@linux.alibaba.com Fixes: 8777e4314d39 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine") Cc: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Fix null pointer access during disconnect from subsystemQuinn Tran
NVMEAsync command is being submitted to QLA while the same NVMe controller is in the middle of reset. The reset path has deleted the association and freed aen_op->fcp_req.private. Add a check for this private pointer before issuing the command. ... 6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e [exception RIP: qla_nvme_post_cmd+394] RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206 RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161 RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220 RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002 R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0 R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc] 8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc] 9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core] 10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437 11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef 12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402 13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255 -- PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451" 0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3 1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8 2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed 3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf 4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5 5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core] 6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc] 7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437 8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50 9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402 10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255 Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hbaArun Easi
qla_nvme_register_hba() puts out a warning when there are not enough queue pairs available for FC-NVME. Just fail the NVME registration rather than a WARNING + call Trace. Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-23scsi: qla2xxx: Set NVMe status code for failed NVMe FCP requestDaniel Wagner
The qla2xxx driver knows when request was processed successfully or not. But it always sets the NVMe status code to 0/NVME_SC_SUCCESS. The upper layer needs to figure out from the rcv_rsplen and transferred_length variables if the request was transferred successfully. This is not always possible, e.g. when the request data length is 0, the transferred_length is also set 0 which is interpreted as success in nvme_fc_fcpio_done(). Let's inform the upper layer (nvme_fc_fcpio_done()) when something went wrong. nvme_fc_fcpio_done() maps all non-NVME_SC_SUCCESS status codes to NVME_SC_HOST_PATH_ERROR. There isn't any benefit to map the QLA status code to the NVMe status code. Therefore, use NVME_SC_INTERNAL to indicate an error which aligns it with the lpfc driver. Link: https://lore.kernel.org/r/20200604100745.89250-1-dwagner@suse.de Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19scsi: qla2xxx: Fix endianness annotations in source filesBart Van Assche
Fix all endianness complaints reported by sparse (C=2) without affecting the behavior of the code on little endian CPUs. Link: https://lore.kernel.org/r/20200518211712.11395-16-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19scsi: qla2xxx: Cast explicitly to uint16_t / uint32_tBart Van Assche
Casting a pointer to void * and relying on an implicit cast from void * to uint16_t or uint32_t suppresses sparse warnings about endianness. Hence cast explicitly to uint16_t and uint32_t. Additionally, remove superfluous void * casts. Link: https://lore.kernel.org/r/20200518211712.11395-13-bvanassche@acm.org Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19scsi: qla2xxx: Change {RD,WRT}_REG_*() function names from upper case into ↵Bart Van Assche
lower case This was suggested by Daniel Wagner. Link: https://lore.kernel.org/r/20200518211712.11395-12-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-10Merge tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Here's a set of fixes that should go into this merge window. This contains: - NVMe pull request from Christoph with various fixes - Better discard support for loop (Evan) - Only call ->commit_rqs() if we have queued IO (Keith) - blkcg offlining fixes (Tejun) - fix (and fix the fix) for busy partitions" * tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-block: block: fix busy device checking in blk_drop_partitions again block: fix busy device checking in blk_drop_partitions nvmet-rdma: fix double free of rdma queue blk-mq: don't commit_rqs() if none were queued nvme-fc: Revert "add module to ops template to allow module references" nvme: fix deadlock caused by ANA update wrong locking nvmet-rdma: fix bonding failover possible NULL deref loop: Better discard support for block devices loop: Report EOPNOTSUPP properly nvmet: fix NULL dereference when removing a referral nvme: inherit stable pages constraint in the mpath stack device blkcg: don't offline parent blkcg first blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it nvme-tcp: fix possible crash in recv error flow nvme-tcp: don't poll a non-live queue nvme-tcp: fix possible crash in write_zeroes processing nvmet-fc: fix typo in comment nvme-rdma: Replace comma with a semicolon nvme-fcloop: fix deallocation of working context nvme: fix compat address handling in several ioctls
2020-04-04nvme-fc: Revert "add module to ops template to allow module references"James Smart
The original patch was to resolve the lldd being able to be unloaded while being used to talk to the boot device of the system. However, the end result of the original patch is that any driver unload while a nvme controller is live via the lldd is now being prohibited. Given the module reference, the module teardown routine can't be called, thus there's no way, other than manual actions to terminate the controllers. Fixes: 863fbae929c7 ("nvme_fc: add module to ops template to allow module references") Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-02-28scsi: qla2xxx: Convert MAKE_HANDLE() from a define into an inline functionBart Van Assche
This patch allows sparse to verify the endianness of the arguments passed to make_handle(). Link: https://lore.kernel.org/r/20200220043441.20504-5-bvanassche@acm.org Cc: Roman Bolshakov <r.bolshakov@yadro.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Martin Wilck <mwilck@suse.com> Cc: Quinn Tran <qutran@marvell.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-13Merge tag 'for-linus-20191212' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - stable fix for the bi_size overflow. Not a corruption issue, but a case wher we could merge but disallowed (Andreas) - NVMe pull request via Keith, with various fixes. - MD pull request from Song. - Merge window regression fix for the rq passthrough stats (Logan) - Remove unused blkcg_drain_queue() function (Guoqing) * tag 'for-linus-20191212' of git://git.kernel.dk/linux-block: blk-cgroup: remove blkcg_drain_queue block: fix NULL pointer dereference in account statistics with IDE md: make sure desc_nr less than MD_SB_DISKS md: raid1: check rdev before reference in raid1_sync_request func raid5: need to set STRIPE_HANDLE for batch head block: fix "check bi_size overflow before merge" nvme/pci: Fix read queue count nvme/pci Limit write queue sizes to possible cpus nvme/pci: Fix write and poll queue types nvme/pci: Remove last_cq_head nvme: Namepace identification descriptor list is optional nvme-fc: fix double-free scenarios on hw queues nvme: else following return is not needed nvme: add error message on mismatching controller ids nvme_fc: add module to ops template to allow module references nvmet-loop: Avoid preallocating big SGL for data nvme-fc: Avoid preallocating big SGL for data nvme-rdma: Avoid preallocating big SGL for data
2019-11-27nvme_fc: add module to ops template to allow module referencesJames Smart
In nvme-fc: it's possible to have connected active controllers and as no references are taken on the LLDD, the LLDD can be unloaded. The controller would enter a reconnect state and as long as the LLDD resumed within the reconnect timeout, the controller would resume. But if a namespace on the controller is the root device, allowing the driver to unload can be problematic. To reload the driver, it may require new io to the boot device, and as it's no longer connected we get into a catch-22 that eventually fails, and the system locks up. Fix this issue by taking a module reference for every connected controller (which is what the core layer did to the transport module). Reference is cleared when the controller is removed. Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-08scsi: qla2xxx: Fix double scsi_done for abort pathQuinn Tran
Current code assumes abort will remove the original command from the active list where scsi_done will not be called. Instead, the eh_abort thread will do the scsi_done. That is not the case. Instead, we have a double scsi_done calls triggering use after free. Abort will tell FW to release the command from FW possesion. The original command will return to ULP with error in its normal fashion via scsi_done. eh_abort path would wait for the original command completion before returning. eh_abort path will not perform the scsi_done call. Fixes: 219d27d7147e0 ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands") Cc: stable@vger.kernel.org # 5.2 Link: https://lore.kernel.org/r/20191105150657.8092-6-hmadhani@marvell.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-12scsi: qla2xxx: Introduce qla2xxx_get_next_handle()Bart Van Assche
This patch reduces code duplication. Cc: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-12scsi: qla2xxx: Enable type checking for the SRB free and done callback functionsBart Van Assche
Since all pointers passed to the srb_t.done() and srb_t.free() functions have type srb_t, change the type of the first argument of these functions from void * into struct srb *. This allows the compiler to verify the argument types for these functions. This patch does not change any functionality. Cc: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-12scsi: qla2xxx: Remove a superfluous pointer checkBart Van Assche
Checking a pointer after it has been dereferenced is not useful. This was detected by Coverity. Cc: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-12scsi: qla2xxx: Improve Linux kernel coding style conformanceBart Van Assche
Insert a space where required, surround complex expressions in macros with parentheses, use the UL suffix instead of the (unsigned long) cast, do not use line continuations when not necessary and do not explicitly initialize static variables to zero. Cc: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07scsi: qla2xxx: Allow NVMe IO to resume with short cable pullQuinn Tran
Current driver report dev_loss_tmo to 0 for NVMe devices with short cable pull. This causes NVMe controller to be freed along with NVMe namespace. The side affect is IO would stop. By not setting dev_loss_tmo to 0, NVMe namespace would stay until cable is plugged back in. This allows IO to resume afterward. [mkp: commit desc] Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: qla2xxx: move IO flush to the front of NVME rport unregistrationQuinn Tran
On session deletion, current qla code would unregister an NVMe session before flushing IOs. This patch would move the unregistration of NVMe session after IO flush. This way FC-NVMe layer would not have to wait for stuck IOs. In addition, qla2xxx would stop accepting new IOs during session deletion. Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race conditionQuinn Tran
This patch uses kref to protect access between fcp_abort path and nvme command and LS command completion path. Stack trace below shows the abort path is accessing stale memory (nvme_private->sp). When command kref reaches 0, nvme_private & srb resource will be disconnected from each other. Any subsequence nvme abort request will not be able to reference the original srb. [ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8 [ 5631.004016] IP: [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx] [ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx] [ 5631.004097] RIP: 0010:[<ffffffffc087df92>] [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx] [ 5631.004109] Call Trace: [ 5631.004115] [<ffffffffaa4b8174>] ? pwq_dec_nr_in_flight+0x64/0xb0 [ 5631.004117] [<ffffffffaa4b9d4f>] process_one_work+0x17f/0x440 [ 5631.004120] [<ffffffffaa4bade6>] worker_thread+0x126/0x3c0 Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: qla2xxx: on session delete, return nvme cmdQuinn Tran
- on session delete or chip reset, reject all NVME commands. - on NVME command submission error, free srb resource. Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devicesArun Easi
BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffc050d10c>] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx] PGD 800000084cf41067 PUD 84d288067 PMD 0 Oops: 0000 [#1] SMP Call Trace: [<ffffffff98abcfdf>] process_one_work+0x17f/0x440 [<ffffffff98abdca6>] worker_thread+0x126/0x3c0 [<ffffffff98abdb80>] ? manage_workers.isra.26+0x2a0/0x2a0 [<ffffffff98ac4f81>] kthread+0xd1/0xe0 [<ffffffff98ac4eb0>] ? insert_kthread_work+0x40/0x40 [<ffffffff9918ad37>] ret_from_fork_nospec_begin+0x21/0x21 [<ffffffff98ac4eb0>] ? insert_kthread_work+0x40/0x40 RIP [<ffffffffc050d10c>] qla_nvme_unregister_remote_port+0x6c/0xf0 [qla2xxx] The crash is due to a bad entry in the nvme_rport_list. This list is not protected, and when a remoteport_delete callback is called, driver traverses the list and crashes. Actually, the list could be removed and driver could traverse the main fcport list instead. Fix does exactly that. Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-29scsi: qla2xxx: Complain loudly about reference count underflowBart Van Assche
A reference count underflow is a severe bug. Hence complain loudly if a reference count underflow happens. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-29scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses ↵Bart Van Assche
to firmware This patch makes the code easier to read and more compact. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-29scsi: qla2xxx: Introduce the dsd32 and dsd64 data structuresBart Van Assche
Introduce two structures for the (DMA address, length) combination instead of using separate structure members for the DMA address and length. This patch fixes several Coverity complaints about 'cur_dsd' being used to write outside the bounds of structure members. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-29scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commandsBart Van Assche
In the *_done() functions, instead of returning early if sp->ref_count >= 2, only decrement sp->ref_count. In qla2xxx_eh_abort(), instead of deciding what to do based on the value of sp->ref_count, decide which action to take depending on the completion status of the firmware abort. Remove srb.cwaitq and use srb.comp instead. In qla2x00_abort_srb(), call isp_ops->abort_command() directly instead of calling qla2xxx_eh_abort(). Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-29scsi: qla2xxx: Remove the fcport test from qla_nvme_abort_work()Bart Van Assche
Testing whether a pointer is not NULL after it has been dereferenced is not useful. Hence remove the if (fcport) test. This was detected by Coverity. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-15scsi: qla2xxx: Leave a blank line after declarationsBart Van Assche
This patch improves readability of the qla2xxx source code. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-08scsi: qla2xxx: Use get/put_unaligned where appropriateBart Van Assche
This patch makes the code easier to read but does not change any functionality. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-08scsi: qla2xxx: fix spelling mistake "alredy" -> "already"Colin Ian King
There is a spelling mistake in a ql_log message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03scsi: qla2xxx: Fix driver unload when FC-NVMe LUNs are connectedGiridhar Malavali
This patch allows driver to unload using "modprobe -r" when FC-NVMe LUNs are connected. Signed-off-by: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03scsi: qla2xxx: Set remote port devloss timeout to 0Giridhar Malavali
This patch sets remote_port_devloss value to 0. This indicates to FC-NVMe transport that driver is unloading and transport should not retry. Fixes: e476fe8af5ff ("scsi: qla2xxx: Fix unload when NVMe devices are configured") Cc: stable@vger.kernel.org Signed-off-by: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03scsi: qla2xxx: Increase the max_sgl_segments to 1024Giridhar Malavali
This patch increases max_sgl_segments value from 128 to the maximum supported which is 1024. Increasing max_sgl_segments will allow the driver to support larger I/O sizes [mkp: commit desc tweak] Signed-off-by: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19scsi: qla2xxx: Check for FW started flag before abortingHimanshu Madhani
For FC-NVMe, if the fw_started flag is not set or fcport is deleted, then do not send Abort command Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>