summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2022-03-15scsi: qla2xxx: Fix stuck session of PRLI rejectQuinn Tran
Remove stale recovery code that prevents normal path recovery. Link: https://lore.kernel.org/r/20220310092604.22950-11-njavali@marvell.com Fixes: 1cbc0efcd9be ("scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Reduce false trigger to loginQuinn Tran
While a session is in the middle of a relogin, a late RSCN can be delivered from switch. RSCN trigger fabric scan where the scan logic can trigger another session login while a login is in progress. Reduce the extra trigger to prevent multiple logins to the same session. Link: https://lore.kernel.org/r/20220310092604.22950-10-njavali@marvell.com Fixes: bee8b84686c4 ("scsi: qla2xxx: Reduce redundant ADISC command for RSCNs") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix laggy FC remote port session recoveryQuinn Tran
For session recovery, driver relies on the dpc thread to initiate certain operations. The dpc thread runs exclusively without the Mailbox interface being occupied. A recent code change for heartbeat check via mailbox cmd 0 is preventing the dpc thread from carrying out its operation. This patch allows the higher priority error recovery to run first before running the lower priority heartbeat check. Link: https://lore.kernel.org/r/20220310092604.22950-9-njavali@marvell.com Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix hang due to session stuckQuinn Tran
User experienced device lost. The log shows Get port data base command was queued up, failed, and requeued again. Every time it is requeued, it set the FCF_ASYNC_ACTIVE. This prevents any recovery code from occurring because driver thinks a recovery is in progress for this session. In essence, this session is hung. The reason it gets into this place is the session deletion got in front of this call due to link perturbation. Break the requeue cycle and exit. The session deletion code will trigger a session relogin. Link: https://lore.kernel.org/r/20220310092604.22950-8-njavali@marvell.com Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix N2N inconsistent PLOGIQuinn Tran
For N2N topology, ELS Passthrough is used to send PLOGI. On failure of ELS pass through PLOGI, driver flipped over to using LLIOCB PLOGI for N2N. This is not consistent. Delete the session to restart the connection where ELS pass through PLOGI would be used consistently. Link: https://lore.kernel.org/r/20220310092604.22950-7-njavali@marvell.com Fixes: c76ae845ea83 ("scsi: qla2xxx: Add error handling for PLOGI ELS passthrough") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix crash during module load unload testArun Easi
During purex packet handling the driver was incorrectly freeing a pre-allocated structure. Fix this by skipping that entry. System crashed with the following stack during a module unload test. Call Trace: sbitmap_init_node+0x7f/0x1e0 sbitmap_queue_init_node+0x24/0x150 blk_mq_init_bitmaps+0x3d/0xa0 blk_mq_init_tags+0x68/0x90 blk_mq_alloc_map_and_rqs+0x44/0x120 blk_mq_alloc_set_map_and_rqs+0x63/0x150 blk_mq_alloc_tag_set+0x11b/0x230 scsi_add_host_with_dma.cold+0x3f/0x245 qla2x00_probe_one+0xd5a/0x1b80 [qla2xxx] Call Trace with slub_debug and debug kernel: kasan_report_invalid_free+0x50/0x80 __kasan_slab_free+0x137/0x150 slab_free_freelist_hook+0xc6/0x190 kfree+0xe8/0x2e0 qla2x00_free_device+0x3bb/0x5d0 [qla2xxx] qla2x00_remove_one+0x668/0xcf0 [qla2xxx] Link: https://lore.kernel.org/r/20220310092604.22950-6-njavali@marvell.com Fixes: 62e9dd177732 ("scsi: qla2xxx: Change in PUREX to handle FPIN ELS requests") Cc: stable@vger.kernel.org Reported-by: Marco Patalano <mpatalan@redhat.com> Tested-by: Marco Patalano <mpatalan@redhat.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix missed DMA unmap for NVMe ls requestsArun Easi
At NVMe ELS request time, request structure is DMA mapped and never unmapped. Fix this by calling the unmap on ELS completion. Link: https://lore.kernel.org/r/20220310092604.22950-5-njavali@marvell.com Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix loss of NVMe namespaces after driver reload testArun Easi
Driver registration of localport can race when it happens at the remote port discovery time. Fix this by calling the registration under a mutex. Link: https://lore.kernel.org/r/20220310092604.22950-4-njavali@marvell.com Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration") Cc: stable@vger.kernel.org Reported-by: Marco Patalano <mpatalan@redhat.com> Tested-by: Marco Patalano <mpatalan@redhat.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix disk failure to rediscoverQuinn Tran
User experienced some of the LUN failed to get rediscovered after long cable pull test. The issue is triggered by a race condition between driver setting session online state vs starting the LUN scan process at the same time. Current code set the online state after notifying the session is available. In this case, trigger to start the LUN scan process happened before driver could set the session in online state. LUN scan ends up with failure due to the session online check was failing. Set the online state before reporting of the availability of the session. Link: https://lore.kernel.org/r/20220310092604.22950-3-njavali@marvell.com Fixes: aecf043443d3 ("scsi: qla2xxx: Fix Remote port registration") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: qla2xxx: Fix incorrect reporting of task management failureQuinn Tran
User experienced no task management error while target device is responding with error. The RSP_CODE field in the status IOCB is in little endian. Driver assumes it's big endian and it picked up erroneous data. Convert the data back to big endian as is on the wire. Link: https://lore.kernel.org/r/20220310092604.22950-2-njavali@marvell.com Fixes: faef62d13463 ("[SCSI] qla2xxx: Fix Task Management command asynchronous handling") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: libiscsi: Teardown iscsi_cls_conn gracefullyWenchao Hao
Commit 1b8d0300a3e9 ("scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown()") fixed an UAF in iscsi_conn_get_param() and introduced 2 tmp_xxx varibles. We can gracefully fix this UAF with the help of device_del(). Calling iscsi_remove_conn() at the beginning of iscsi_conn_teardown would make userspace unable to see iscsi_cls_conn. This way we we can free memory safely. Remove iscsi_destroy_conn() since it is no longer used. Link: https://lore.kernel.org/r/20220310015759.3296841-4-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: libiscsi: Add iscsi_cls_conn to sysfs after initializationWenchao Hao
iscsi_create_conn() exposed iscsi_cls_conn to sysfs prior to initialization of iscsi_conn's dd_data. When userspace tried to access an attribute such as the connect address, a NULL pointer dereference was observed. Do not add iscsi_cls_conn to sysfs until it has been initialized. Remove iscsi_create_conn() since it is no longer used. Link: https://lore.kernel.org/r/20220310015759.3296841-3-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: iscsi: Add helper functions to manage iscsi_cls_connWenchao Hao
- iscsi_alloc_conn(): Allocate and initialize iscsi_cls_conn - iscsi_add_conn(): Expose iscsi_cls_conn to userspace via sysfs - iscsi_remove_conn(): Remove iscsi_cls_conn from sysfs Link: https://lore.kernel.org/r/20220310015759.3296841-2-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: core: Remove unreachable code warningYang Li
The smatch tool reported the following warning: drivers/scsi/scsi_error.c:1988 scsi_decide_disposition() warn: ignoring unreachable code. Remove the "default:return FAILED;" instead of "return FAILED;" reported by smatch, because compilers can provide more useful diagnostics about switch/case statements that do not have a default statement, especially if the "switch" applies to a value with enumeration type. Link: https://lore.kernel.org/r/20220301080448.112813-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-15scsi: megasas: Clean up some inconsistent indentingYang Li
Eliminate the following smatch warning: drivers/scsi/megaraid/megaraid_sas_fusion.c:5104 megasas_reset_fusion() warn: inconsistent indenting Link: https://lore.kernel.org/r/20220225011605.130927-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: aacraid: Clean up some inconsistent indentingJiapeng Chong
Eliminate the following smatch warning: drivers/scsi/aacraid/linit.c:867 aac_eh_tmf_hard_reset_fib() warn: inconsistent indenting. Link: https://lore.kernel.org/r/20220309005031.126504-1-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: mpt3sas: Page fault in reply q processingMatt Lupfer
A page fault was encountered in mpt3sas on a LUN reset error path: [ 145.763216] mpt3sas_cm1: Task abort tm failed: handle(0x0002),timeout(30) tr_method(0x0) smid(3) msix_index(0) [ 145.778932] scsi 1:0:0:0: task abort: FAILED scmd(0x0000000024ba29a2) [ 145.817307] scsi 1:0:0:0: attempting device reset! scmd(0x0000000024ba29a2) [ 145.827253] scsi 1:0:0:0: [sg1] tag#2 CDB: Receive Diagnostic 1c 01 01 ff fc 00 [ 145.837617] scsi target1:0:0: handle(0x0002), sas_address(0x500605b0000272b9), phy(0) [ 145.848598] scsi target1:0:0: enclosure logical id(0x500605b0000272b8), slot(0) [ 149.858378] mpt3sas_cm1: Poll ReplyDescriptor queues for completion of smid(0), task_type(0x05), handle(0x0002) [ 149.875202] BUG: unable to handle page fault for address: 00000007fffc445d [ 149.885617] #PF: supervisor read access in kernel mode [ 149.894346] #PF: error_code(0x0000) - not-present page [ 149.903123] PGD 0 P4D 0 [ 149.909387] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 149.917417] CPU: 24 PID: 3512 Comm: scsi_eh_1 Kdump: loaded Tainted: G S O 5.10.89-altav-1 #1 [ 149.934327] Hardware name: DDN 200NVX2 /200NVX2-MB , BIOS ATHG2.2.02.01 09/10/2021 [ 149.951871] RIP: 0010:_base_process_reply_queue+0x4b/0x900 [mpt3sas] [ 149.961889] Code: 0f 84 22 02 00 00 8d 48 01 49 89 fd 48 8d 57 38 f0 0f b1 4f 38 0f 85 d8 01 00 00 49 8b 45 10 45 31 e4 41 8b 55 0c 48 8d 1c d0 <0f> b6 03 83 e0 0f 3c 0f 0f 85 a2 00 00 00 e9 e6 01 00 00 0f b7 ee [ 149.991952] RSP: 0018:ffffc9000f1ebcb8 EFLAGS: 00010246 [ 150.000937] RAX: 0000000000000055 RBX: 00000007fffc445d RCX: 000000002548f071 [ 150.011841] RDX: 00000000ffff8881 RSI: 0000000000000001 RDI: ffff888125ed50d8 [ 150.022670] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000ffff7fff [ 150.033445] R10: ffffc9000f1ebb68 R11: ffffc9000f1ebb60 R12: 0000000000000000 [ 150.044204] R13: ffff888125ed50d8 R14: 0000000000000080 R15: 34cdc00034cdea80 [ 150.054963] FS: 0000000000000000(0000) GS:ffff88dfaf200000(0000) knlGS:0000000000000000 [ 150.066715] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 150.076078] CR2: 00000007fffc445d CR3: 000000012448a006 CR4: 0000000000770ee0 [ 150.086887] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 150.097670] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 150.108323] PKRU: 55555554 [ 150.114690] Call Trace: [ 150.120497] ? printk+0x48/0x4a [ 150.127049] mpt3sas_scsih_issue_tm.cold.114+0x2e/0x2b3 [mpt3sas] [ 150.136453] mpt3sas_scsih_issue_locked_tm+0x86/0xb0 [mpt3sas] [ 150.145759] scsih_dev_reset+0xea/0x300 [mpt3sas] [ 150.153891] scsi_eh_ready_devs+0x541/0x9e0 [scsi_mod] [ 150.162206] ? __scsi_host_match+0x20/0x20 [scsi_mod] [ 150.170406] ? scsi_try_target_reset+0x90/0x90 [scsi_mod] [ 150.178925] ? blk_mq_tagset_busy_iter+0x45/0x60 [ 150.186638] ? scsi_try_target_reset+0x90/0x90 [scsi_mod] [ 150.195087] scsi_error_handler+0x3a5/0x4a0 [scsi_mod] [ 150.203206] ? __schedule+0x1e9/0x610 [ 150.209783] ? scsi_eh_get_sense+0x210/0x210 [scsi_mod] [ 150.217924] kthread+0x12e/0x150 [ 150.224041] ? kthread_worker_fn+0x130/0x130 [ 150.231206] ret_from_fork+0x1f/0x30 This is caused by mpt3sas_base_sync_reply_irqs() using an invalid reply_q pointer outside of the list_for_each_entry() loop. At the end of the full list traversal the pointer is invalid. Move the _base_process_reply_queue() call inside of the loop. Link: https://lore.kernel.org/r/d625deae-a958-0ace-2ba3-0888dd0a415b@ddn.com Fixes: 711a923c14d9 ("scsi: mpt3sas: Postprocessing of target and LUN reset") Cc: stable@vger.kernel.org Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Matt Lupfer <mlupfer@ddn.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: hisi_sas: Use libsas internal abort supportJohn Garry
Use the common libsas internal abort functionality. In addition, this driver has special handling for internal abort timeouts - specifically whether to reset the controller in that instance, so extend the API for that. Timeout is now increased to 20 * Hz from 6 * Hz. We also retry for failure now, but this should not make a difference. Link: https://lore.kernel.org/r/1647001432-239276-5-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: pm8001: Use libsas internal abort supportJohn Garry
New special handling is added for SAS_PROTOCOL_INTERNAL_ABORT proto so that we may use the common queue command API. Link: https://lore.kernel.org/r/1647001432-239276-4-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: libsas: Add sas_execute_internal_abort_dev()John Garry
Add support for a "device" variant of internal abort, which will abort all pending I/Os for a specific device. Link: https://lore.kernel.org/r/1647001432-239276-3-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: libsas: Add sas_execute_internal_abort_single()John Garry
The internal abort feature is common to hisi_sas and pm8001 HBAs, and the driver support is similar also, so add a common handler. Two modes of operation will be supported: - single: Abort a single tagged command - device: Abort all commands associated with a specific domain device A new protocol is added, SAS_PROTOCOL_INTERNAL_ABORT, so the common queue command API may be re-used. Only add "single" support as a first step. Link: https://lore.kernel.org/r/1647001432-239276-2-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-14scsi: lpfc: Remove failing soft_wwn supportJames Smart
The soft_wwpn/soft_wwn functionality, which allows the driver to modify service parameters in an attempt to override the adapter-assigned WWN, was originally attempted to be removed roughly 6 yrs ago as new fabric features were being introduced that clashed with the implementation. In the end, the feature was left in with the user being responsible if things went south. We've reached a point where soft_wwn is no longer functional and is failing in almost all production use cases. Use of Fabric features such as Fabric Assigned WWPN and Automatic DPORT is now prevalent and the features require coordination between the adapter and driver that can't be solved by the simplistic update of the service parameters. As it is no longer functional, the feature is to be removed. There are still ways to override the adapter-assigned WWN but they require the admin to invoke bios/efi level menus. Link: https://lore.kernel.org/r/20220310154845.11125-1-jsmart2021@gmail.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: ufs: core: scsi_get_lba() error fixPeter Wang
When ufs initializes without scmd->device->sector_size set, scsi_get_lba() will get a wrong shift number and trigger an ubsan error. The shift exponent 4294967286 is too large for the 64-bit type 'sector_t' (aka 'unsigned long long'). Call scsi_get_lba() only when opcode is READ_10/WRITE_10/UNMAP. Link: https://lore.kernel.org/r/20220307111752.10465-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: mpt3sas: Fix incorrect 4GB boundary checkSreekanth Reddy
The driver must perform its 4GB boundary check using the pool's DMA address instead of using the virtual address. Link: https://lore.kernel.org/r/20220303140230.13098-1-sreekanth.reddy@broadcom.com Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: mpt3sas: Remove scsi_dma_map() error messagesSreekanth Reddy
When scsi_dma_map() fails by returning a sges_left value less than zero, the amount of logging produced can be extremely high. In a recent end-user environment, 1200 messages per second were being sent to the log buffer. This eventually overwhelmed the system and it stalled. These error messages are not needed. Remove them. Link: https://lore.kernel.org/r/20220303140203.12642-1-sreekanth.reddy@broadcom.com Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: libfc: Fix use after free in fc_exch_abts_resp()Jianglei Nie
fc_exch_release(ep) will decrease the ep's reference count. When the reference count reaches zero, it is freed. But ep is still used in the following code, which will lead to a use after free. Return after the fc_exch_release() call to avoid use after free. Link: https://lore.kernel.org/r/20220303015115.459778-1-niejianglei2021@163.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: scsi_debug: Fix qc_lock use in sdebug_blk_mq_poll()Damien Le Moal
The use of the 'locked' boolean variable to control locking and unlocking of the qc_lock spinlock of struct sdebug_queue confuses sparse, leading to a warning about an unexpected unlock. Simplify the qc_lock lock/unlock handling code of this function to avoid this warning by removing the 'locked' boolean variable. This change also fixes unlocked access to the in_use_bm bitmap with the find_first_bit() function. Link: https://lore.kernel.org/r/20220301113009.595857-3-damien.lemoal@opensource.wdc.com Fixes: b05d4e481eff ("scsi: scsi_debug: Refine sdebug_blk_mq_poll()") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08scsi: scsi_debug: Silence unexpected unlock warningsDamien Le Moal
The return statement inside the sdeb_read_lock(), sdeb_read_unlock(), sdeb_write_lock() and sdeb_write_unlock() confuse sparse, leading to many warnings about unexpected unlocks in the resp_xxx() functions. Modify the lock/unlock functions using the __acquire() and __release() inline annotations for the sdebug_no_rwlock == true case to avoid these warnings. Link: https://lore.kernel.org/r/20220301113009.595857-2-damien.lemoal@opensource.wdc.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-08sr: implement ->free_disk to simplify refcountingChristoph Hellwig
Simplify the refcounting and remove the need to clear disk->private_data by implementing the ->free_disk method. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08sd: implement ->free_disk to simplify refcountingChristoph Hellwig
Implement the ->free_disk method to to put struct scsi_disk when the last gendisk reference count goes away. This removes the need to clear ->private_data and thus freeze the queue on unbind. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08sd: delay calling free_opal_devChristoph Hellwig
Call free_opal_dev from scsi_disk_release as the opal_dev field is accessed from the ioctl handler, which isn't synchronized vs sd_release and thus can be accessed during or after sd_release was called. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08sd: call sd_zbc_release_disk before releasing the scsi_device referenceChristoph Hellwig
sd_zbc_release_disk accesses disk->device, so ensure that actually still has a valid reference. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08sd: rename the scsi_disk.dev fieldChristoph Hellwig
dev is very hard to grep for. Give the field a more descriptive name and documents its purpose. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08scsi: don't use disk->private_data to find the scsi_driverChristoph Hellwig
Requiring every ULP to have the scsi_drive as first member of the private data is rather fragile and not necessary anyway. Just use the driver hanging off the SCSI device instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07xen/scsifront: don't use gnttab_query_foreign_access() for mapped statusJuergen Gross
It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function. In case the call was done in preparation of revoking a grant it is better to do so via gnttab_try_end_foreign_access() and check the success of that operation instead. This is CVE-2022-23038 / part of XSA-396. Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- V2: - use gnttab_try_end_foreign_access()
2022-03-01scsi: ufs: Fix runtime PM messages never-ending cycleAdrian Hunter
Kernel messages produced during runtime PM can cause a never-ending cycle because user space utilities (e.g. journald or rsyslog) write the messages back to storage, causing runtime resume, more messages, and so on. Messages that tell of things that are expected to happen, are arguably unnecessary, so suppress them. UFS driver messages are changes to from dev_err() to dev_dbg() which means they will not display unless activated by dynamic debug of building with -DDEBUG. sdev->silence_suspend is set to skip messages from sd_suspend_common() "Synchronizing SCSI cache", "Stopping disk" and scsi_report_sense() "Power-on or device reset occurred" message (Note, that message appears when the LUN is accessed after runtime PM, not during runtime PM) Example messages from Ubuntu 21.10: $ dmesg | tail [ 1620.380071] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0 [ 1620.408825] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2 [ 1620.409020] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0 [ 1620.409524] sd 0:0:0:0: Power-on or device reset occurred [ 1622.938794] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 1622.939184] ufs_device_wlun 0:0:0:49488: Power-on or device reset occurred [ 1625.183175] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0 [ 1625.208041] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2 [ 1625.208311] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0 [ 1625.209035] sd 0:0:0:0: Power-on or device reset occurred Note for stable: depends on patch "scsi: core: sd: Add silence_suspend flag to suppress some PM messages". Link: https://lore.kernel.org/r/20220228113652.970857-3-adrian.hunter@intel.com Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: core: sd: Add silence_suspend flag to suppress some PM messagesAdrian Hunter
Kernel messages produced during runtime PM can cause a never-ending cycle because user space utilities (e.g. journald or rsyslog) write the messages back to storage, causing runtime resume, more messages, and so on. Messages that tell of things that are expected to happen are arguably unnecessary, so add a flag to suppress them. This flag is used by the UFS driver. Link: https://lore.kernel.org/r/20220228113652.970857-2-adrian.hunter@intel.com Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: lpfc: Use rport as argument for lpfc_chk_tgt_mapped()Hannes Reinecke
We only need the rport structure for lpfc_chk_tgt_mapped(). Link: https://lore.kernel.org/r/20220301143718.40913-6-hare@suse.de Cc: James Smart <james.smart@broadcom.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: lpfc: Use rport as argument for lpfc_send_taskmgmt()Hannes Reinecke
Instead of passing in a scsi_cmnd we should be using the rport; we already have the target and LUN ID as parameters, so there's no need to pass the scsi_cmnd too. Link: https://lore.kernel.org/r/20220301143718.40913-5-hare@suse.de Cc: James Smart <james.smart@broadcom.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: lpfc: Use fc_block_rport()Hannes Reinecke
Use fc_block_rport() instead of fc_block_scsi_eh() as the SCSI command will be removed as argument for SCSI EH functions. Link: https://lore.kernel.org/r/20220301143718.40913-4-hare@suse.de Cc: James Smart <james.smart@broadcom.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: lpfc: Drop lpfc_no_handler()Hannes Reinecke
The default SCSI EH action for a non-existing EH callback is to return FAILED, so having a callback just returning FAILED is pointless. Link: https://lore.kernel.org/r/20220301143718.40913-3-hare@suse.de Cc: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: lpfc: Kill lpfc_bus_reset_handler()Hannes Reinecke
lpfc_bus_reset_handler() is really just a loop calling lpfc_target_reset_handler() over all targets, which is what the error handler will be doing anyway. Link: https://lore.kernel.org/r/20220301143718.40913-2-hare@suse.de Cc: James Smart <james.smart@broadcom.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: wd719x: Return proper error code when dma_set_mask() failsZheyu Ma
During the process of driver probing, the probe function should return < 0 for failure, otherwise the kernel will treat value >= 0 as success. Set 'err' to the error value returned by dma_set_mask() in case of failure. Link: https://lore.kernel.org/r/1646060055-11361-1-git-send-email-zheyuma97@gmail.com Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: Drop temp workq_nameMike Christie
When the workqueue code was created it didn't allow variable args so we have been using a temp buffer. Drop that. Link: https://lore.kernel.org/r/20220226230435.38733-7-michael.christie@oracle.com Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: Use the session workqueue for recoveryMike Christie
Use the session workqueue for recovery and unbinding. If there are delays during device blocking/cleanup then it will no longer affect other sessions. Link: https://lore.kernel.org/r/20220226230435.38733-6-michael.christie@oracle.com Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: ql4xxx: Use per-session workqueue for unbindingMike Christie
We currently allocate a workqueue per host and only use it for removing the target. For the session per host case we could be using this workqueue to be able to do recoveries (block, unblock, timeout handling) in parallel. To also allow offload drivers to do their session recoveries in parallel, this drops the per host workqueue and replaces it with a per session one. Link: https://lore.kernel.org/r/20220226230435.38733-5-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: Remove iscsi_scan_finished()Mike Christie
qla4xxx does not use iscsi_scan_finished() anymore so remove it. Link: https://lore.kernel.org/r/20220226230435.38733-4-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: Speed up session unblocking and removalMike Christie
When the iSCSI class was added upstream, blocking a queue was fast because it just set some flag bits and didn't handle I/O that was in the process of being sent to the driver. That's no longer the case so blocking a queue is expensive and we can end up with a backlog of blocks by the time we have relogged in and are trying to start the queues. For the session unblock case, this has try to cancel the block and recovery work in case they are still queued so we can avoid unneeded queue manipulations. For removal, we also now try to cancel all the recovery related works since a couple lines down we will set the session and device state so running those functions are not necessary. Link: https://lore.kernel.org/r/20220226230435.38733-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: iscsi: Fix recovery and unblocking raceMike Christie
If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active to greater than 1, the recovery_work could be running when __iscsi_unblock_session() runs. The cancel_delayed_work() will then not wait for the running work and we can race where we end up with the wrong session state and scsi_device state set. This replaces the cancel_delayed_work() with the sync version. Link: https://lore.kernel.org/r/20220226230435.38733-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: scsi_transport_fc: Fix FPIN Link Integrity statistics countersJames Smart
In the original FPIN commit, stats were incremented by the event_count. Event_count is the minimum # of events that must occur before an FPIN is sent. Thus, its not the actual number of events, and could be significantly off (too low) as it doesn't reflect anything not reported. Rather than attempt to count events, have the statistic count how many FPINS cross the threshold and were reported. Link: https://lore.kernel.org/r/20220301175536.60250-1-jsmart2021@gmail.com Fixes: 3dcfe0de5a97 ("scsi: fc: Parse FPIN packets and update statistics") Cc: <stable@vger.kernel.org> # v5.11+ Cc: Shyam Sundar <ssundar@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>