summaryrefslogtreecommitdiff
path: root/drivers/scsi/cxlflash/main.c
AgeCommit message (Collapse)Author
2019-12-02Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: aacraid, ufs, zfcp, NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx plus a whole load of minor updates and fixes. The major core changes are Al Viro's reworking of sg's handling of copy to/from user, Ming Lei's removal of the host busy counter to avoid contention in the multiqueue case and Damien Le Moal's fixing of residual tracking across error handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits) scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort() scsi: target: core: Fix a pr_debug() argument scsi: iscsi: Don't send data to unbound connection scsi: target: iscsi: Wait for all commands to finish before freeing a session scsi: target: core: Release SPC-2 reservations when closing a session scsi: target: core: Document target_cmd_size_check() scsi: bnx2i: fix potential use after free Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails" scsi: NCR5380: Add disconnect_mask module parameter scsi: NCR5380: Unconditionally clear ICR after do_abort() scsi: NCR5380: Call scsi_set_resid() on command completion scsi: scsi_debug: num_tgts must be >= 0 scsi: lpfc: use hdwq assigned cpu for allocation scsi: arcmsr: fix indentation issues scsi: qla4xxx: fix double free bug scsi: pm80xx: Modified the logic to collect fatal dump scsi: pm80xx: Tie the interrupt name to the module instance scsi: pm80xx: Controller fatal error through sysfs scsi: pm80xx: Do not request 12G sas speeds scsi: pm80xx: Cleanup command when a reset times out ...
2019-10-23compat_ioctl: move more drivers to compat_ptr_ioctlArnd Bergmann
The .ioctl and .compat_ioctl file operations have the same prototype so they can both point to the same function, which works great almost all the time when all the commands are compatible. One exception is the s390 architecture, where a compat pointer is only 31 bit wide, and converting it into a 64-bit pointer requires calling compat_ptr(). Most drivers here will never run in s390, but since we now have a generic helper for it, it's easy enough to use it consistently. I double-checked all these drivers to ensure that all ioctl arguments are used as pointers or are ignored, but are not interpreted as integer values. Acked-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: David Sterba <dsterba@suse.com> Acked-by: Darren Hart (VMware) <dvhart@infradead.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-10-22scsi: cxlflash: remove set but not used variable 'ioarcb'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/cxlflash/main.c:47:22: warning: variable ioarcb set but not used [-Wunused-but-set-variable] It is never used, so can be removed. Link: https://lore.kernel.org/r/20191021141957.18828-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-30scsi: cxlflash: Mark expected switch fall-throughsGustavo A. R. Silva
Mark switch cases where we are expecting to fall through. This patch fixes the following warnings: drivers/scsi/cxlflash/main.c: In function 'send_afu_cmd': drivers/scsi/cxlflash/main.c:2347:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (rc) { ^ drivers/scsi/cxlflash/main.c:2357:2: note: here case -EAGAIN: ^~~~ drivers/scsi/cxlflash/main.c: In function 'term_intr': drivers/scsi/cxlflash/main.c:754:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (index == PRIMARY_HWQ) ^ drivers/scsi/cxlflash/main.c:756:2: note: here case UNMAP_TWO: ^~~~ drivers/scsi/cxlflash/main.c:757:3: warning: this statement may fall through [-Wimplicit-fallthrough=] cfg->ops->unmap_afu_irq(hwq->ctx_cookie, 2, hwq); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:758:2: note: here case UNMAP_ONE: ^~~~ drivers/scsi/cxlflash/main.c:759:3: warning: this statement may fall through [-Wimplicit-fallthrough=] cfg->ops->unmap_afu_irq(hwq->ctx_cookie, 1, hwq); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:760:2: note: here case FREE_IRQ: ^~~~ drivers/scsi/cxlflash/main.c: In function 'cxlflash_remove': drivers/scsi/cxlflash/main.c:975:3: warning: this statement may fall through [-Wimplicit-fallthrough=] cxlflash_release_chrdev(cfg); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:976:2: note: here case INIT_STATE_SCSI: ^~~~ drivers/scsi/cxlflash/main.c:978:3: warning: this statement may fall through [-Wimplicit-fallthrough=] scsi_remove_host(cfg->host); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:979:2: note: here case INIT_STATE_AFU: ^~~~ drivers/scsi/cxlflash/main.c:980:3: warning: this statement may fall through [-Wimplicit-fallthrough=] term_afu(cfg); ^~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:981:2: note: here case INIT_STATE_PCI: ^~~~ drivers/scsi/cxlflash/main.c:983:3: warning: this statement may fall through [-Wimplicit-fallthrough=] pci_disable_device(pdev); ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/cxlflash/main.c:984:2: note: here case INIT_STATE_NONE: ^~~~ drivers/scsi/cxlflash/main.c: In function 'num_hwqs_store': drivers/scsi/cxlflash/main.c:3018:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (cfg->state == STATE_NORMAL) ^ drivers/scsi/cxlflash/main.c:3020:2: note: here default: ^~~~~~~ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-09Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. This is a major simplification for block and mq in particular" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits) scsi: cxgb4i: validate tcp sequence number only if chip version <= T5 scsi: cxgb4i: get pf number from lldi->pf scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c scsi: mpt3sas: Add missing breaks in switch statements scsi: aacraid: Fix missing break in switch statement scsi: kill command serial number scsi: csiostor: drop serial_number usage scsi: mvumi: use request tag instead of serial_number scsi: dpt_i2o: remove serial number usage scsi: st: osst: Remove negative constant left-shifts scsi: ufs-bsg: Allow reading descriptors scsi: ufs: Allow reading descriptor via raw upiu scsi: ufs-bsg: Change the calling convention for write descriptor scsi: ufs: Remove unused device quirks Revert "scsi: ufs: disable vccq if it's not needed by UFS device" scsi: megaraid_sas: Remove a bunch of set but not used variables scsi: clean obsolete return values of eh_timed_out scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: MAINTAINERS: SCSI initiator and target tweaks scsi: fcoe: make use of fip_mode enum complete ...
2019-02-08scsi: ata: Use unsigned int for cmd's type in ioctls in scsi_host_templateNathan Chancellor
Clang warns several times in the scsi subsystem (trimmed for brevity): drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to switch condition type (2147762695 to 18446744071562347015) [-Wswitch] case CCISS_GETBUSTYPES: ^ drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to switch condition type (2147762694 to 18446744071562347014) [-Wswitch] case CCISS_GETHEARTBEAT: ^ The root cause is that the _IOC macro can generate really large numbers, which don't fit into type 'int', which is used for the cmd parameter in the ioctls in scsi_host_template. My research into how GCC and Clang are handling this at a low level didn't prove fruitful. However, looking at the rest of the kernel tree, all ioctls use an 'unsigned int' for the cmd parameter, which will fit all of the _IOC values in the scsi/ata subsystems. Make that change because none of the ioctls expect a negative value for any command, it brings the ioctls inline with the reset of the kernel, and it removes ambiguity, which is never good when dealing with compilers. Link: https://github.com/ClangBuiltLinux/linux/issues/85 Link: https://github.com/ClangBuiltLinux/linux/issues/154 Link: https://github.com/ClangBuiltLinux/linux/issues/157 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Bradley Grove <bgrove@attotech.com> Acked-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-04scsi: cxlflash: Prevent deadlock when adapter probe failsVaibhav Jain
Presently when an error is encountered during probe of the cxlflash adapter, a deadlock is seen with cpu thread stuck inside cxlflash_remove(). Below is the trace of the deadlock as logged by khungtaskd: cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16 INFO: task kworker/80:1:890 blocked for more than 120 seconds. Not tainted 5.0.0-rc4-capi2-kexec+ #2 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/80:1 D 0 890 2 0x00000808 Workqueue: events work_for_cpu_fn Call Trace: 0x4d72136320 (unreliable) __switch_to+0x2cc/0x460 __schedule+0x2bc/0xac0 schedule+0x40/0xb0 cxlflash_remove+0xec/0x640 [cxlflash] cxlflash_probe+0x370/0x8f0 [cxlflash] local_pci_probe+0x6c/0x140 work_for_cpu_fn+0x38/0x60 process_one_work+0x260/0x530 worker_thread+0x280/0x5d0 kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x80 INFO: task systemd-udevd:5160 blocked for more than 120 seconds. The deadlock occurs as cxlflash_remove() is called from cxlflash_probe() without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never successfully probed the 'cxlflash_cfg->state' never changes from STATE_PROBING hence the deadlock occurs. We fix this deadlock by setting the variable 'cxlflash_cfg->state' to STATE_PROBED in case an error occurs during cxlflash_probe() and just before calling cxlflash_remove(). Cc: stable@vger.kernel.org Fixes: c21e0bbfc485("cxlflash: Base support for IBM CXL Flash Adapter") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-28Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: smarpqi, lpfc, qedi, megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have a pile of annotation, unused variable and minor updates. The big API change is the updates for Christoph's DMA rework which include removing the DISABLE_CLUSTERING flag. And finally there are a couple of target tree updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits) scsi: isci: request: mark expected switch fall-through scsi: isci: remote_node_context: mark expected switch fall-throughs scsi: isci: remote_device: Mark expected switch fall-throughs scsi: isci: phy: Mark expected switch fall-through scsi: iscsi: Capture iscsi debug messages using tracepoints scsi: myrb: Mark expected switch fall-throughs scsi: megaraid: fix out-of-bound array accesses scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through scsi: fcoe: remove set but not used variable 'port' scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown() scsi: smartpqi: fix build warnings scsi: smartpqi: update driver version scsi: smartpqi: add ofa support scsi: smartpqi: increase fw status register read timeout scsi: smartpqi: bump driver version scsi: smartpqi: add smp_utils support scsi: smartpqi: correct lun reset issues scsi: smartpqi: correct volume status scsi: smartpqi: do not offline disks for transient did no connect conditions scsi: smartpqi: allow for larger raid maps ...
2018-12-18scsi: flip the default on use_clusteringChristoph Hellwig
Most SCSI drivers want to enable "clustering", that is merging of segments so that they might span more than a single page. Remove the ENABLE_CLUSTERING define, and require drivers to explicitly set DISABLE_CLUSTERING to disable this feature. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-07scsi: kill off the legacy IO pathJens Axboe
This removes the legacy (non-mq) IO path for SCSI. Cc: linux-scsi@vger.kernel.org Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-18scsi: cxlflash: Isolate external module dependenciesUma Krishnan
Depending on the underlying transport, cxlflash has a dependency on either the CXL or OCXL drivers, which are enabled via their Kconfig option. Instead of having a module wide dependency on these config options, it is better to isolate the object modules that are dependent on the CXL and OCXL drivers and adjust the module dependencies accordingly. This commit isolates the object files that are dependent on CXL and/or OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef to avoid compilation errors. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: cxlflash: Abstract hardware dependent assignmentsUma Krishnan
As a staging cleanup to support transport specific builds of the cxlflash module, relocate device dependent assignments to header files. This will avoid littering the core driver with conditional compilation logic. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: cxlflash: Use local mutex for AFU serializationMatthew R. Ochs
AFUs can only process a single AFU command at a time. This is enforced with a global mutex situated within the AFU send routine. As this mutex has a global scope, it has the potential to unnecessarily block commands destined for other AFUs. Instead of using a global mutex, transition the mutex to be per-AFU. This will allow commands to only be blocked by siblings of the same AFU. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: cxlflash: Limit the debug logs in the IO pathUma Krishnan
The kernel log can get filled with debug messages from send_cmd_ioarrin() when dynamic debug is enabled for the cxlflash module and there is a lot of legacy I/O traffic. While these messages are necessary to debug issues that involve command tracking, the abundance of data can overwrite other useful data in the log. The best option available is to limit the messages that should serve most of the common use cases. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: cxlflash: Yield to active send threadsUma Krishnan
The following Oops may be encountered if the device is reset, i.e. EEH recovery, while there is heavy I/O traffic: 59:mon> t [c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500 [cxlflash] [c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0 [c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0 [c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0 [c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0 [c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0 [c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0 [c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520 [c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610 [c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280 [c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30 [c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0 [c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00 [c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130 [c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c When there is no room to send the I/O request, the cached room is refreshed by reading the memory mapped command room value from the AFU. The AFU register mapping is refreshed during a reset, creating a race condition that can lead to the Oops above. During a device reset, the AFU should not be unmapped until all the active send threads quiesce. An atomic counter, cmds_active, is currently used to track internal AFU commands and quiesce during reset. This same counter can also be used for the active send threads. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Handle spurious interruptsUma Krishnan
The following Oops can occur when there is heavy I/O traffic and the host is reset by a tool such as sg_reset. [c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500 [cxlflash] (unreliable) [c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash] [c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310 [c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90 [c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0 [c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230 [c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70 [c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0 [c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24 [c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130 [c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120 When a context is reset, the pending commands are flushed and the AFU is notified. Before the AFU handles this request there could be command completion interrupts queued to PHB which are yet to be delivered to the context. In this scenario, a context could receive an interrupt for a command that has been flushed, leading to a possible crash when the memory for the flushed command is accessed. To resolve this problem, a boolean will indicate if the hardware queue is ready to process interrupts or not. This can be evaluated in the interrupt handler before proessing an interrupt. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Remove commmands from pending list on timeoutUma Krishnan
The following Oops can occur if an internal command sent to the AFU does not complete within the timeout: [c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash] [c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash] [c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230 [cxlflash] [c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150[cxl] [c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl] [c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0 [c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170 [c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580 [c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338 [c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200 [c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0 [c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4 When an internal command times out, the command buffer is freed while it is still in the pending commands list of the context. This corrupts the list and when the context is cleaned up, a crash is encountered. To resolve this issue, when an AFU command or TMF command times out, the command should be deleted from the hardware queue pending command list before freeing the buffer. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Synchronize reset and remove opsUma Krishnan
The following Oops can be encountered if a device removal or system shutdown is initiated while an EEH recovery is in process: [c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100 [cxlflash] [c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl] [c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170 [c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170 [c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580 [c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338 [c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200 [c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0 [c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4 The remove handler frees AFU memory while the EEH recovery is in progress, leading to a race condition. This can result in a crash if the recovery thread tries to access this memory. To resolve this issue, the cxlflash remove handler will evaluate the device state and yield to any active reset or probing threads. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Enable OCXL operationsUma Krishnan
This commit enables the OCXL operations for the OCXL devices. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Setup LISNs for master contextsUma Krishnan
Similar to user contexts, master contexts also require that the per-context LISN registers be programmed for certain AFUs. The mapped trigger page is obtained from underlying transport and registered with AFU for each master context. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Hardware AFU for OCXLUma Krishnan
When an adapter is initialized, transport specific configuration and MMIO mapping details need to be saved. For CXL, this data is managed by the underlying kernel module. To maintain a separation between the cxlflash core and underlying transports, introduce a new structure to store data specific to the OCXL AFU. Initially only the pointers to underlying PCI and generic devices are added to this new structure - it will be expanded further in future commits. Services to create and destroy this hardware AFU are added and integrated in the probe and exit paths of the driver. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Avoid clobbering context control register valueMatthew R. Ochs
The SISLite specification originally defined the context control register with a single field of bits to represent the LISN and also stipulated that the register reset value be 0. The cxlflash driver took advantage of this when programming the LISN for the master contexts via an unconditional write - no other bits were preserved. When unmap support was added, SISLite was updated to define bit 0 of the context control register as a way for the AFU to notify the context owner that unmap operations were supported. Thus the assumptions under which the register is setup changed and the existing unconditional write is clobbering the unmap state for master contexts. This is presently not an issue due to the order in which the context control register is programmed in relation to the unmap bit being queried but should be addressed to avoid a future regression in the event this code is moved elsewhere. To remedy this issue, preserve the bits when programming the LISN field in the context control register. Since the LISN will now be programmed using a read value, assert that the initial state of the LISN field is as described in SISLite (0). Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: cxlflash: Preserve number of interrupts for master contextsUma Krishnan
The number of interrupts requested for user contexts are stored in the context specific structures and utilized to manage the interrupts. For the master contexts, this number is only used once and therefore not saved. To prepare for future commits where the number of interrupts will be required in more than one place, preserve the value in the master context structure. [mkp: typo in comment] Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: cxlflash: Staging to support future acceleratorsMatthew R. Ochs
As staging to support future accelerator transports, add a shim layer such that the underlying services the cxlflash driver requires can be conditional upon the accelerator infrastructure. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: cxlflash: Adapter context init can return errorUma Krishnan
Adapter context creation can return either NULL or an error pointer. Updating the check condition to reflect this. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: cxlflash: Remove embedded CXL work structuresMatthew R. Ochs
The CXL-specific work structure used to request the number of interrupts currently resides as a nested member of both the context information and hardware queue structures. It is used to cache values (specifically the number of interrupts) required by the CXL layer when starting a context. To facilitate staging that will ultimately allow the cxlflash core to become agnostic of the underlying accelerator transport, remove these embedded work structures. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: cxlflash: Update cxl-specific arguments to generic cookieUma Krishnan
Convert cxl-specific pointers to generic cookies to facilitate future enhancements. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-10scsi: cxlflash: Reset command ioascUma Krishnan
In the event of a command failure, cxlflash returns the failure to the upper layers to process. After processing the error, when the command is queued again, the private command structure will not be zeroed and the ioasc could be stale. Per the SISLite specification, the AFU only sets the ioasc in the presence of a failure. Thus, even though the original command succeeds the second time, the command is considered a failure due to stale ioasc. This cycle repeats indefinitely and can cause a hang or IO failure. To fix the issue, clear the ioasc before queuing any command. [mkp: added Cc: stable per request] Fixes: 479ad8e9d48c ("scsi: cxlflash: Remove zeroing of private command data") Cc: <stable@vger.kernel.org> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-17Merge branch 'misc.compat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull compat and uaccess updates from Al Viro: - {get,put}_compat_sigset() series - assorted compat ioctl stuff - more set_fs() elimination - a few more timespec64 conversions - several removals of pointless access_ok() in places where it was followed only by non-__ variants of primitives * 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits) coredump: call do_unlinkat directly instead of sys_unlink fs: expose do_unlinkat for built-in callers ext4: take handling of EXT4_IOC_GROUP_ADD into a helper, get rid of set_fs() ipmi: get rid of pointless access_ok() pi433: sanitize ioctl cxlflash: get rid of pointless access_ok() mtdchar: get rid of pointless access_ok() r128: switch compat ioctls to drm_ioctl_kernel() selection: get rid of field-by-field copyin VT_RESIZEX: get rid of field-by-field copyin i2c compat ioctls: move to ->compat_ioctl() sched_rr_get_interval(): move compat to native, get rid of set_fs() mips: switch to {get,put}_compat_sigset() sparc: switch to {get,put}_compat_sigset() s390: switch to {get,put}_compat_sigset() ppc: switch to {get,put}_compat_sigset() parisc: switch to {get,put}_compat_sigset() get_compat_sigset() get rid of {get,put}_compat_itimerspec() io_getevents: Use timespec64 to represent timeouts ...
2017-10-31scsi: cxlflash: Allow cards without WWPN VPD to configureMatthew R. Ochs
Currently, all adapters that cxlflash supports must have WWPN VPD keywords to complete configuration. This was required as cards with external FC ports needed to be programmed with WWPNs. Newer supported cards do not have an external FC interface and therefore do not require WWPN. To support backwards compatibility, these devices have included 'dummy' WWPN VPD with WWPN values of zero. This however places a dependency that all future cards have WWPN VPD, which may not always be the case. Allow for cards to not have WWPN, designating which cards are expected to have it in order to configure properly. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-17cxlflash: get rid of pointless access_ok()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-08-25scsi: cxlflash: Remove unnecessary existence checkMatthew R. Ochs
The AFU termination sequence has been refactored over time such that the main tear down routine, term_afu(), can no longer can be invoked with a NULL AFU pointer. Remove the unnecessary existence check from term_afu(). Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-12scsi: cxlflash: return -EFAULT if copy_from_user() failsDan Carpenter
The copy_from/to_user() functions return the number of bytes remaining to be copied but we had intended to return -EFAULT here. Fixes: bc88ac47d5cb ("scsi: cxlflash: Support AFU debug") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01scsi: cxlflash: Update debug prints in reset handlersMatthew R. Ochs
The device and host reset handler contain debug prints to help identify the entities being reset. Today these reset handlers are based on a SCSI EH design that uses a SCSI command reference as a means of identifying the target entity. As such, the debug trace includes the SCSI command pointer and associated CDB. This is not necessary as the SCSI command is simply the messenger in these scenarios. Refactor the debug prints in the host and reset handlers to only present information that is applicable given the function scope. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01scsi: cxlflash: Update send_tmf() parametersMatthew R. Ochs
The current send_tmf() implementation is based on the caller providing a SCSI command reference. In reality all that is needed is a SCSI device reference as the routine uses a private command. Refactor send_tmf() to pass the private adapter configuration reference and a SCSI device reference. As a nice side effect, this will ease the burden of converting caller routines to be based solely off of a SCSI device reference. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-01scsi: cxlflash: Avoid double free of character deviceMatthew R. Ochs
The device_unregister() service used when cleaning up the character device is already responsible for the internal state associated with the device upon successful creation. As the cxlflash driver does not obtain a second reference to the character device, the explicit call to put_device() is not required and can lead to an inconsistent sysfs among other issues as the reference is no longer valid after the first put_device() is performed. Remove the unnecessary put_device() to remedy this issue. Fixes: a834a36b57d9 ("scsi: cxlflash: Create character device to provide host management interface") Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Update TMF command processingMatthew R. Ochs
Currently, the SCSI command presented to the device reset handler is used to send TMFs to the AFU for a device reset. This behavior is incorrect as the command presented is an actual command and not a special notification. As such, it should only be used for reference and not be acted upon. Additionally, the existing TMF transmission routine does not account for actual errors from the hardware, only reflecting failure when a timeout occurs. This can lead to a condition where the device reset handler is presented with a false 'success'. Update send_tmf() to dynamically allocate a private command for sending the TMF command and properly reflect failure when the completed command indicates an error or was aborted. Detect TMF commands during response processing and avoid scsi_done() for these types of commands. Lastly, update comments in the TMF processing paths to describe the new behavior. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Remove zeroing of private command dataMatthew R. Ochs
The SCSI core now zeroes the per-command private data area prior to calling into the LLD. Replace the clearing operation that takes place when the private command data reference is obtained with a routine that performs common initializations. The zeroing that takes place in the device reset path remains intact as the private command data associated with the specified SCSI command is not guaranteed to be cleared. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Support WS16 unmapMatthew R. Ochs
The cxlflash driver supports performing a write-same16 to scrub virtual luns when they are released by a user. To date, AFUs for adapters that are supported by cxlflash do not have the capability to unmap as part of the WS operation. This can lead to fragmented flash devices which results in performance degradation. Future AFUs can optionally support unmap write-same commands and reflects this support via the context control register. This provides userspace applications with direct visibility such that they need not depend on a host API. Detect unmap support during cxlflash initialization by reading the context control register associated with the primary hardware queue. Update the existing write_same16() routine to set the unmap bit in the CDB when unmap is supported by the host. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Support AFU debugMatthew R. Ochs
Adopt the SISLite AFU debug capability to allow future CXL Flash adapters the ability to better debug AFU issues. Update the SISLite header with the changes necessary to support AFU debug operations and create a host ioctl interface for user debug software. Also update the cxlflash documentation to describe this new host ioctl. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Support LUN provisioningMatthew R. Ochs
Adopt the SISLite AFU LUN provisioning capability to allow future CXL Flash adapters the ability to better manage storage. Update the SISLite header with the changes necessary to support LUN provision operations and create a host ioctl interface for user LUN management software. Also update the cxlflash documentation to describe this new host ioctl. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Introduce host ioctl supportMatthew R. Ochs
As staging for supporting various host management functions, add a host ioctl infrastructure to filter ioctl commands and perform operations that are common for all host ioctls. Also update the cxlflash documentation to create a new section for documenting host ioctls. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Separate AFU internal command handling from AFU sync specificsMatthew R. Ochs
To date the only supported internal AFU command is AFU sync. The logic to send an internal AFU command is embedded in the specific AFU sync handler and would need to be duplicated for new internal AFU commands. In order to support new internal AFU commands, separate code that is common for AFU internal commands into a generic transmission routine and support passing back command status through an IOASA structure. The first user of this new routine is the existing AFU sync command. As a cleanup, use a descriptive name for the AFU sync command instead of a magic number. Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Create character device to provide host management interfaceUma Krishnan
The cxlflash driver currently lacks host management interface. Future devices supported by cxlflash will provide a variety of host-wide management functions. Examples include LUN provisioning, hardware debug support, and firmware download. In order to provide a way to manage the device, a character device will be created during probe of each adapter. This device will support a set of ioctls defined in the SISLite specification from which administrators can manage the adapter. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Add scsi command abort handlerUma Krishnan
To date, CXL flash devices do not support a single command abort operation. Instead, the SISLite specification provides a context reset operation to cleanup all pending commands for a given context. When a context reset is successful, it is guaranteed that the AFU has aborted all currently pending I/O. This sequence is less invasive than a device or host reset and can be executed to support scsi command abort requests. Add eh_abort_handler callback support to process command timeouts and abort requests. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Flush pending commands in cleanup pathUma Krishnan
When the AFU is reset in an error path, pending scsi commands can be silently dropped without completion or a formal abort. This puts the onus on the cxlflash driver to notify mid-layer and indicating that the command can be retried. Once the card has been quiesced, the hardware send queue lock is acquired to prevent any data movement while the pending commands are processed. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Track pending scsi commands in each hardware queueUma Krishnan
Currently, there is no book keeping of the pending scsi commands in the cxlflash driver. This lack of tracking in-flight requests is too restrictive and requires a heavy-hammer reset each time an adapter error is encountered. Additionally, it does not allow for commands to be properly retried. In order to avoid this problem and to better handle error path command cleanup, introduce a linked list for each hardware queue that tracks pending commands. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Schedule asynchronous reset of the hostUma Krishnan
A context reset failure indicates the AFU is in a bad state. At present, when such a situation occurs, no further action is taken. This leaves the adapter in an unusable state with no recoverable actions. To avoid this situation, context reset failures will be escalated to a host reset operation. This will be done asynchronously to allow the acting thread to return to the user with a failure. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26scsi: cxlflash: Reset hardware queue context via specified registerUma Krishnan
Per the SISLite specification, context_reset() writes 0x1 to the LSB of the reset register. When the AFU processes this reset request, it is expected to clear the bit after reset is complete. The current implementation simply checks that the entire value read back is not 1, instead of masking off the LSB and evaluating it for a change to 0. Should the AFU manipulate other bits during the reset (reading back a value of 0xF for example), successful completion will be prematurely indicated given the existing logic. Additionally, in the event that the context reset operation fails, there does not currently exist a way to provide feedback to the initiator of the reset. This poses a problem for the rare case that a context reset fails as the caller will proceed on the assumption that all is well. To remedy these issues, refactor the context reset routine to only mask off the LSB when evaluating for success and return status to the caller. Also update the context reset handler parameters to pass a hardware queue reference instead of a single command to better reflect that the entire queue associated with the context is impacted by the reset. Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>