summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2024-12-04scsi: qla2xxx: Fix use after free on unloadQuinn Tran
System crash is observed with stack trace warning of use after free. There are 2 signals to tell dpc_thread to terminate (UNLOADING flag and kthread_stop). On setting the UNLOADING flag when dpc_thread happens to run at the time and sees the flag, this causes dpc_thread to exit and clean up itself. When kthread_stop is called for final cleanup, this causes use after free. Remove UNLOADING signal to terminate dpc_thread. Use the kthread_stop as the main signal to exit dpc_thread. [596663.812935] kernel BUG at mm/slub.c:294! [596663.812950] invalid opcode: 0000 [#1] SMP PTI [596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x86_64 #1 [596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 [596663.812974] RIP: 0010:__slab_free+0x17d/0x360 ... [596663.813008] Call Trace: [596663.813022] ? __dentry_kill+0x121/0x170 [596663.813030] ? _cond_resched+0x15/0x30 [596663.813034] ? _cond_resched+0x15/0x30 [596663.813039] ? wait_for_completion+0x35/0x190 [596663.813048] ? try_to_wake_up+0x63/0x540 [596663.813055] free_task+0x5a/0x60 [596663.813061] kthread_stop+0xf3/0x100 [596663.813103] qla2x00_remove_one+0x284/0x440 [qla2xxx] Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: qla2xxx: Fix abort in bsg timeoutQuinn Tran
Current abort of bsg on timeout prematurely clears the outstanding_cmds[]. Abort does not allow FW to return the IOCB/SRB. In addition, bsg_job_done() is not called to return the BSG (i.e. leak). Abort the outstanding bsg/SRB and wait for the completion. The completion IOCB will wake up the bsg_timeout thread. If abort is not successful, then driver will forcibly call bsg_job_done() and free the srb. Err Inject: - qaucli -z - assign CT Passthru IOCB's NportHandle with another initiator nport handle to trigger timeout. Remote port will drop CT request. - bsg_job_done is properly called as part of cleanup kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f0838 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-507c:7: Abort command issued - hdl=4b, type=5 kernel: qla2xxx [0000:21:00.1]-5040:7: ELS-CT pass-through-ct pass-through error hdl=4b comp_status-status=0x5 error subcode 1=0x0 error subcode 2=0xaf882e80. kernel: qla2xxx [0000:21:00.1]-7009:7: qla2x00_bsg_job_done: sp hdl 4b, result=70000 bsg ptr ffff9971a42f0838 kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=4b rval=0 kernel: qla2xxx [0000:21:00.1]-708a:7: bsg abort success. bsg ffff9971a42f0838 sp=ffff99760b87ba80 handle=0x4b kernel: qla2xxx [0000:21:00.1]-7012:7: qla2x00_process_ct : 286 : Error Inject. kernel: qla2xxx [0000:21:00.1]-7016:7: bsg rqst type: FC_BSG_HST_CT else type: 101 - loop-id=1 portid=fffffa. kernel: qla2xxx [0000:21:00.1]-70bb:7: qla24xx_bsg_timeout CMD timeout. bsg ptr ffff9971a42f43b8 msgcode 80000004 vendor cmd fa010000 kernel: qla2xxx [0000:21:00.1]-7012:7: qla_bsg_found : 2206 : Error Inject 2. kernel: qla2xxx [0000:21:00.1]-802c:7: Aborting bsg ffff9971a42f43b8 sp=ffff99762c304440 handle=5e rval=5 kernel: qla2xxx [0000:21:00.1]-704f:7: bsg abort fail. bsg=ffff9971a42f43b8 sp=ffff99762c304440 rval=5. kernel: qla2xxx [0000:21:00.1]-7051:7: qla_bsg_found bsg_job_done : bsg ffff9971a42f43b8 result 0xfffffffa sp ffff99762c304440. Cc: stable@vger.kernel.org Fixes: c449b4198701 ("scsi: qla2xxx: Use QP lock to search for bsg") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20241115130313.46826-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Update driver version to 8.12.0.3.50Ranjan Kumar
Update driver version to 8.12.0.3.50. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-6-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Handling of fault code for insufficient powerRanjan Kumar
Before retrying initialization, check and abort if the fault code indicates insufficient power. Also mark the controller as unrecoverable instead of issuing reset in the watch dog timer if the fault code indicates insufficient power. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Start controller indexing from 0Ranjan Kumar
Instead of displaying the controller index starting from '1' make the driver display the controller index starting from '0'. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfsRanjan Kumar
The driver, through the SAS transport, exposes a sysfs interface to enable/disable PHYs in a controller/expander setup. When multiple PHYs are disabled and enabled in rapid succession, the persistent and current config pages related to SAS IO unit/SAS Expander pages could get corrupted. Use separate memory for each config request. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpi3mr: Synchronize access to ioctl data bufferRanjan Kumar
The driver serializes ioctls through a mutex lock but access to the ioctl data buffer is not guarded by the mutex. This results in multiple user threads being able to write to the driver's ioctl buffer simultaneously. Protect the ioctl buffer with the ioctl mutex. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpt3sas: Update driver version to 51.100.00.00Ranjan Kumar
Update driver version to 51.100.00.00. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110173341.11595-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-04scsi: mpt3sas: Diag-Reset when Doorbell-In-Use bit is set during driver load ↵Ranjan Kumar
time Issue a Diag-Reset when the "Doorbell-In-Use" bit is set during the driver load/initialization. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110173341.11595-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-12-02Merge branch '6.13/scsi-queue' into 6.13/scsi-fixesMartin K. Petersen
Pull in outstanding changes from 6.13/scsi-queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-29Merge tag 'driver-core-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of driver core changes for 6.13-rc1. Nothing major for this merge cycle, except for the two simple merge conflicts are here just to make life interesting. Included in here are: - sysfs core changes and preparations for more sysfs api cleanups that can come through all driver trees after -rc1 is out - fw_devlink fixes based on many reports and debugging sessions - list_for_each_reverse() removal, no one was using it! - last-minute seq_printf() format string bug found and fixed in many drivers all at once. - minor bugfixes and changes full details in the shortlog" * tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Fix a potential abuse of seq_printf() format string in drivers cpu: Remove spurious NULL in attribute_group definition s390/con3215: Remove spurious NULL in attribute_group definition perf: arm-ni: Remove spurious NULL in attribute_group definition driver core: Constify bin_attribute definitions sysfs: attribute_group: allow registration of const bin_attribute firmware_loader: Fix possible resource leak in fw_log_firmware_info() drivers: core: fw_devlink: Fix excess parameter description in docstring driver core: class: Correct WARN() message in APIs class_(for_each|find)_device() cacheinfo: Use of_property_present() for non-boolean properties cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap() drivers: core: fw_devlink: Make the error message a bit more useful phy: tegra: xusb: Set fwnode for xusb port devices drm: display: Set fwnode for aux bus devices driver core: fw_devlink: Stop trying to optimize cycle detection logic driver core: Constify attribute arguments of binary attributes sysfs: bin_attribute: add const read/write callback variants sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR() sysfs: treewide: constify attribute callback of bin_attribute::llseek() sysfs: treewide: constify attribute callback of bin_attribute::mmap() ...
2024-11-25Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, lpfc, hisi_sas, st). Amazingly enough, no core changes with the biggest set of driver changes being ufs (which conflicted with it's own fixes a bit, hence the merges) and the rest being minor fixes and updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (97 commits) scsi: st: New session only when Unit Attention for new tape scsi: st: Add MTIOCGET and MTLOAD to ioctls allowed after device reset scsi: st: Don't modify unknown block number in MTIOCGET scsi: ufs: core: Restore SM8650 support scsi: sun3: Mark driver struct with __refdata to prevent section mismatch scsi: sg: Enable runtime power management scsi: qedi: Fix a possible memory leak in qedi_alloc_and_init_sb() scsi: qedf: Fix a possible memory leak in qedf_alloc_and_init_sb() scsi: fusion: Remove unused variable 'rc' scsi: bfa: Fix use-after-free in bfad_im_module_exit() scsi: esas2r: Remove unused esas2r_build_cli_req() scsi: target: Fix incorrect function name in pscsi_create_type_disk() scsi: ufs: Replace deprecated PCI functions scsi: Switch back to struct platform_driver::remove() scsi: pm8001: Increase request sg length to support 4MiB requests scsi: pm8001: Initialize devices in pm8001_alloc_dev() scsi: pm8001: Use module param to set pcs event log severity scsi: ufs: ufs-mediatek: Configure individual LU queue flags scsi: MAINTAINERS: Update UFS Exynos entry scsi: lpfc: Copyright updates for 14.4.0.6 patches ...
2024-11-20scsi: megaraid_sas: Fix for a potential deadlockTomas Henzl
This fixes a 'possible circular locking dependency detected' warning CPU0 CPU1 ---- ---- lock(&instance->reset_mutex); lock(&shost->scan_mutex); lock(&instance->reset_mutex); lock(&shost->scan_mutex); Fix this by temporarily releasing the reset_mutex. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20240923174833.45345-1-thenzl@redhat.com Acked-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-20scsi: lpfc: Fix spelling errors 'asynchronously'liujing
Signed-off-by: liujing <liujing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241108070935.10427-1-liujing@cmss.chinamobile.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-20scsi: bfa: Remove unused parsersDr. David Alan Gilbert
bfa has a set of structure parsers, of which quite a few are unused. Remove the unused set. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241117135215.38771-3-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-20scsi: bfa: Remove unused structure buildersDr. David Alan Gilbert
bfa has a large set of structure builders, of which only about 60% are used; remove the rest. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241117135215.38771-2-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-20scsi: qla1280: Fix hw revision numbering for ISP1020/1040Magnus Lindholm
Fix the hardware revision numbering for Qlogic ISP1020/1040 boards. HWMASK suggests that the revision number only needs four bits, this is consistent with how NetBSD does things in their ISP driver. Verified on a IPS1040B which is seen as rev 5 not as BIT_4. Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20241113225636.2276-1-linmag7@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-20Merge branch '6.12/scsi-fixes' into 6.13/scsi-stagingMartin K. Petersen
Pull in the fixes branch to resolve conflict in UFS core. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-19Merge tag 'irq-core-2024-11-18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: "Tree wide: - Make nr_irqs static to the core code and provide accessor functions to remove existing and prevent future aliasing problems with local variables or function arguments of the same name. Core code: - Prevent freeing an interrupt in the devres code which is not managed by devres in the first place. - Use seq_put_decimal_ull_width() for decimal values output in /proc/interrupts which increases performance significantly as it avoids parsing the format strings over and over. - Optimize raising the timer and hrtimer soft interrupts by using the 'set bit only' variants instead of the combined version which checks whether ksoftirqd should be woken up. The latter is a pointless exercise as both soft interrupts are raised in the context of the timer interrupt and therefore never wake up ksoftirqd. - Delegate timer/hrtimer soft interrupt processing to a dedicated thread on RT. Timer and hrtimer soft interrupts are always processed in ksoftirqd on RT enabled kernels. This can lead to high latencies when other soft interrupts are delegated to ksoftirqd as well. The separate thread allows to run them seperately under a RT scheduling policy to reduce the latency overhead. Drivers: - New drivers or extensions of existing drivers to support Renesas RZ/V2H(P), Aspeed AST27XX, T-HEAD C900 and ATMEL sam9x7 interrupt chips - Support for multi-cluster GICs on MIPS. MIPS CPUs can come with multiple CPU clusters, where each CPU cluster has its own GIC (Generic Interrupt Controller). This requires to access the GIC of a remote cluster through a redirect register block. This is encapsulated into a set of helper functions to keep the complexity out of the actual code paths which handle the GIC details. - Support for encrypted guests in the ARM GICV3 ITS driver The ITS page needs to be shared with the hypervisor and therefore must be decrypted. - Small cleanups and fixes all over the place" * tag 'irq-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) irqchip/riscv-aplic: Prevent crash when MSI domain is missing genirq/proc: Use seq_put_decimal_ull_width() for decimal values softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT. timers: Use __raise_softirq_irqoff() to raise the softirq. hrtimer: Use __raise_softirq_irqoff() to raise the softirq riscv: defconfig: Enable T-HEAD C900 ACLINT SSWI drivers irqchip: Add T-HEAD C900 ACLINT SSWI driver dt-bindings: interrupt-controller: Add T-HEAD C900 ACLINT SSWI device irqchip/stm32mp-exti: Use of_property_present() for non-boolean properties irqchip/mips-gic: Fix selection of GENERIC_IRQ_EFFECTIVE_AFF_MASK irqchip/mips-gic: Prevent indirect access to clusters without CPU cores irqchip/mips-gic: Multi-cluster support irqchip/mips-gic: Setup defaults in each cluster irqchip/mips-gic: Support multi-cluster in for_each_online_cpu_gic() irqchip/mips-gic: Replace open coded online CPU iterations genirq/irqdesc: Use str_enabled_disabled() helper in wakeup_show() genirq/devres: Don't free interrupt which is not managed by devres irqchip/gic-v3-its: Fix over allocation in itt_alloc_pool() irqchip/aspeed-intc: Add AST27XX INTC support dt-bindings: interrupt-controller: Add support for ASPEED AST27XX INTC ...
2024-11-18Merge tag 'for-6.13/block-20241118' of git://git.kernel.dk/linuxLinus Torvalds
Pull block updates from Jens Axboe: - NVMe updates via Keith: - Use uring_cmd helper (Pavel) - Host Memory Buffer allocation enhancements (Christoph) - Target persistent reservation support (Guixin) - Persistent reservation tracing (Guixen) - NVMe 2.1 specification support (Keith) - Rotational Meta Support (Matias, Wang, Keith) - Volatile cache detection enhancment (Guixen) - MD updates via Song: - Maintainers update - raid5 sync IO fix - Enhance handling of faulty and blocked devices - raid5-ppl atomic improvement - md-bitmap fix - Support for manually defining embedded partition tables - Zone append fixes and cleanups - Stop sending the queued requests in the plug list to the driver ->queue_rqs() handle in reverse order. - Zoned write plug cleanups - Cleanups disk stats tracking and add support for disk stats for passthrough IO - Add preparatory support for file system atomic writes - Add lockdep support for queue freezing. Already found a bunch of issues, and some fixes for that are in here. More will be coming. - Fix race between queue stopping/quiescing and IO queueing - ublk recovery improvements - Fix ublk mmap for 64k pages - Various fixes and cleanups * tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux: (118 commits) MAINTAINERS: Update git tree for mdraid subsystem block: make struct rq_list available for !CONFIG_BLOCK block/genhd: use seq_put_decimal_ull for diskstats decimal values block: don't reorder requests in blk_mq_add_to_batch block: don't reorder requests in blk_add_rq_to_plug block: add a rq_list type block: remove rq_list_move virtio_blk: reverse request order in virtio_queue_rqs nvme-pci: reverse request order in nvme_queue_rqs btrfs: validate queue limits block: export blk_validate_limits nvmet: add tracing of reservation commands nvme: parse reservation commands's action and rtype to string nvmet: report ns's vwc not present md/raid5: Increase r5conf.cache_name size block: remove the ioprio field from struct request block: remove the write_hint field from struct request nvme: check ns's volatile write cache not present nvme: add rotational support nvme: use command set independent id ns if available ...
2024-11-12block: remove the write_hint field from struct requestChristoph Hellwig
The write_hint is only used for read/write requests, which must have a bio attached to them. Just use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-11block: pre-calculate max_zone_append_sectorsChristoph Hellwig
max_zone_append_sectors differs from all other queue limits in that the final value used is not stored in the queue_limits but needs to be obtained using queue_limits_max_zone_append_sectors helper. This not only adds (tiny) extra overhead to the I/O path, but also can be easily forgotten in file system code. Add a new max_hw_zone_append_sectors value to queue_limits which is set by the driver, and calculate max_zone_append_sectors from that and the other inputs in blk_validate_zoned_limits, similar to how max_sectors is calculated to fix this. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241104073955.112324-3-hch@lst.de Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20241108154657.845768-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-08Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, the drivers one in ufs simply delays running a work queue and the generic one in zoned storage switches to a more correct API that tries the standard buddy allocator first (for small allocations); this fixes an allocation problem with small allocations seen under memory pressure" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Start the RTC update work later scsi: sd_zbc: Use kvzalloc() to allocate REPORT ZONES buffer
2024-11-07Revert "block: pre-calculate max_zone_append_sectors"Jens Axboe
This causes issue on, at least, nvme-mpath where my boot fails with: WARNING: CPU: 354 PID: 2729 at block/blk-settings.c:75 blk_validate_limits+0x356/0x380 Modules linked in: tg3(+) nvme usbcore scsi_mod ptp i2c_piix4 libphy nvme_core crc32c_intel scsi_common usb_common pps_core i2c_smbus CPU: 354 UID: 0 PID: 2729 Comm: kworker/u2061:1 Not tainted 6.12.0-rc6+ #181 Hardware name: Dell Inc. PowerEdge R7625/06444F, BIOS 1.8.3 04/02/2024 Workqueue: async async_run_entry_fn RIP: 0010:blk_validate_limits+0x356/0x380 Code: f6 47 01 04 75 28 83 bf 94 00 00 00 00 75 39 83 bf 98 00 00 00 00 75 34 83 7f 68 00 75 32 31 c0 83 7f 5c 00 0f 84 9b fd ff ff <0f> 0b eb 13 0f 0b eb 0f 48 c7 c0 74 12 58 92 48 89 c7 e8 13 76 46 RSP: 0018:ffffa8a1dfb93b30 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff9232829c8388 RCX: 0000000000000088 RDX: 0000000000000080 RSI: 0000000000000200 RDI: ffffa8a1dfb93c38 RBP: 000000000000000c R08: 00000000ffffffff R09: 000000000000ffff R10: 0000000000000000 R11: 0000000000000000 R12: ffff9232829b9000 R13: ffff9232829b9010 R14: ffffa8a1dfb93c38 R15: ffffa8a1dfb93c38 FS: 0000000000000000(0000) GS:ffff923867c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055c1b92480a8 CR3: 0000002484ff0002 CR4: 0000000000370ef0 Call Trace: <TASK> ? __warn+0xca/0x1a0 ? blk_validate_limits+0x356/0x380 ? report_bug+0x11a/0x1a0 ? handle_bug+0x5e/0x90 ? exc_invalid_op+0x16/0x40 ? asm_exc_invalid_op+0x16/0x20 ? blk_validate_limits+0x356/0x380 blk_alloc_queue+0x7a/0x250 __blk_alloc_disk+0x39/0x80 nvme_mpath_alloc_disk+0x13d/0x1b0 [nvme_core] nvme_scan_ns+0xcc7/0x1010 [nvme_core] async_run_entry_fn+0x27/0x120 process_scheduled_works+0x1a0/0x360 worker_thread+0x2bc/0x350 ? pr_cont_work+0x1b0/0x1b0 kthread+0x111/0x120 ? kthread_unuse_mm+0x90/0x90 ret_from_fork+0x30/0x40 ? kthread_unuse_mm+0x90/0x90 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- presumably due to max_zone_append_sectors not being cleared to zero, resulting in blk_validate_zoned_limits() complaining and failing. This reverts commit 2a8f6153e1c2db06a537a5c9d61102eb591776f1. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-06Merge patch series "scsi: st: Device reset patches"Martin K. Petersen
Kai Mäkisara <Kai.Makisara@kolumbus.fi> says: These three patches were developed in response to Bugzilla report https://bugzilla.kernel.org/show_bug.cgi?id=219419 After device reset, the tape driver allows only operations that don't write or read anything from tape. The reason for this is that many (most ?) drives rewind the tape after reset and the subsequent reads or writes would not be at the tape location the user expects. Reading and writing is allowed again when the user does something to position the tape (e.g., rewind). The Bugzilla report considers the case when a user, after reset, tries to read the drive status with MTIOCGET ioctl, but it fails. MTIOCGET does not return much useful data after reset, but it can be allowed. MTLOAD positions the tape and it should be allowed. The second patch adds these to the set of allowed operations after device reset. The first patch fixes a bug seen when developing the second patch. V2: The third patch is added to fix a bug that resulted in not blocking writes if reset occurs while the device file is not open. Link: https://lore.kernel.org/r/20241106095723.63254-1-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: st: New session only when Unit Attention for new tapeKai Mäkisara
Currently the code starts new tape session when any Unit Attention (UA) is seen when opening the device. This leads to incorrectly clearing pos_unknown when the UA is for reset. Set new session only when the UA is for a new tape. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20241106095723.63254-4-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: st: Add MTIOCGET and MTLOAD to ioctls allowed after device resetKai Mäkisara
Most drives rewind the tape when the device is reset. Reading and writing are not allowed until something is done to make the tape position match the user's expectation (e.g., rewind the tape). Add MTIOCGET and MTLOAD to operations allowed after reset. MTIOCGET is modified to not touch the tape if pos_unknown is non-zero. The tape location is known after MTLOAD. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14 Link: https://lore.kernel.org/r/20241106095723.63254-3-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: st: Don't modify unknown block number in MTIOCGETKai Mäkisara
Struct mtget field mt_blkno -1 means it is unknown. Don't add anything to it. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14 Link: https://lore.kernel.org/r/20241106095723.63254-2-Kai.Makisara@kolumbus.fi Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: sun3: Mark driver struct with __refdata to prevent section mismatchGeert Uytterhoeven
As described in the added code comment, a reference to .exit.text is ok for drivers registered via module_platform_driver_probe(). Make this explicit to prevent the following section mismatch warnings WARNING: modpost: drivers/scsi/sun3_scsi: section mismatch in reference: sun3_scsi_driver+0x4 (section: .data) -> sun3_scsi_remove (section: .exit.text) WARNING: modpost: drivers/scsi/sun3_scsi_vme: section mismatch in reference: sun3_scsi_driver+0x4 (section: .data) -> sun3_scsi_remove (section: .exit.text) that trigger on a Sun 3 allmodconfig build. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/b2c56fa3556505befe9b4cb9a830d9e2a962e72c.1730831769.git.geert@linux-m68k.org Acked-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: sg: Enable runtime power managementBart Van Assche
In 2010, runtime power management support was implemented in the SCSI core. The description of patch "[SCSI] implement runtime Power Management" mentions that the sg driver is skipped but not why. This patch enables runtime power management even if an instance of the sg driver is held open. Enabling runtime PM for the sg driver is safe because all interactions of the sg driver with the SCSI device pass through the block layer (blk_execute_rq_nowait()) and the block layer already supports runtime PM. Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Douglas Gilbert <dgilbert@interlog.com> Fixes: bc4f24014de5 ("[SCSI] implement runtime Power Management") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241030220310.1373569-1-bvanassche@acm.org Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: qedi: Fix a possible memory leak in qedi_alloc_and_init_sb()Zhen Lei
Hook "qedi_ops->common->sb_init = qed_sb_init" does not release the DMA memory sb_virt when it fails. Add dma_free_coherent() to free it. This is the same way as qedr_alloc_mem_sb() and qede_alloc_mem_sb(). Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20241026125711.484-3-thunder.leizhen@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: qedf: Fix a possible memory leak in qedf_alloc_and_init_sb()Zhen Lei
Hook "qed_ops->common->sb_init = qed_sb_init" does not release the DMA memory sb_virt when it fails. Add dma_free_coherent() to free it. This is the same way as qedr_alloc_mem_sb() and qede_alloc_mem_sb(). Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20241026125711.484-2-thunder.leizhen@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: bfa: Fix use-after-free in bfad_im_module_exit()Ye Bin
BUG: KASAN: slab-use-after-free in __lock_acquire+0x2aca/0x3a20 Read of size 8 at addr ffff8881082d80c8 by task modprobe/25303 Call Trace: <TASK> dump_stack_lvl+0x95/0xe0 print_report+0xcb/0x620 kasan_report+0xbd/0xf0 __lock_acquire+0x2aca/0x3a20 lock_acquire+0x19b/0x520 _raw_spin_lock+0x2b/0x40 attribute_container_unregister+0x30/0x160 fc_release_transport+0x19/0x90 [scsi_transport_fc] bfad_im_module_exit+0x23/0x60 [bfa] bfad_init+0xdb/0xff0 [bfa] do_one_initcall+0xdc/0x550 do_init_module+0x22d/0x6b0 load_module+0x4e96/0x5ff0 init_module_from_file+0xcd/0x130 idempotent_init_module+0x330/0x620 __x64_sys_finit_module+0xb3/0x110 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Allocated by task 25303: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 fc_attach_transport+0x4f/0x4740 [scsi_transport_fc] bfad_im_module_init+0x17/0x80 [bfa] bfad_init+0x23/0xff0 [bfa] do_one_initcall+0xdc/0x550 do_init_module+0x22d/0x6b0 load_module+0x4e96/0x5ff0 init_module_from_file+0xcd/0x130 idempotent_init_module+0x330/0x620 __x64_sys_finit_module+0xb3/0x110 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 25303: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x38/0x50 kfree+0x212/0x480 bfad_im_module_init+0x7e/0x80 [bfa] bfad_init+0x23/0xff0 [bfa] do_one_initcall+0xdc/0x550 do_init_module+0x22d/0x6b0 load_module+0x4e96/0x5ff0 init_module_from_file+0xcd/0x130 idempotent_init_module+0x330/0x620 __x64_sys_finit_module+0xb3/0x110 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Above issue happens as follows: bfad_init error = bfad_im_module_init() fc_release_transport(bfad_im_scsi_transport_template); if (error) goto ext; ext: bfad_im_module_exit(); fc_release_transport(bfad_im_scsi_transport_template); --> Trigger double release Don't call bfad_im_module_exit() if bfad_im_module_init() failed. Fixes: 7725ccfda597 ("[SCSI] bfa: Brocade BFA FC SCSI driver") Signed-off-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20241023011809.63466-1-yebin@huaweicloud.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: esas2r: Remove unused esas2r_build_cli_req()Dr. David Alan Gilbert
esas2r_build_cli_req() has been unused since it was added in 2013 by commit 26780d9e12ed ("[SCSI] esas2r: ATTO Technology ExpressSAS 6G SAS/SATA RAID Adapter Driver") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241102220336.80541-1-linux@treblig.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: Switch back to struct platform_driver::remove()Uwe Kleine-König
After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/scsi to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. On the way do a few whitespace changes to make indention consistent. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20241028080754.429191-2-u.kleine-koenig@baylibre.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: pm8001: Increase request sg length to support 4MiB requestsIgor Pylypiv
Increasing the per-request size maximum to 4MiB (8192 sectors x 512 bytes) runs into the per-device DMA scatter gather list limit (max_segments) for users of the io vector system calls (e.g. readv and writev). Increase the max scatter gather list length to 1024 to enable kernel to send 4MiB (1024 * 4KiB page size) requests. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Link: https://lore.kernel.org/r/20241025185009.3278297-1-ipylypiv@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: pm8001: Initialize devices in pm8001_alloc_dev()Igor Pylypiv
Devices can be allocated and freed at runtime. For example during a soft reset all devices are freed and reallocated upon discovery. Currently the driver fully initializes devices once in pm8001_alloc(). Allows initialization steps to happen during runtime, avoiding any leftover states from the device being freed. Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Terrence Adams <tadamsjr@google.com> Link: https://lore.kernel.org/r/20241021201828.1378858-1-tadamsjr@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06scsi: pm8001: Use module param to set pcs event log severitySalomon Dushimirimana
The pm8001 driver sets pcs event log threshold very high which causes most of the FW log messages to not be captured. Add a module parameter to configure pcs event log severity with 3 (medium severity) as the default. Co-developed-by: Bhavesh Jashnani <bjashnani@google.com> Signed-off-by: Bhavesh Jashnani <bjashnani@google.com> Signed-off-by: Salomon Dushimirimana <salomondush@google.com> Link: https://lore.kernel.org/r/20241016220944.370539-1-salomondush@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06Merge branch '6.12/scsi-fixes' into 6.13/scsi-stagingMartin K. Petersen
Pull in 6.12 fixes branch to resolve a merge conflict in ufs-mcq.c. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-06Merge patch series "Update lpfc to revision 14.4.0.6"Martin K. Petersen
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.4.0.6 This patch set contains bug fixes related to congestion handling, accounting for internal remoteport objects, resource release during HBA unload and reset, and clean up regarding the abuse of a global spinlock. The patches were cut against Martin's 6.13/scsi-queue tree. Link: https://lore.kernel.org/r/20241031223219.152342-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-05sysfs: treewide: constify attribute callback of bin_is_visible()Thomas Weißschuh
The is_bin_visible() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-5-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-04block: pre-calculate max_zone_append_sectorsChristoph Hellwig
max_zone_append_sectors differs from all other queue limits in that the final value used is not stored in the queue_limits but needs to be obtained using queue_limits_max_zone_append_sectors helper. This not only adds (tiny) extra overhead to the I/O path, but also can be easily forgotten in file system code. Add a new max_hw_zone_append_sectors value to queue_limits which is set by the driver, and calculate max_zone_append_sectors from that and the other inputs in blk_validate_zoned_limits, similar to how max_sectors is calculated to fix this. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241104073955.112324-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-02scsi: lpfc: Copyright updates for 14.4.0.6 patchesJustin Tee
Update copyrights to 2024 for files modified in the 14.4.0.6 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-12-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Update lpfc version to 14.4.0.6Justin Tee
Update lpfc version to 14.4.0.6 Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-11-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Change lpfc_nodelist nlp_flag member into a bitmaskJustin Tee
In attempt to reduce the amount of unnecessary ndlp->lock acquisitions in the lpfc driver, change nlpa_flag into an unsigned long bitmask and use clear_bit/test_bit bitwise atomic APIs instead of reliance on ndlp->lock for synchronization. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-10-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structureJustin Tee
An RPI is tightly bound to an NDLP structure and is freed only upon release of an NDLP object. As such, there should be no logic that frees an RPI outside of the lpfc_nlp_release() routine. In order to reinforce the original design usage of RPIs, remove the NLP_RELEASE_RPI flag and related logic. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callbackJustin Tee
Current dev_loss_tmo handling checks whether there has been a previous call to unregister with SCSI transport. If so, the NDLP kref count is decremented a second time in dev_loss_tmo as the final kref release. However, this can sometimes result in a reference count underflow if there is also a race to unregister with NVMe transport as well. Add a check for NVMe transport registration before decrementing the final kref. If NVMe transport is still registered, then the NVMe transport unregistration is designated as the final kref decrement. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Add cleanup of nvmels_wq after HBA resetJustin Tee
An HBA reset request that is executed when there are outstanding NVME-LS commands can cause delays for the reset process to complete. Fix by introducing a new routine called lpfc_nvmels_flush_cmd() that walks the phba->nvmels_wq list and cancels outstanding submitted NVME-LS requests speeding up the HBA reset process. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Check SLI_ACTIVE flag in FDMI cmpl before submitting follow up FDMIJustin Tee
The lpfc_cmpl_ct_disc_fdmi() routine has incorrect logic that treats an FDMI completion with error LOCAL_REJECT/SLI_ABORTED as a success status. Under the erroneous assumption of successful completion, the routine proceeds to issue follow up FDMI commands, which may never complete if the HBA is in an errata state as indicated by the errored completion status. Fix by freeing FDMI cmd resources and early return when the LPFC_SLI_ACTIVE flag is not set and a LOCAL_REJECT/SLI_ABORTED or SLI_DOWN status is received. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-11-02scsi: lpfc: Update lpfc_els_flush_cmd() to check for SLI_ACTIVE before BSG flagJustin Tee
During firmware errata events, the lpfc_els_flush_cmd() routine is responsible for the clean up of outstanding ELS and CT command submissions. Thus, move the LPFC_SLI_ACTIVE flag check into the txcmplq list walk and mark a piocb object for canceling if determined the HBA is not active. Clean up should be regardless of application or driver layer origin. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20241031223219.152342-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>