Age | Commit message (Collapse) | Author |
|
Now that common helpers exist, add the ability to receive NVME LS requests
to the driver. New requests will be delivered to the transport by
nvme_fc_rcv_ls_req().
In order to complete the LS, add support for Send LS Response and send
LS response completion handling to the driver.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, the ability to send an NVME LS response is limited to the nvmet
(controller/target) side of the driver. In preparation of both the nvme
and nvmet sides supporting Send LS Response, rework the existing send
ls_rsp and ls_rsp completion routines such that there is common code that
can be used by both sides.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Send LS Abort support is needed when Send LS Request is supported.
Currently, the ability to abort an NVME LS request is limited to the nvme
(host) side of the driver. In preparation of both the nvme and nvmet sides
supporting Send LS Abort, rework the existing ls_req abort routines such
that there is common code that can be used by both sides.
While refactoring it was seen the logic in the abort routine was incorrect.
It attempted to abort all NVME LS's on the indicated port. As such, the
routine was reworked to abort only the NVME LS request that was specified.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, the ability to send an NVME LS request is limited to the nvme
(host) side of the driver. In preparation of both the nvme and nvmet sides
support Send LS Request, rework the existing send ls_req and ls_req
completion routines such that there is common code that can be used by
both sides.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation for supporting both intiator mode and target mode
receiving NVME LS's, commonize the existing NVME LS request receive
handling found in the base driver and in the nvmet side.
Using the original lpfc_nvmet_unsol_ls_event() and
lpfc_nvme_unsol_ls_buffer() routines as a templates, commonize the
reception of an NVME LS request. The common routine will validate the LS
request, that it was received from a logged-in node, and allocate a
lpfc_async_xchg_ctx that is used to manage the LS request. The role of
the port is then inspected to determine which handler is to receive the
LS - nvme or nvmet. As such, the nvmet handler is tied back in. A handler
is created in nvme and is stubbed out.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The last step of commonization is to remove the 'T' suffix from
state and flag field definitions. This is minor, but removes the
mental association that it solely applies to nvmet use.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
To support FC-NVME-2 support (actually FC-NVME (rev 1) with Ammendment 1),
both the nvme (host) and nvmet (controller/target) sides will need to be
able to receive LS requests. Currently, this support is in the nvmet side
only. To prepare for both sides supporting LS receive, rename
lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A lot of files in lpfc include nvme headers, building up relationships that
require a file to change for its headers when there is no other change
necessary. It would be better to localize the nvme headers.
There is also no need for separate nvme (initiator) and nvmet (tgt)
header files.
Refactor the inclusion of nvme headers so that all nvme items are
included by lpfc_nvme.h
Merge lpfc_nvmet.h into lpfc_nvme.h so that there is a single header used
by both the nvme and nvmet sides. This prepares for structure sharing
between the two roles. Prep to add shared function prototypes for upcoming
shared routines.
Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The current LLDD api has:
nvme-fc: contains api for transport to do LS requests (and aborts of
them). However, there is no interface for reception of LS's and sending
responses for them.
nvmet-fc: contains api for transport to do reception of LS's and sending
of responses for them. However, there is no interface for doing LS
requests.
Revise the api's so that both nvme-fc and nvmet-fc can send LS's, as well
as receiving LS's and sending their responses.
Change name of the rcv_ls_req struct to better reflect generic use as
a context to used to send an ls rsp. Specifically:
nvmefc_tgt_ls_req -> nvmefc_ls_rsp
nvmefc_tgt_ls_req.nvmet_fc_private -> nvmefc_ls_rsp.nvme_fc_private
Change nvmet_fc_rcv_ls_req() calling sequence to provide handle that
can be used by transport in later LS request sequences for an association.
nvme-fc nvmet_fc nvme_fcloop:
Revise to adapt to changed names in api header.
Change calling sequence to nvmet_fc_rcv_ls_req() for hosthandle.
Add stubs for new interfaces:
host/fc.c: nvme_fc_rcv_ls_req()
target/fc.c: nvmet_fc_invalidate_host()
lpfc:
Revise to adapt code to changed names in api header.
Change calling sequence to nvmet_fc_rcv_ls_req() for hosthandle.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Update lpfc version to 12.8.0.1
Link: https://lore.kernel.org/r/20200501214310.91713-10-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The MDS diagnostic enablement bit for the adapter interface is incorrect in
the driver header.
Correct the bit position for the SET_FEATURE MDS bit.
Link: https://lore.kernel.org/r/20200501214310.91713-9-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Running make C=1 M=drivers/scsi/lpfc triggers sparse warnings
Correct the code generating the following errors:
- Incompatible address space assignment without proper conversion.
- Deference of usespace and per-cpu pointers.
Link: https://lore.kernel.org/r/20200501214310.91713-8-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In an audit of lockdep calls in the driver, there are multiple lockdep
checks in successive calling layers. E.g. a routine checks, and then calls
a lower routine that also checks, and so on. Calling sequences result in
many redundant checks.
Refine the code to remove lower-level lockdep checks. Update comments on
the lock, correcting a few places where lock object in comment was
incorrect.
Link: https://lore.kernel.org/r/20200501214310.91713-7-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
By default, the driver attempts to allocate a hdwq per logical cpu in order
to provide good cpu affinity. Some systems have extremely high cpu counts
and this can significantly raise memory consumption.
In testing on x86 platforms (non-AMD) it is found that sharing of a hdwq by
a physical cpu and its HT cpu can occur with little performance
degredation. By sharing, the hdwq count can be halved, significantly
reducing the memory overhead.
Change the default behavior of the driver on non-AMD x86 platforms to
share a hdwq by the cpu and its HT cpu.
Link: https://lore.kernel.org/r/20200501214310.91713-6-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Implementation of a previous patch added a condition to an if check that
always end up with the if test being true. Execution of the else clause was
inadvertently negated. The additional condition check was incorrect and
unnecessary after the other modifications had been done in that patch.
Remove the check from the if series.
Link: https://lore.kernel.org/r/20200501214310.91713-5-jsmart2021@gmail.com
Fixes: b95b21193c85 ("scsi: lpfc: Fix loss of remote port after devloss due to lack of RPIs")
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The lldd rebinds the ndlp with rport during a nvme rport registration (via
nvme_fc_register_remoteport). If rport & ndlp pointers are same as the
previous one, the lldd will re-use the ndlp and rport association without
re-initialization. This assumption is incorrect. The lldd should be
ignorant of whether the returned rport pointer is new or not, and should
always assume it is new.
Remove the re-binding code, always assumes that rport pointer received from
transport is a new pointer.
Link: https://lore.kernel.org/r/20200501214310.91713-4-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A previous change introduced the atomic use of queue_claimed flag for eq's
and cq's. The code works fine, but the clearing of the queue_claimed flag
is not atomic.
Change queue_claimed = 0 into xchg(&queue_claimed, 0) to be consistent for
change under atomicity.
Link: https://lore.kernel.org/r/20200501214310.91713-3-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During code reviews several instances of duplicate module unloading checks
were found.
Remove the duplicate checks.
Link: https://lore.kernel.org/r/20200421203354.49420-1-jsmart2021@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull more SCSI updates from James Bottomley:
"This is a batch of changes that didn't make it in the initial pull
request because the lpfc series had to be rebased to redo an incorrect
split.
It's basically driver updates to lpfc, target, bnx2fc and ufs with the
rest being minor updates except the sr_block_release one which fixes a
use after free introduced by the removal of the global mutex in the
first patch set"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (35 commits)
scsi: core: Add DID_ALLOC_FAILURE and DID_MEDIUM_ERROR to hostbyte_table
scsi: ufs: Use ufshcd_config_pwr_mode() when scaling gear
scsi: bnx2fc: fix boolreturn.cocci warnings
scsi: zfcp: use fallthrough;
scsi: aacraid: do not overwrite retval in aac_reset_adapter()
scsi: sr: Fix sr_block_release()
scsi: aic7xxx: Remove more FreeBSD-specific code
scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
scsi: ufs: set device as active power mode after resetting device
scsi: iscsi: Report unbind session event when the target has been removed
scsi: lpfc: Change default SCSI LUN QD to 64
scsi: libfc: rport state move to PLOGI if all PRLI retry exhausted
scsi: libfc: If PRLI rejected, move rport to PLOGI state
scsi: bnx2fc: Update the driver version to 2.12.13
scsi: bnx2fc: Fix SCSI command completion after cleanup is posted
scsi: bnx2fc: Process the RQE with CQE in interrupt context
scsi: target: use the stack for XCOPY passthrough cmds
scsi: target: increase XCOPY I/O size
scsi: target: avoid per-loop XCOPY buffer allocations
scsi: target: drop xcopy DISK BLOCK LENGTH debug
...
|
|
Pull block fixes from Jens Axboe:
"Here's a set of fixes that should go into this merge window. This
contains:
- NVMe pull request from Christoph with various fixes
- Better discard support for loop (Evan)
- Only call ->commit_rqs() if we have queued IO (Keith)
- blkcg offlining fixes (Tejun)
- fix (and fix the fix) for busy partitions"
* tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-block:
block: fix busy device checking in blk_drop_partitions again
block: fix busy device checking in blk_drop_partitions
nvmet-rdma: fix double free of rdma queue
blk-mq: don't commit_rqs() if none were queued
nvme-fc: Revert "add module to ops template to allow module references"
nvme: fix deadlock caused by ANA update wrong locking
nvmet-rdma: fix bonding failover possible NULL deref
loop: Better discard support for block devices
loop: Report EOPNOTSUPP properly
nvmet: fix NULL dereference when removing a referral
nvme: inherit stable pages constraint in the mpath stack device
blkcg: don't offline parent blkcg first
blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it
nvme-tcp: fix possible crash in recv error flow
nvme-tcp: don't poll a non-live queue
nvme-tcp: fix possible crash in write_zeroes processing
nvmet-fc: fix typo in comment
nvme-rdma: Replace comma with a semicolon
nvme-fcloop: fix deallocation of working context
nvme: fix compat address handling in several ioctls
|
|
The original patch was to resolve the lldd being able to be unloaded
while being used to talk to the boot device of the system. However, the
end result of the original patch is that any driver unload while a nvme
controller is live via the lldd is now being prohibited. Given the module
reference, the module teardown routine can't be called, thus there's no
way, other than manual actions to terminate the controllers.
Fixes: 863fbae929c7 ("nvme_fc: add module to ops template to allow module references")
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Revert sysfs "rescan" renames that broke apps (Kelsey Skunberg)
- Add more 32 GT/s link speed decoding and improve the implementation
(Yicong Yang)
Resource management:
- Add support for sizing programmable host bridge apertures and fix a
related alpha Nautilus regression (Ivan Kokshaysky)
Interrupts:
- Add boot interrupt quirk mechanism for Xeon chipsets and document
boot interrupts (Sean V Kelley)
PCIe native device hotplug:
- When possible, disable in-band presence detect and use PDS
(Alexandru Gagniuc)
- Add DMI table for devices that don't use in-band presence detection
but don't advertise that correctly (Stuart Hayes)
- Fix hang when powering slots up/down via sysfs (Lukas Wunner)
- Fix an MSI interrupt race (Stuart Hayes)
Virtualization:
- Add ACS quirks for Zhaoxin devices (Raymond Pang)
Error handling:
- Add Error Disconnect Recover (EDR) support so firmware can report
devices disconnected via DPC and we can try to recover (Kuppuswamy
Sathyanarayanan)
Peer-to-peer DMA:
- Add Intel Sky Lake-E Root Ports B, C, D to the whitelist (Andrew
Maier)
ASPM:
- Reduce severity of common clock config message (Chris Packham)
- Clear the correct bits when enabling L1 substates, so we don't go
to the wrong state (Yicong Yang)
Endpoint framework:
- Replace EPF linkup ops with notifier call chain and improve locking
(Kishon Vijay Abraham I)
- Fix concurrent memory allocation in OB address region (Kishon Vijay
Abraham I)
- Move PF function number assignment to EPC core to support multiple
function creation methods (Kishon Vijay Abraham I)
- Fix issue with clearing configfs "start" entry (Kunihiko Hayashi)
- Fix issue with endpoint MSI-X ignoring BAR Indicator and Table
Offset (Kishon Vijay Abraham I)
- Add support for testing DMA transfers (Kishon Vijay Abraham I)
- Add support for testing > 10 endpoint devices (Kishon Vijay Abraham I)
- Add support for tests to clear IRQ (Kishon Vijay Abraham I)
- Add common DT schema for endpoint controllers (Kishon Vijay Abraham I)
Amlogic Meson PCIe controller driver:
- Add DT bindings for AXG PCIe PHY, shared MIPI/PCIe analog PHY (Remi
Pommarel)
- Add Amlogic AXG PCIe PHY, AXG MIPI/PCIe analog PHY drivers (Remi
Pommarel)
Cadence PCIe controller driver:
- Add Root Complex/Endpoint DT schema for Cadence PCIe (Kishon Vijay
Abraham I)
Intel VMD host bridge driver:
- Add two VMD Device IDs that require bus restriction mode (Sushma
Kalakota)
Mobiveil PCIe controller driver:
- Refactor and modularize mobiveil driver (Hou Zhiqiang)
- Add support for Mobiveil GPEX Gen4 host (Hou Zhiqiang)
Microsoft Hyper-V host bridge driver:
- Add support for Hyper-V PCI protocol version 1.3 and
PCI_BUS_RELATIONS2 (Long Li)
- Refactor to prepare for virtual PCI on non-x86 architectures (Boqun
Feng)
- Fix memory leak in hv_pci_probe()'s error path (Dexuan Cui)
NVIDIA Tegra PCIe controller driver:
- Use pci_parse_request_of_pci_ranges() (Rob Herring)
- Add support for endpoint mode and related DT updates (Vidya Sagar)
- Reduce -EPROBE_DEFER error message log level (Thierry Reding)
Qualcomm PCIe controller driver:
- Restrict class fixup to specific Qualcomm devices (Bjorn Andersson)
Synopsys DesignWare PCIe controller driver:
- Refactor core initialization code for endpoint mode (Vidya Sagar)
- Fix endpoint MSI-X to use correct table address (Kishon Vijay
Abraham I)
TI DRA7xx PCIe controller driver:
- Fix MSI IRQ handling (Vignesh Raghavendra)
TI Keystone PCIe controller driver:
- Allow AM654 endpoint to raise MSI-X interrupt (Kishon Vijay Abraham I)
Miscellaneous:
- Quirk ASMedia XHCI USB to avoid "PME# from D0" defect (Kai-Heng
Feng)
- Use ioremap(), not phys_to_virt(), for platform ROM to fix video
ROM mapping with CONFIG_HIGHMEM (Mikel Rychliski)"
* tag 'pci-v5.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (96 commits)
misc: pci_endpoint_test: remove duplicate macro PCI_ENDPOINT_TEST_STATUS
PCI: tegra: Print -EPROBE_DEFER error message at debug level
misc: pci_endpoint_test: Use full pci-endpoint-test name in request_irq()
misc: pci_endpoint_test: Fix to support > 10 pci-endpoint-test devices
tools: PCI: Add 'e' to clear IRQ
misc: pci_endpoint_test: Add ioctl to clear IRQ
misc: pci_endpoint_test: Avoid using module parameter to determine irqtype
PCI: keystone: Allow AM654 PCIe Endpoint to raise MSI-X interrupt
PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address
PCI: endpoint: Fix ->set_msix() to take BIR and offset as arguments
misc: pci_endpoint_test: Add support to get DMA option from userspace
tools: PCI: Add 'd' command line option to support DMA
misc: pci_endpoint_test: Use streaming DMA APIs for buffer allocation
PCI: endpoint: functions/pci-epf-test: Print throughput information
PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data
PCI: pciehp: Fix MSI interrupt race
PCI: pciehp: Fix indefinite wait on sysfs requests
PCI: endpoint: Fix clearing start entry in configfs
PCI: tegra: Add support for PCIe endpoint mode in Tegra194
PCI: sysfs: Revert "rescan" file renames
...
|
|
The default lun queue depth by the driver has been 30 for many years.
However, this value, when used with more recent hardware, has actually
throttled some tests that concentrate io on a lun.
Increase the default lun queue depth to 64.
Queue full handling, reported by the target, remains in effect and
unchanged.
Link: https://lore.kernel.org/r/20200323161935.40341-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update lpfc version to 12.8.0.0
Link: https://lore.kernel.org/r/20200322181304.37655-13-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During code review, identified dss feature that was a prototype only and
was never productized in SLI3. They shouldn't be there and prevents reuse
of the command areas.
Remove any code in the driver to deal with dss, including code to deal with
fips, which is associated with the dss feature.
Link: https://lore.kernel.org/r/20200322181304.37655-12-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently driver ktime stats, measuring code paths, is NVME-specific.
Convert the stats routines such that the code paths are generic, providing
status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and
cmpl routines.
Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The cpu io statistics were capped by a hard define limit of 128. This
effectively was a max number of CPUs, not an actual CPU count, nor actual
CPU numbers which can be even larger than both of those values. This made
stats off/misleading and on large CPU count systems, wrong.
Fix the stats so that all CPUs can have a stats struct. Fix the looping
such that it loops by hdwq, finds CPUs that used the hdwq, and sum the
stats, then display.
Link: https://lore.kernel.org/r/20200322181304.37655-9-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The AER interfaces to clear error status registers were a confusing mess:
- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.
- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.
- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.
Rename them to make them consistent:
From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()
Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.
[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Kernel is crashing with the following stacktrace:
BUG: unable to handle kernel NULL pointer dereference at
00000000000005bc
IP: lpfc_nvme_register_port+0x1a8/0x3a0 [lpfc]
...
Call Trace:
lpfc_nlp_state_cleanup+0x2b2/0x500 [lpfc]
lpfc_nlp_set_state+0xd7/0x1a0 [lpfc]
lpfc_cmpl_prli_prli_issue+0x1f7/0x450 [lpfc]
lpfc_disc_state_machine+0x7a/0x1e0 [lpfc]
lpfc_cmpl_els_prli+0x16f/0x1e0 [lpfc]
lpfc_sli_sp_handle_rspiocb+0x5b2/0x690 [lpfc]
lpfc_sli_handle_slow_ring_event_s4+0x182/0x230 [lpfc]
lpfc_do_work+0x87f/0x1570 [lpfc]
kthread+0x10d/0x130
ret_from_fork+0x35/0x40
During target side fault injections, it is possible to hit the
NLP_WAIT_FOR_UNREG case in lpfc_nvme_remoteport_delete. A prior commit
fixed a rebind and delete race condition, but called lpfc_nlp_put
unconditionally. This triggered a deletion and the crash.
Fix by movng nlp_put to inside the NLP_WAIT_FOR_UNREG case, where the nlp
will be being unregistered/removed. Leave the reference if the flag isn't
set.
Link: https://lore.kernel.org/r/20200322181304.37655-8-jsmart2021@gmail.com
Fixes: b15bd3e6212e ("scsi: lpfc: Fix nvme remoteport registration race conditions")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The lpfc_sli4_wq_release() routine iterates for each interim value when
updating the wq consuemr index. This wastes cycles and possibly confuses
things as thevalue itterates (and the modulo logic is being applied).
There's no reason for this. Just set it to the value from the hw.
Link: https://lore.kernel.org/r/20200322181304.37655-7-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Injecting EEH on a 32GB card is causing kernel oops
The pci error handler is doing an IO flush and the offline code is also
doing an IO flush. When the 1st flush is complete the hdwq is destroyed
(freed), yet the second flush accesses the hdwq and crashes.
Added a check in lpfc_sli4_fush_io_rings to check both the HBA_IOQ_FLUSH
flag and the hdwq pointer to see if it is already set and not already
freed.
Link: https://lore.kernel.org/r/20200322181304.37655-6-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
SCSI layer sends driver IOs with more s/g segments than driver can handle.
This results in "Too many sg segments from dma_map_sg. Config 64, seg_cnt
219" error messages from the lpfc_scsi_prep_dma_buf_s3() routine.
The was due to use the driver using individual templates for pport and
vport, host reset enabled or not, nvme vs scsi, etc. In the end, there was
a combination for a vport that didn't match the pport.
Rather than enumerating more templates and more discretionary assignments,
revert to a base template that is copied to a template specific to the
pport/vport. Then, based on role, attributes and sli type, modify the
fields that are different for that port. Added a log message to
lpfc_create_port to validate values.
Link: https://lore.kernel.org/r/20200322181304.37655-5-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In lpfc_nvmet_prep_fcp_wqe() the line "rsp->sg_cnt = 0" is modifying the
transport's data structure. This may result in the transport believing the
s/g list was already freed, thus may not unmap/free it properly. Lpfc
driver should not modify the transport data structure.
The zeroing of the sg_cnt is to avoid use of the transport's sgl in a
subsequent loop where the driver builds the necessary requests for the
adapter firmware to complete the IO.
Change LLDD to use a local copy of the transport sg_cnt when building
requests to be passed to the adapter fw.
Link: https://lore.kernel.org/r/20200322181304.37655-4-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following lockdep error was reported when unloading the lpfc driver:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
...
Call Trace:
dump_stack+0x96/0xe0
register_lock_class+0x8b8/0x8c0
? lockdep_hardirqs_on+0x190/0x280
? is_dynamic_key+0x150/0x150
? wait_for_completion_interruptible+0x2a0/0x2a0
? wake_up_q+0xd0/0xd0
__lock_acquire+0xda/0x21a0
? register_lock_class+0x8c0/0x8c0
? synchronize_rcu_expedited+0x500/0x500
? __call_rcu+0x850/0x850
lock_acquire+0xf3/0x1f0
? del_timer_sync+0x5/0xb0
del_timer_sync+0x3c/0xb0
? del_timer_sync+0x5/0xb0
lpfc_pci_remove_one.cold.102+0x8b7/0x935 [lpfc]
...
Unloading the driver resulted in a call to del_timer_sync for the
cpuhp_poll_timer. However the call to setup the timer had never been made,
so the timer structures used by lockdep checking were not initialized.
Unconditionally call setup_timer for the cpuhp_poll_timer during driver
initialization. Calls to start the timer remain "as needed".
Link: https://lore.kernel.org/r/20200322181304.37655-3-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following kasan bug was called out:
BUG: KASAN: slab-out-of-bounds in lpfc_unreg_login+0x7c/0xc0 [lpfc]
Read of size 2 at addr ffff889fc7c50a22 by task lpfc_worker_3/6676
...
Call Trace:
dump_stack+0x96/0xe0
? lpfc_unreg_login+0x7c/0xc0 [lpfc]
print_address_description.constprop.6+0x1b/0x220
? lpfc_unreg_login+0x7c/0xc0 [lpfc]
? lpfc_unreg_login+0x7c/0xc0 [lpfc]
__kasan_report.cold.9+0x37/0x7c
? lpfc_unreg_login+0x7c/0xc0 [lpfc]
kasan_report+0xe/0x20
lpfc_unreg_login+0x7c/0xc0 [lpfc]
lpfc_sli_def_mbox_cmpl+0x334/0x430 [lpfc]
...
When processing the completion of a "Reg Rpi" login mailbox command in
lpfc_sli_def_mbox_cmpl, a call may be made to lpfc_unreg_login. The vpi is
extracted from the completing mailbox context and passed as an input for
the next. However, the vpi stored in the mailbox command context is an
absolute vpi, which for SLI4 represents both base + offset. When used with
a non-zero base component, (function id > 0) this results in an
out-of-range access beyond the allocated phba->vpi_ids array.
Fix by subtracting the function's base value to get an accurate vpi number.
Link: https://lore.kernel.org/r/20200322181304.37655-2-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is a spelling mistake in a lpfc_printf_vlog info message. Fix it.
[mkp: fix spelling mistake in commit description]
Link: https://lore.kernel.org/linux-scsi/20200221154841.77791-1-colin.king@canonical.com
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This patch modifies lpfc to register for Link Integrity events via the use
of an RDF ELS and to perform Link Integrity FPIN logging.
Specifically, the driver was modified to:
- Format and issue the RDF ELS immediately following SCR registration.
This registers the ability of the driver to receive FPIN ELS.
- Adds decoding of the FPIN els into the received descriptors, with
logging of the Link Integrity event information. After decoding, the ELS
is delivered to the scsi fc transport to be delivered to any user-space
applications.
- To aid in logging, simple helpers were added to create enum to name
string lookup functions that utilize the initialization helpers from the
fc_els.h header.
- Note: base header definitions for the ELS's don't populate the
descriptor payloads. As such, lpfc creates it's own version of the
structures, using the base definitions (mostly headers) and additionally
declaring the descriptors that will complete the population of the ELS.
Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update copyrights to 2020 for files modified in the 12.6.0.4 patch set.
Link: https://lore.kernel.org/r/20200128002312.16346-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update lpfc version to 12.6.0.4
Link: https://lore.kernel.org/r/20200128002312.16346-12-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The current code does some odd +1 over maximum xri count checks and
requires that the lun_queue_count can't be bigger than maximum xri count
divided by 8. These items are bogus.
Clean the code up to cap lun_queue_count to maximum xri count.
Link: https://lore.kernel.org/r/20200128002312.16346-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There was report of an odd "Fix me..." log message, which was tracked down
to the lpfc_els_rcv_rps() routine. This was in handling of a very old and
obsolete ELS - Read Port Status. The RPS ELS was defined in FC-LS-1, but
deprecated in FC-LS-2, and removed from all later FC-LS revisions. It was
replaced by the Read Diagnostic Parameters (RDP) ELS and the Link Error
Status Block descriptor.
There should be no support for the RSP ELS. Remove support from driver.
Link: https://lore.kernel.org/r/20200128002312.16346-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Coverity reported a memory corruption error for the fdmi attributes
routines:
CID 15768 [Memory Corruption] Out-of-bounds access on FDMI
Sloppy coding of the fmdi structures. In both the lpfc_fdmi_attr_def and
lpfc_fdmi_reg_port_list structures, a field was placed at the start of
payload that may have variable content. The field was given an arbitrary
type (uint32_t). The code then uses the field name to derive an address,
which it used in things such as memset and memcpy. The memset sizes or
memcpy lengths were larger than the arbitrary type, thus coverity reported
an error.
Fix by replacing the arbitrary fields with the real field structures
describing the payload.
Link: https://lore.kernel.org/r/20200128002312.16346-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The following error is see from the compiler:
drivers/scsi/lpfc/lpfc_init.c: In function
‘lpfc_cpuhp_get_eq’: drivers/scsi/lpfc/lpfc_init.c:12660:1:
error: the frame size of 1032 bytes is larger than 1024 bytes
[-Werror=frame-larger-than=]
The issue is due to allocating a cpumask on the stack.
Fix by converting to a dynamical allocation of the cpu mask.
Link: https://lore.kernel.org/r/20200128002312.16346-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When performing reset testing, the eq's list for related hwqs was getting
corrupted. In cases where there is not a 1:1 eq to hwq, the eq is
shared. The eq maintains a list of hwqs utilizing it in case of cpu
offlining and polling. During the reset, the hwqs are being torn down so
they can be recreated. The recreation was getting confused by seeing a
non-null eq assignment on the eq and the eq list became corrupt.
Correct by clearing the hdwq eq assignment when the hwq is cleaned up.
Link: https://lore.kernel.org/r/20200128002312.16346-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Adjust FC4 Types in FDMI settings
The driver sets FDMI information registring ELS as a FC4 type. ELS is a
fc3 type and should not be registered.
Fix by removing ELS type bit when we register for FDMI Port attributes.
Link: https://lore.kernel.org/r/20200128002312.16346-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When driver is set to enable bb credit recovery, the switch displayed the
setting as inactive. If the link bounces, it switches to Active.
During link up processing, the driver currently does a MBX_READ_SPARAM
followed by a MBX_CONFIG_LINK. These mbox commands are queued to be
executed, one at a time and the completion is processed by the worker
thread. Since the MBX_READ_SPARAM is done BEFORE the MBX_CONFIG_LINK, the
BB_SC_N bit is never set the the returned values. BB Credit recovery status
only gets set after the driver requests the feature in CONFIG_LINK, which
is done after the link up. Thus the ordering of READ_SPARAM needs to follow
the CONFIG_LINK.
Fix by reordering so that READ_SPARAM is done after CONFIG_LINK. Added a
HBA_DEFER_FLOGI flag so that any FLOGI handling waits until after the
READ_SPARAM is done so that the proper BB credit value is set in the FLOGI
payload.
Fixes: 6bfb16208298 ("scsi: lpfc: Fix configuration of BB credit recovery in service parameters")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-4-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If a call to lpfc_get_cmd_rsp_buf_per_hdwq returns NULL (memory allocation
failure), a previously allocated lpfc_io_buf resource is leaked.
Fix by releasing the lpfc_io_buf resource in the failure path.
Fixes: d79c9e9d4b3d ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The driver is occasionally seeing the following SLI Port error, requiring
reset and reinit:
Port Status Event: ... error 1=0x52004a01, error 2=0x218
The failure means an RQ timeout. That is, the adapter had received
asynchronous receive frames, ran out of buffer slots to place the frames,
and the driver did not replenish the buffer slots before a timeout
occurred. The driver should not be so slow in replenishing buffers that a
timeout can occur.
When the driver received all the frames of a sequence, it allocates an IOCB
to put the frames in. In a situation where there was no IOCB available for
the frame of a sequence, the RQ buffer corresponding to the first frame of
the sequence was not returned to the FW. Eventually, with enough traffic
encountering the situation, the timeout occurred.
Fix by releasing the buffer back to firmware whenever there is no IOCB for
the first frame.
[mkp: typo]
Link: https://lore.kernel.org/r/20200128002312.16346-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Fix sparse warning:
drivers/scsi/lpfc/lpfc_nportdisc.c:344:1: warning:
symbol 'lpfc_defer_acc_rsp' was not declared. Should it be static?
Link: https://lore.kernel.org/r/20200107014956.41748-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull compat_ioctl cleanup from Arnd. Here's his description:
This series concludes the work I did for linux-5.5 on the compat_ioctl()
cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
everything into drivers.
Overall this would be a reduction both in complexity and line count, but
as I'm also adding documentation the overall number of lines increases
in the end.
My plan was originally to keep the SCSI and block parts separate.
This did not work easily because of interdependencies: I cannot
do the final SCSI cleanup in a good way without first addressing the
CDROM ioctls, so this is one series that I hope could be merged through
either the block or the scsi git trees, or possibly both if you can
pull in the same branch.
The series comes in these steps:
1. clean up the sg v3 interface as suggested by Linus. I have
talked about this with Doug Gilbert as well, and he would
rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
compat read/write interface"
2. Actually moving handlers out of block/compat_ioctl.c and
block/scsi_ioctl.c into drivers, mixed in with cleanup
patches
3. Document how to do this right. I keep getting asked about this,
and it helps to point to some documentation file.
The branch is based on another one that fixes a couple of bugs found
during the creation of this series.
Changes since v3:
https://lore.kernel.org/lkml/20200102145552.1853992-1-arnd@arndb.de/
- Move sr_compat_ioctl fixup to correct patch (Ben Hutchings)
- Add Reviewed-by tags
Changes since v2:
https://lore.kernel.org/lkml/20191217221708.3730997-1-arnd@arndb.de/
- Rebase to v5.5-rc4, which contains the earlier bugfixes
- Fix sr_block_compat_ioctl() error handling bug found by
Ben Hutchings
- Fix idecd_locked_compat_ioctl() compat_ptr() bug
- Don't try to handle HDIO_DRIVE_TASKFILE in drivers/ide
- More documentation improvements
Changes since v1:
https://lore.kernel.org/lkml/20191211204306.1207817-1-arnd@arndb.de/
- move out the bugfixes into a branch for itself
- clean up scsi sg driver further as suggested by Christoph Hellwig
- avoid some ifdefs by moving compat_ptr() out of asm/compat.h
- split out the blkdev_compat_ptr_ioctl function; bug spotted by
Ben Hutchings
- Improve formatting of documentation
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|