summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-05-20nvme-multipath: introduce delayed removal of the multipath head nodeNilay Shroff
Currently, the multipath head node of an NVMe disk is removed immediately as soon as all paths of the disk are removed. However, this can cause issues in scenarios where: - The disk hot-removal followed by re-addition. - Transient PCIe link failures that trigger re-enumeration, temporarily removing and then restoring the disk. In these cases, removing the head node prematurely may lead to a head disk node name change upon re-addition, requiring applications to reopen their handles if they were performing I/O during the failure. To address this, introduce a delayed removal mechanism of head disk node. During transient failure, instead of immediate removal of head disk node, the system waits for a configurable timeout, allowing the disk to recover. During transient disk failure, if application sends any IO then we queue it instead of failing such IO immediately. If the disk comes back online within the timeout, the queued IOs are resubmitted to the disk ensuring seamless operation. In case disk couldn't recover from the failure then queued IOs are failed to its completion and application receives the error. So this way, if disk comes back online within the configured period, the head node remains unchanged, ensuring uninterrupted workloads without requiring applications to reopen device handles. A new sysfs attribute, named "delayed_removal_secs" is added under head disk blkdev for user who wish to configure time for the delayed removal of head disk node. The default value of this attribute is set to zero second ensuring no behavior change unless explicitly configured. Link: https://lore.kernel.org/linux-nvme/Y9oGTKCFlOscbPc2@infradead.org/ Link: https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@kbusch-mbp.dhcp.thefacebook.com/ Suggested-by: Keith Busch <kbusch@kernel.org> Suggested-by: Christoph Hellwig <hch@infradead.org> [nilay: reworked based on the original idea/POC from Christoph and Keith] Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-pci: derive and better document max segments limitsChristoph Hellwig
Redefine the max segments and max integrity limits based on the limiting factors. This keeps exactly the same values for 4k PAGE_SIZE systems, but increases the number of segments for larger page size as it properly derives the scatterlist allocation based limit for them instead of assuming a 4k PAGE_SIZE. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2025-05-20nvme-pci: use struct_size for allocation struct nvme_devChristoph Hellwig
This avoids open coding the variable size array arithmetics. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20nvme-pci: add a symolic name for the small pool sizeLeon Romanovsky
Open coding magic numbers in multiple places is never a good idea. Signed-off-by: Leon Romanovsky <leon@kernel.org> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
2025-05-20nvme-pci: use a better encoding for small prp pool allocationsChristoph Hellwig
Add a separate flag to encode that the transfer is using the small page sized pool, and use a normal 0..n count for the number of descriptors. Contains improvements and suggestions from Kanchan Joshi <joshi.k@samsung.com> and Leon Romanovsky <leon@kernel.org>. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20nvme-pci: rename the descriptor poolsChristoph Hellwig
They are used for both PRPs and SGLs, and we use descriptor elsewhere when referring to their allocations, so use that name here as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20nvme-pci: remove struct nvme_descriptorChristoph Hellwig
There is no real point in having a union of two pointer types here, just use a void pointer as we mix and match types between the arms of the union between the allocation and freeing side already. Also rename the nr_allocations field to nr_descriptors to better describe what it does. Signed-off-by: Christoph Hellwig <hch@lst.de> [leon: ported forward to include metadata SGL support] Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
2025-05-20nvme-pci: store aborted state in flags variableLeon Romanovsky
Instead of keeping dedicated "bool aborted" variable, switch to a flags flags that can be used for other flags as well. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
2025-05-20nvme-pci: don't try to use SGLs for metadata on the admin queueChristoph Hellwig
No admin command defined in an NVMe specification supports metadata, but to protect against vendor specific commands using metadata ensure that we don't try to use SGLs for metadata on the admin queue, as NVMe does not support SGLs on the admin queue for the PCI transport. Do this by checking if the data transfer has been setup using SGLs as that is required for using SGLs for metadata. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Leon Romanovsky <leon@kernel.org>
2025-05-20nvme-pci: make PRP list DMA pools per-NUMA-nodeCaleb Sander Mateos
NVMe commands with over 8 KB of discontiguous data allocate PRP list pages from the per-nvme_device dma_pool prp_page_pool or prp_small_pool. Each call to dma_pool_alloc() and dma_pool_free() takes the per-dma_pool spinlock. These device-global spinlocks are a significant source of contention when many CPUs are submitting to the same NVMe devices. On a workload issuing 32 KB reads from 16 CPUs (8 hypertwin pairs) across 2 NUMA nodes to 23 NVMe devices, we observed 2.4% of CPU time spent in _raw_spin_lock_irqsave called from dma_pool_alloc and dma_pool_free. Ideally, the dma_pools would be per-hctx to minimize contention. But that could impose considerable resource costs in a system with many NVMe devices and CPUs. As a compromise, allocate per-NUMA-node PRP list DMA pools. Map each nvme_queue to the set of DMA pools corresponding to its device and its hctx's NUMA node. This reduces the _raw_spin_lock_irqsave overhead by about half, to 1.2%. Preventing the sharing of PRP list pages across NUMA nodes also makes them cheaper to initialize. Link: https://lore.kernel.org/linux-nvme/CADUfDZqa=OOTtTTznXRDmBQo1WrFcDw1hBA7XwM7hzJ-hpckcA@mail.gmail.com/T/#u Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-pci: factor out a nvme_init_hctx_common() helperCaleb Sander Mateos
nvme_init_hctx() and nvme_admin_init_hctx() are very similar. In preparation for adding more logic, factor out a nvme_init_hctx-common() helper. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-fc: do not reference lsrsp after failureDaniel Wagner
The lsrsp object is maintained by the LLDD. The lifetime of the lsrsp object is implicit. Because there is no explicit cleanup/free call into the LLDD, it is not safe to assume after xml_rsp_fails, that the lsrsp is still valid. The LLDD could have freed the object already. With the recent changes how fcloop tracks the resources, this is the case. Thus don't access lsrsp after xml_rsp_fails. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: don't wait for lport cleanupDaniel Wagner
The lifetime of the fcloop_lsreq is not tight to the lifetime of the host or target port, thus there is no need anymore to synchronize the cleanup path anymore. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: add missing fcloop_callback_host_doneDaniel Wagner
Add the missing fcloop_call_host_done calls so that the caller frees resources when something goes wrong. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fc: take tgtport refs for portentryDaniel Wagner
Ensure that the tgtport is not going away as long portentry has a pointer on it. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fc: free pending reqs on tgtport unregisterDaniel Wagner
When nvmet_fc_unregister_targetport is called by the LLDD, it's not possible to communicate with the host, thus all pending request will not be process. Thus explicitly free them. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: drop response if targetport is goneDaniel Wagner
When the target port is gone, the lsrsp pointer is invalid. Thus don't call the done function anymore instead just drop the response. This happens when the target sends a disconnect association. After this the target starts tearing down all resources and doesn't expect any response. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: allocate/free fcloop_lsreq directlyDaniel Wagner
fcloop depends on the host or the target to allocate the fcloop_lsreq object. This means that the lifetime of the fcloop_lsreq is tied to either the host or the target. Consequently, the host or the target must cooperate during shutdown. Unfortunately, this approach does not work well when the target forces a shutdown, as there are dependencies that are difficult to resolve in a clean way. The simplest solution is to decouple the lifetime of the fcloop_lsreq object by managing them directly within fcloop. Since this is not a performance-critical path and only a small number of LS objects are used during setup and cleanup, it does not significantly impact performance to allocate them during normal operation. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: prevent double port deletionDaniel Wagner
The delete callback can be called either via the unregister function or from the transport directly. Thus it is necessary ensure resources are not freed multiple times. Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: access fcpreq only when holding reqlockDaniel Wagner
The abort handling logic expects that the state and the fcpreq are only accessed when holding the reqlock lock. While at it, only handle the aborts in the abort handler. Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: update refs on tfcp_reqDaniel Wagner
Track the lifetime of the in-flight tfcp_req to ensure the object is not freed too early. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: refactor fcloop_delete_local_portDaniel Wagner
Use the newly introduced fcloop_lport_lookup instead of the open coded version. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: refactor fcloop_nport_alloc and track lportDaniel Wagner
The checks for a valid input values are mixed with the logic to insert a newly allocated nport. Refactor the function so that first the checks are done. This allows to untangle the setup steps into a more linear form which reduces the complexity of the functions. Also start tracking lport when a lport is assigned to a nport. This ensures, that the lport is not going away as long it is still referenced by a nport. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: remove nport from list on last userDaniel Wagner
The nport object has an association with the rport and lport object, that means we can only remove an nport object from the global nport_list after the last user of an rport or lport is gone. Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-fcloop: track ref counts for nportsDaniel Wagner
A nport object is always used in association with targerport, remoteport, tport and rport objects. Add explicit references for any of the associated object. This ensures that nport is not removed too early on shutdown sequences. Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-auth: use SHASH_DESC_ON_STACKHannes Reinecke
Use SHASH_DESC_ON_STACK to avoid explicit allocation. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-auth: use SHASH_DESC_ON_STACKHannes Reinecke
Use SHASH_DESC_ON_STACK to avoid explicit allocation. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: simplify the nvmet_req_init() interfaceWilfred Mallawa
Now that a submission queue holds a reference to its completion queue, there is no need to pass the cq argument to nvmet_req_init(), so remove it. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: support completion queue sharingWilfred Mallawa
The NVMe PCI transport specification allows for completion queues to be shared by different submission queues. This patch allows a submission queue to keep track of the completion queue it is using with reference counting. As such, it can be ensured that a completion queue is not deleted while a submission queue is actively using it. This patch enables completion queue sharing in the pci-epf target driver. For fabrics drivers, completion queue sharing is not enabled as it is not possible as per the fabrics specification. However, this patch modifies the fabrics drivers to correctly integrate the new API that supports completion queue sharing. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: fabrics: add CQ init and destroyWilfred Mallawa
With struct nvmet_cq now having a reference count, this patch amends the target fabrics call chain to initialize and destroy/put a completion queue. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: cq: prepare for completion queue sharingWilfred Mallawa
For the PCI transport, the NVMe specification allows submission queues to share completion queues, however, this is not supported in the current NVMe target implementation. This is a preparatory patch to allow for completion queue (CQ) sharing between different submission queues (SQ). To support queue sharing, reference counting completion queues is required. This patch adds the refcount_t field ref to struct nvmet_cq coupled with respective nvmet_cq_init(), nvmet_cq_get(), nvmet_cq_put(), nvmet_cq_is_deletable() and nvmet_cq_destroy() functions. A CQ reference count is initialized with nvmet_cq_init() when a CQ is created. Using nvmet_cq_get(), a reference to a CQ is taken when an SQ is created that uses the respective CQ. Similarly. when an SQ is destroyed, the reference count to the respective CQ from the SQ being destroyed is decremented with nvmet_cq_put(). The last reference to a CQ is dropped on a CQ deletion using nvmet_cq_put(), which invokes nvmet_cq_destroy() to fully cleanup after the CQ. The helper function nvmet_cq_in_use() is used to determine if any SQs are still using the CQ pending deletion. In which case, the CQ must not be deleted. This should protect scenarios where a bad host may attempt to delete a CQ without first having deleted SQ(s) using that CQ. Additionally, this patch adds an array of struct nvmet_cq to the nvmet_ctrl structure. This allows for the controller to keep track of CQs as they are created and destroyed, similar to the current tracking done for SQs. The memory for this array is freed when the controller is freed. A struct nvmet_ctrl reference is also added to the nvmet_cq structure to allow for CQs to be removed from the controller whilst keeping the new API similar to the existing API for SQs. Sample callchain with CQ refcounting for the PCI endpoint target (pci-epf): i. nvmet_execute_create_cq -> nvmet_pci_epf_create_cq -> nvmet_cq_create -> nvmet_cq_init [cq refcount=1] ii. nvmet_execute_create_sq -> nvmet_pci_epf_create_sq -> nvmet_sq_create -> nvmet_sq_init -> nvmet_cq_get [cq refcount=2] iii. nvmet_execute_delete_sq - > nvmet_pci_epf_delete_sq -> -> nvmet_sq_destroy -> nvmet_cq_put [cq refcount 1] iv. nvmet_execute_delete_cq -> nvmet_pci_epf_delete_cq -> nvmet_cq_put [cq refcount 0] Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: add a helper function for cqid checkingWilfred Mallawa
This patch adds a new helper function nvmet_check_io_cqid(). It is to be used when parsing host commands for IO CQ creation/deletion and IO SQ creation to ensure that the specified IO completion queue identifier (CQID) is not 0 (Admin queue ID). This is a check that already occurs in the nvmet_execute_x() functions prior to nvmet_check_cqid. With the addition of this helper function, the CQ ID checks in the nvmet_execute_x() function can be removed, and instead simply call nvmet_check_io_cqid() in place of nvmet_check_cqid(). Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-auth: authenticate on admin queue onlyHannes Reinecke
Do not start authentication on I/O queues as it doesn't really add value, and secure concatenation disallows it anyway. Authentication commands on I/O queues are not aborted, so the host may still run the authentication protocol on I/O queues. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-auth: do not re-authenticate queues with no prior authenticationHannes Reinecke
When sending 'connect' the queues can figure out from the return code whether authentication is required or not. But reauthentication doesn't disconnect the queues, so this check is not available. Rather we need to check whether the queue had been authenticated initially to figure out if we need to reauthenticate. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet-tcp: switch to using the crc32c libraryEric Biggers
Now that the crc32c() library function directly takes advantage of architecture-specific optimizations, it is unnecessary to go through the crypto API. Just use crc32c(). This is much simpler, and it improves performance due to eliminating the crypto API overhead. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvmet: replace strncpy with strscpyMarcelo Moreira
The strncpy() function is deprecated for NUL-terminated strings as explained in the "strncpy() on NUL-terminated strings" section of Documentation/process/deprecated.rst. The key issues are: - strncpy() fails to guarantee NULL-termination when source > destination - it unnecessarily zero-pads short strings, causing performance overhead strscpy() is the proper replacement because: - it guarantees NULL-termination - it avoids redundant zero-padding - it aligns with current kernel string-copying best practice memcpy() was rejected because: - NQN buffers (subsysnqn/hostnqn) are treated as NULL-terminated strings: - strcmp() usage in nvmet_host_allowed() (discovery.c) - strscpy() to copy subsysnqn in nvmet_execute_disc_identify() seq_buf wasn't used because: - this is a simple fixed-size buffer copy - there is no need for progressive string construction features Signed-off-by: Marcelo Moreira <marcelomoreira1905@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-tcp: open-code nvme_tcp_queue_request() for R2THannes Reinecke
When handling an R2T PDU we short-circuit nvme_tcp_queue_request() as we should not attempt to send consecutive PDUs. So open-code nvme_tcp_queue_request() for R2T and drop the last argument. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2025-05-20nvme-tcp: remove redundant check to ctrl->optsHannes Reinecke
When checking for secure concatenation we have already validated that 'ctrl->opts' is set, so we can remove this check. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-20nvme-loop: avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Move the conflicting declaration to the end of the structure. Notice that `struct nvme_loop_iod` is a flexible structure --a structure that contains a flexible-array member. Fix the following warning: drivers/nvme/target/loop.c:36:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-19Merge branch '200GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== idpf: add initial PTP support Milena Olech says: This patch series introduces support for Precision Time Protocol (PTP) to Intel(R) Infrastructure Data Path Function (IDPF) driver. PTP feature is supported when the PTP capability is negotiated with the Control Plane (CP). IDPF creates a PTP clock and sets a set of supported functions. During the PTP initialization, IDPF requests a set of PTP capabilities and receives a writeback from the CP with the set of supported options. These options are: - get time of the PTP clock - set the time of the PTP clock - adjust the PTP clock - Tx timestamping Each feature is considered to have direct access, where the operations on PCIe BAR registers are allowed, or the mailbox access, where the virtchnl messages are used to perform any PTP action. Mailbox access means that PTP requests are sent to the CP through dedicated secondary mailbox and the CP reads/writes/modifies desired resource - PTP Clock or Tx timestamp registers. Tx timestamp capabilities are negotiated only for vports that have UPLINK_VPORT flag set by the CP. Capabilities provide information about the number of available Tx timestamp latches, their indexes and size of the Tx timestamp value. IDPF requests Tx timestamp by setting the TSYN bit and the requested timestamp index in the context descriptor for the PTP packets. When the completion tag for that packet is received, IDPF schedules a worker to read the Tx timestamp value. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: idpf: add support for Rx timestamping idpf: add Tx timestamp flows idpf: add Tx timestamp capabilities negotiation idpf: add PTP clock configuration idpf: add mailbox access to read PTP clock time idpf: negotiate PTP capabilities and get PTP clock idpf: move virtchnl structures to the header file virtchnl: add PTP virtchnl definitions idpf: add initial PTP support idpf: change the method for mailbox workqueue allocation ==================== Link: https://patch.msgid.link/20250516170645.1172700-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-19docs: firmware: qcom_scm: Fix kernel-doc warningUnnathi Chalicheemala
Add description for members of qcom_scm_desc struct to avoid: drivers/firmware/qcom/qcom_scm.h:56: warning: Function parameter or struct member 'svc' not described in 'qcom_scm_desc' drivers/firmware/qcom/qcom_scm.h:56: warning: Function parameter or struct member 'cmd' not described in 'qcom_scm_desc' drivers/firmware/qcom/qcom_scm.h:56: warning: Function parameter or struct member 'owner' not described in 'qcom_scm_desc' Signed-off-by: Unnathi Chalicheemala <unnathi.chalicheemala@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250403-fix_scm_doc_warn-v1-1-9cd36345db77@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-05-19i2c: mlxbf: Use str_read_write() helperFeng Wei
Remove hard-coded strings by using the str_read_write() helper. Signed-off-by: Feng Wei <feng.wei8@zte.com.cn> Signed-off-by: Shao Mingyin <shao.mingyin@zte.com.cn> Link: https://lore.kernel.org/r/20250401192603311H5OxuFmUSbPc4VnQQkhZr@zte.com.cn Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: thunderx: Use non-hybrid PCI devres APIPhilipp Stanner
thunderx enables its PCI device with pcim_enable_device(). This, implicitly, switches the function pci_request_regions() into managed mode, where it becomes a devres function. The PCI subsystem wants to remove this hybrid nature from its interfaces. To do so, users of the aforementioned combination of functions must be ported to non-hybrid functions. Replace the call to sometimes-managed pci_request_regions() with one to the always-managed pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250417082511.22272-3-phasta@kernel.org Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: ismt: Use non-hybrid PCI devres APIPhilipp Stanner
ismt enables its PCI device with pcim_enable_device(). This, implicitly, switches the function pci_request_region() into managed mode, where it becomes a devres function. The PCI subsystem wants to remove this hybrid nature from its interfaces. To do so, users of the aforementioned combination of functions must be ported to non-hybrid functions. Replace the call to sometimes-managed pci_request_region() with one to the always-managed pcim_request_region(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250417082511.22272-2-phasta@kernel.org Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: davinci: add I2C_FUNC_PROTOCOL_MANGLING to feature listMarcus Folkesson
The driver do support I2C_M_IGNORE_NAK, so add I2C_FUNC_PROTOCOL_MANGLING to the feature list. Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Acked-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com> Link: https://lore.kernel.org/r/20250326-i2c-v1-1-82409ebe9f2b@gmail.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: smbus: introduce Write Disable-aware SPD instantiating functionsYo-Jung (Leo) Lin
Some SMBus controllers may restrict writes to addresses where SPD sensors may reside. This may lead to some SPD sensors not functioning correctly, and might need extra handling. Introduce new SPD-instantiating functions that are aware of this, and use them instead. Signed-off-by: Yo-Jung Lin (Leo) <leo.lin@canonical.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20250430-for-upstream-i801-spd5118-no-instantiate-v2-1-2f54d91ae2c7@canonical.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: riic: Implement bus recoveryLad Prabhakar
Implement I2C bus recovery support for the RIIC controller by making use of software-controlled SCL and SDA line manipulation. The controller allows forcing SCL and SDA levels through control bits, which enables generation of manual clock pulses and a stop condition to free a stuck bus. This implementation wires up the bus recovery mechanism using i2c_generic_scl_recovery and provides get/set operations for SCL and SDA. This allows the RIIC driver to recover from bus hang scenarios where SDA is held low by a slave. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Link: https://lore.kernel.org/r/20250501204003.141134-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: tegra: check msg length in SMBUS block readAkhil R
For SMBUS block read, do not continue to read if the message length passed from the device is '0' or greater than the maximum allowed bytes. Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20250424053320.19211-1-akhilrajeev@nvidia.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: pasemi: Log bus reset causesHector Martin
This ensures we get all information we need to debug issues when users forward us their logs. Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20250427-pasemi-fixes-v3-4-af28568296c0@svenpeter.dev Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-05-19i2c: pasemi: Improve error recoveryHector Martin
Add handling for all the missing error condition, and better recovery in pasemi_smb_clear(). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Hector Martin <marcan@marcan.st> Co-developed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20250427-pasemi-fixes-v3-3-af28568296c0@svenpeter.dev Signed-off-by: Andi Shyti <andi.shyti@kernel.org>