Age | Commit message | Author
2024-01-07 | svcrdma: Explicitly pass the transport into Write chunk I/O paths | Chuck Lever
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Acquire the svcxprt_rdma pointer from the CQ context | Chuck Lever
Enable the removal of the svc_rdma_chunk_ctxt::cc_rdma field in a subsequent patch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Reduce size of struct svc_rdma_rw_ctxt | Chuck Lever
SG_CHUNK_SIZE is 128, making struct svc_rdma_rw_ctxt + the first SGL array more than 4200 bytes in length, pushing the memory allocation well into order 1. Even so, the RDMA rw core doesn't seem to use more than max_send_sge entries in that array (typically 32 or less), so that is all wasted space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
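The arithmetic behind this size reduction can be sketched in plain C. This is an illustration, not kernel code: it assumes a 32-byte scatterlist entry and 4096-byte pages (typical of 64-bit builds), and `ctxt_bytes()` and `alloc_order()` are hypothetical helpers. The header size passed in by a caller is likewise a made-up figure.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only: 32 bytes per scatterlist entry
 * and 4096-byte pages, as on common 64-bit configurations. */
enum { SG_ENTRY_BYTES = 32, PAGE_BYTES = 4096 };

/* Bytes needed for a context header plus an inline SGL of nents entries. */
static size_t ctxt_bytes(size_t header_bytes, size_t nents)
{
	return header_bytes + nents * SG_ENTRY_BYTES;
}

/* Smallest page order whose block (PAGE_BYTES << order) holds `bytes`. */
static unsigned int alloc_order(size_t bytes)
{
	unsigned int order = 0;
	size_t span = PAGE_BYTES;

	assert(bytes > 0);
	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}
```

With a hypothetical 104-byte header, 128 entries come to 4200 bytes and need an order-1 block, while 32 entries come to 1128 bytes and fit comfortably in a single page.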
2024-01-07 | svcrdma: Update some svcrdma DMA-related tracepoints | Chuck Lever
A send/recv_ctxt already records transport-related information in the cq.id, thus there is no need to record the IP addresses of the transport endpoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: DMA error tracepoints should report completion IDs | Chuck Lever
Update the DMA error flow tracepoints to report the completion ID of the failing context. This ties the wait/failure to a particular operation or request, which is more useful than knowing only the failing transport. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: SQ error tracepoints should report completion IDs | Chuck Lever
Update the Send Queue's error flow tracepoints to report the completion ID of the waiting or failing context. This ties the wait/failure to a particular operation or request, which is a little more useful than knowing only the transport that is about to close. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | rpcrdma: Introduce a simple cid tracepoint class | Chuck Lever
De-duplicate some code, making it easier to add new tracepoints that report only a completion ID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Add lockdep class keys for transport locks | Chuck Lever
Two svcrdma-related transport locks can become quite contended. Collate their use and make them easy to find in /proc/lock_stat for better observability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Clean up locking | Chuck Lever
There's no need to protect llist_entry() with a spin lock. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
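Why the lock is unnecessary can be shown with a small user-space sketch. In the kernel, llist_entry() is a container_of() wrapper: it only subtracts a member offset from a pointer and touches no shared state. The types and the `entry_of` macro below are hypothetical stand-ins, not the svcrdma definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a lockless-list node and its container. */
struct link_sketch {
	struct link_sketch *next;
};

struct recv_ctxt_sketch {
	int id;
	struct link_sketch node;
};

/* Same idea as container_of(): pure pointer arithmetic, no memory access. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Converting an already-removed node needs no lock; only the removal
 * of the node from a shared list would need serialization. */
static struct recv_ctxt_sketch *ctxt_from_node(struct link_sketch *n)
{
	assert(n != NULL);
	return entry_of(n, struct recv_ctxt_sketch, node);
}
```

Once a node pointer is in hand, the conversion can happen after the lock is dropped, which shortens the critical section.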
2024-01-07 | svcrdma: Add an async version of svc_rdma_write_info_free() | Chuck Lever
DMA unmapping can take quite some time, so it should not be handled in a single-threaded completion handler. Defer releasing write_info structs to the recently-added workqueue. With this patch, DMA unmapping can be handled in parallel, and it does not cause head-of-queue blocking of Write completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Add an async version of svc_rdma_send_ctxt_put() | Chuck Lever
DMA unmapping can take quite some time, so it should not be handled in a single-threaded completion handler. Defer releasing send_ctxts to the recently-added workqueue. With this patch, DMA unmapping can be handled in parallel, and it does not cause head-of-queue blocking of Send completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Add a utility workqueue to svcrdma | Chuck Lever
To handle work in the background, set up an UNBOUND workqueue for svcrdma. Subsequent patches will make use of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Pre-allocate svc_rdma_recv_ctxt objects | Chuck Lever
The original reason for allocating svc_rdma_recv_ctxt objects during Receive completion was to ensure the objects were allocated on the NUMA node closest to the underlying IB device. Since commit c5d68d25bd6b ("svcrdma: Clean up allocation of svc_rdma_recv_ctxt"), however, the device's favored node is explicitly passed to the memory allocator. To enable switching Receive completion to soft IRQ context, move memory allocation out of completion handling, since it can be costly, and it can sleep. A limited number of objects is now allocated at "accept" time. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | svcrdma: Eliminate allocation of recv_ctxt objects in backchannel | Chuck Lever
The svc_rdma_recv_ctxt free list uses a lockless list to avoid the need for a spin lock in the fast path. llist_del_first(), which is used by svc_rdma_recv_ctxt_get(), requires serialization, however, when there are multiple list producers that are unserialized. I mistakenly thought there was only one caller of svc_rdma_recv_ctxt_get() (svc_rdma_refresh_recvs()), thus explicit serialization would not be necessary. But there is another caller: svc_rdma_bc_sendto(), and these two are not serialized against each other. I haven't seen ill effects that I could directly ascribe to a lack of serialization. It's just an observation based on code audit. When DMA-mapping before sending a Reply, the passed-in struct svc_rdma_recv_ctxt is used only for its write and reply PCLs. These are currently always empty in the backchannel case. So, instead of passing a full svc_rdma_recv_ctxt object to svc_rdma_map_reply_msg(), let's pass in just the Write and Reply PCLs. This change makes it unnecessary for the backchannel to acquire a dummy svc_rdma_recv_ctxt object when sending an RPC Call. The need for svc_rdma_recv_ctxt free list serialization is now completely avoided. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSv4, NFSD: move enum nfs_cb_opnum4 to include/linux/nfs4.h | ChenXiaoSong
Callback operations enum is defined in client and server, move it to common header file. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | nfsd: remove unnecessary NULL check | Dan Carpenter
We check "state" for NULL on the previous line so it can't be NULL here. No need to check again. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202312031425.LffZTarR-lkp@intel.com/ Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | SUNRPC: Remove RQ_SPLICE_OK | Chuck Lever
This flag is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: Modify NFSv4 to use nfsd_read_splice_ok() | Chuck Lever
Avoid the use of an atomic bitop, and prepare for adding a run-time switch for using splice reads. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: Replace RQ_SPLICE_OK in nfsd_read() | Chuck Lever
RQ_SPLICE_OK is a bit of a layering violation. Also, a subsequent patch is going to provide a mechanism for always disabling splice reads. Splicing is an issue only for NFS READs, so refactor nfsd_read() to check the auth type directly instead of relying on an rq_flag setting. The new helper will be added into the NFSv4 read path in a subsequent patch. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavor | Chuck Lever
NFSD will use this new API to determine whether nfsd_splice_read is safe to use. This avoids the need to add a dependency to NFSD for CONFIG_SUNRPC_GSS. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: Document lack of f_pos_lock in nfsd_readdir() | Chuck Lever
Al Viro notes that normal system calls hold f_pos_lock when calling ->iterate_shared and ->llseek; however nfsd_readdir() does not take that mutex when calling these methods. It should be safe however because the struct file acquired by nfsd_readdir() is not visible to other threads. Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: Remove nfsd_drc_gc() tracepoint | Chuck Lever
This trace point was for debugging the DRC's garbage collection. In the field it's just noise. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: Make the file_delayed_close workqueue UNBOUND | Chuck Lever
workqueue: nfsd_file_delayed_close [nfsd] hogged CPU for >13333us 8 times, consider switching to WQ_UNBOUND

There's no harm in closing a cached file descriptor on another core. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | NFSD: use read_seqbegin() rather than read_seqbegin_or_lock() | Oleg Nesterov
The usage of read_seqbegin_or_lock() in nfsd_copy_write_verifier() is wrong. "seq" is always even and thus "or_lock" has no effect, this code can never take ->writeverf_lock for writing. I guess this is fine, nfsd_copy_write_verifier() just copies 8 bytes and nfsd_reset_write_verifier() is supposed to be very rare operation so we do not need the adaptive locking in this case. Yet the code looks wrong and sub-optimal, it can use read_seqbegin() without changing the behaviour. [ cel: Note also that it eliminates this Sparse warning: fs/nfsd/nfssvc.c:360:6: warning: context imbalance in 'nfsd_copy_write_verifier' - different lock contexts for basic block ] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
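The plain read-retry pattern that read_seqbegin() supports can be sketched in user space with C11 atomics. This is not the kernel's seqlock implementation: `verf_sketch` and its helpers are hypothetical, the payload is an 8-byte atomic to keep the sketch free of data races, and a single writer is assumed (the kernel serializes writers with a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sequence-counted 8-byte value. */
struct verf_sketch {
	atomic_uint seq;         /* even: stable, odd: write in progress */
	_Atomic uint64_t value;  /* the 8-byte payload */
};

static void verf_init(struct verf_sketch *v)
{
	atomic_init(&v->seq, 0);
	atomic_init(&v->value, 0);
}

/* Writer: bump the counter to odd, update, bump back to even. */
static void verf_write(struct verf_sketch *v, uint64_t val)
{
	assert((atomic_load(&v->seq) & 1) == 0);  /* single writer assumed */
	atomic_fetch_add(&v->seq, 1);
	atomic_store(&v->value, val);
	atomic_fetch_add(&v->seq, 1);
}

/* Reader: never takes a lock; retries if a write overlapped the read. */
static uint64_t verf_read(struct verf_sketch *v)
{
	unsigned int start;
	uint64_t val;

	do {
		/* Wait for an even count, i.e. no writer in progress. */
		while ((start = atomic_load(&v->seq)) & 1)
			;
		val = atomic_load(&v->value);
	} while (atomic_load(&v->seq) != start);
	return val;
}
```

Because the reader only ever starts from an even count, a variant that takes the lock when the starting count is odd would never do so here, which matches the observation in the commit message.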
2024-01-07 | nfsd: new Kconfig option for legacy client tracking | Jeff Layton
We've had a number of attempts at different NFSv4 client tracking methods over the years, but now nfsdcld has emerged as the clear winner since the others (recoverydir and the usermodehelper upcall) are problematic. As a case in point, the recoverydir backend uses MD5 hashes to encode long form clientid strings, which means that nfsd repeatedly gets dinged on FIPS audits, since MD5 isn't considered secure. Its use of MD5 is not cryptographically significant, so there is no danger there, but allowing us to compile that out allows us to sidestep the issue entirely. As a prelude to eventually removing support for these client tracking methods, add a new Kconfig option that enables them. Mark it deprecated and make it default to N. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07 | ipvlan: Remove usage of the deprecated ida_simple_xx() API | Christophe JAILLET
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Note that the upper bound of ida_alloc_range() is inclusive while the one of ida_simple_get() was exclusive. So calls have been updated accordingly. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
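The off-by-one that this conversion has to account for can be illustrated with a toy allocator. The functions below are hypothetical user-space stand-ins, not the kernel IDA API; they only mirror the bound conventions described above (inclusive upper bound for the range allocator, exclusive for the old-style wrapper).

```c
#include <assert.h>
#include <stdint.h>

/* Toy allocator over a 64-bit bitmap: returns the lowest free id in
 * [min, max] (max INCLUSIVE), or -1 if the range is exhausted. */
static int toy_alloc_range(uint64_t *used, unsigned int min, unsigned int max)
{
	for (unsigned int id = min; id <= max && id < 64; id++) {
		if (!(*used & (UINT64_C(1) << id))) {
			*used |= UINT64_C(1) << id;
			return (int)id;
		}
	}
	return -1;
}

/* Old-style wrapper whose `end` is EXCLUSIVE: converting a call site
 * to the inclusive form means passing end - 1. */
static int toy_simple_get(uint64_t *used, unsigned int start, unsigned int end)
{
	assert(end > start);
	return toy_alloc_range(used, start, end - 1);
}
```

A caller that used to pass an exclusive bound of 3 must pass 2 to the inclusive variant to keep handing out the same set of ids.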
2024-01-07 | ipvlan: Fix a typo in a comment | Christophe JAILLET
s/diffentiate/differentiate/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | smb: client: stop revalidating reparse points unnecessarily | Paulo Alcantara
Query dir responses don't provide enough information on reparse points such as major/minor numbers and symlink targets other than reparse tags, however we don't need to unconditionally revalidate them only because they are reparse points. Instead, revalidate them only when their ctime or reparse tag has changed. For instance, Windows Server updates ctime of reparse points when their data have changed. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | cifs: Pass unbyteswapped eof value into SMB2_set_eof() | David Howells
Change SMB2_set_eof() to take eof in CPU byte order rather than as __le64, and pass it directly rather than by pointer. This moves the conversion down into SMB2_set_eof() rather than all of its callers and means we don't need to undo it for the trace line. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb3: Improve exception handling in allocate_mr_list() | Markus Elfring
The kfree() function was called in one case by the allocate_mr_list() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software. Thus use another label. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
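The "use another label" idea is the common layered-cleanup pattern: each failure jumps to a label that releases only what has been acquired so far. The sketch below is a user-space illustration with hypothetical names and a hypothetical `fail_at` parameter for simulating failures; it is not the code of allocate_mr_list(). Freeing a null pointer is harmless in itself, so the gain is mainly clarity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with two separately allocated parts. */
struct mr_sketch {
	void *mr;
	void *sgt;
};

/* fail_at: 0 = succeed, 1 = simulate failure of the first inner
 * allocation, 2 = simulate failure of the second. */
static struct mr_sketch *mr_sketch_alloc(int fail_at)
{
	struct mr_sketch *m;

	assert(fail_at >= 0 && fail_at <= 2);
	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;

	m->mr = (fail_at == 1) ? NULL : malloc(16);
	if (!m->mr)
		goto free_entry;  /* nothing else to release yet */

	m->sgt = (fail_at == 2) ? NULL : malloc(16);
	if (!m->sgt)
		goto free_mr;     /* release only what was acquired */

	return m;

free_mr:
	free(m->mr);
free_entry:
	free(m);
	return NULL;
}
```

Each label undoes exactly one earlier step, so a reader can check the error paths by reading the labels bottom-up.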
2024-01-07 | cifs: fix in logging in cifs_chan_update_iface | Shyam Prasad N
Recently, cifs_chan_update_iface was modified to not remove an iface if a suitable replacement was not found. With that, there were two conditionals that were exactly the same. This change removes the extra condition check. Also, fix a log statement in the same function so that it reports the correct message. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: handle special files and symlinks in SMB3 POSIX | Paulo Alcantara
Parse reparse points in SMB3 posix query info as they will be supported and required by the new specification. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: cleanup smb2_query_reparse_point() | Paulo Alcantara
Use smb2_compound_op() with SMB2_OP_GET_REPARSE to get reparse point. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: allow creating symlinks via reparse points | Paulo Alcantara
Add support for creating symlinks via IO_REPARSE_TAG_SYMLINK reparse points in SMB2+. These are fully supported by most SMB servers and documented in MS-FSCC. They also have the advantage of requiring fewer roundtrips, as their symlink targets can be parsed directly from CREATE responses on STATUS_STOPPED_ON_SYMLINK errors. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311260838.nx5mkj1j-lkp@intel.com/ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: fix hardlinking of reparse points | Paulo Alcantara
The client was sending an SMB2_CREATE request without setting the OPEN_REPARSE_POINT flag, thus failing the entire hardlink operation. Fix this by setting OPEN_REPARSE_POINT in the create options of the SMB2_CREATE request when the source inode is a reparse point. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: fix renaming of reparse points | Paulo Alcantara
The client was sending an SMB2_CREATE request without setting the OPEN_REPARSE_POINT flag, thus failing the entire rename operation. Fix this by setting OPEN_REPARSE_POINT in the create options of the SMB2_CREATE request when the source inode is a reparse point. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: optimise reparse point querying | Paulo Alcantara
Reduce number of roundtrips to server when querying reparse points in ->query_path_info() by sending a single compound request of create+get_reparse+get_info+close. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: allow creating special files via reparse points | Paulo Alcantara
Add support for creating special files (e.g. char/block devices, sockets, fifos) via NFS reparse points on SMB2+, which are fully supported by most SMB servers and documented in MS-FSCC. smb2_get_reparse_inode() creates the file with a corresponding reparse point buffer set in @iov through a single roundtrip to the server. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311260746.HOJ039BV-lkp@intel.com/ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: extend smb2_compound_op() to accept more commands | Paulo Alcantara
Make smb2_compound_op() accept up to MAX_COMPOUND(5) commands to be sent over a single compounded request. This will allow next commits to read and write reparse files through a single roundtrip to the server. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | smb: client: Fix minor whitespace errors and warnings | Pierre Mariani
Fixes no-op checkpatch errors and warnings. Signed-off-by: Pierre Mariani <pierre.mariani@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-07 | Linux 6.7 (tag: v6.7) | Linus Torvalds
2024-01-07 | net/sched: Remove ipt action tests | Jamal Hadi Salim
Commit ba24ea129126 ("net/sched: Retire ipt action") removed the ipt action but not the testcases. This patch removes the outstanding tdc tests. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | Merge branch 'stmmac-per-dma-channel-interrupt' | David S. Miller
Swee Leong Ching says:

====================
net: stmmac: Enable Per DMA Channel interrupt

Add Per DMA Channel interrupt feature for DWXGMAC IP.

Patchset (link below) contains per DMA channel interrupt, But it was achieved.
https://lore.kernel.org/lkml/20230821203328.GA2197059-robh@kernel.org/t/#m849b529a642e1bff89c05a07efc25d6a94c8bfb4

Some of the changes in this patchset are based on reviewer comments on the patchset linked above.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | net: stmmac: Use interrupt mode INTM=1 for per channel irq | Swee Leong Ching
Enable per DMA channel interrupt that uses shared peripheral interrupt (SPI), so only per channel TX and RX intr (TI/RI) are handled by TX/RX ISR without calling common interrupt ISR. Signed-off-by: Teoh Ji Sheng <ji.sheng.teoh@intel.com> Signed-off-by: Swee Leong Ching <leong.ching.swee@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | net: stmmac: Add support for TX/RX channel interrupt | Swee Leong Ching
Enable TX/RX channel interrupt registration for MACs that interrupt the CPU through a shared peripheral interrupt (SPI). Per channel interrupts and interrupt-names are registered as follows, e.g. for 4 tx and 4 rx channels:

    interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 104 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 106 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 107 IRQ_TYPE_LEVEL_HIGH>;
    interrupt-names = "dma_tx0", "dma_tx1", "dma_tx2", "dma_tx3",
                      "dma_rx0", "dma_rx1", "dma_rx2", "dma_rx3";

Signed-off-by: Teoh Ji Sheng <ji.sheng.teoh@intel.com> Signed-off-by: Swee Leong Ching <leong.ching.swee@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | net: stmmac: Make MSI interrupt routine generic | Swee Leong Ching
There is no support for per DMA channel interrupts on non-MSI platforms, where the MAC's per channel interrupt hooks up to the interrupt controller (GIC) through a shared peripheral interrupt (SPI) to handle interrupts from the TX/RX channels. This patch generalizes the existing MSI ISR to also support non-MSI platforms. Signed-off-by: Teoh Ji Sheng <ji.sheng.teoh@intel.com> Signed-off-by: Swee Leong Ching <leong.ching.swee@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | dt-bindings: net: snps,dwmac: per channel irq | Swee Leong Ching
Add dt-bindings for per channel irq. Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com> Signed-off-by: Swee Leong Ching <leong.ching.swee@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | Merge branch 'at803x-more-generalization' | David S. Miller
Christian Marangi says:

====================
net: phy: at803x: even more generalization

This is part 3 of the at803x patches required to split the PHY driver into more specific PHY Family drivers.

While adding support for a new PHY Family, qca807x, many similarities with the qca808x cdt function were noticed. Hence this series is done to make things easier in the future when the qca807x PHY is submitted.

Changes v4:
- Fix Smatch warning
Changes v3:
- Rebase on top of net-next
Changes v2:
- Address request from Russell in a previous series on cdt get status improvement
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | net: phy: at803x: make read_status more generic | Christian Marangi
Make read_status more generic in preparation for moving it to a shared library, as other PHY Family drivers will have the exact same implementation. The only specific part was a check for AR8031/33 if 1000basex was used. The check is moved to a dedicated function specific to those PHYs. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07 | net: phy: at803x: add support for cdt cross short test for qca808x | Christian Marangi
The QCA808x PHY Family also supports the Cable Diagnostic Test for Cross Pair Short. Add all the defines needed to enable and support these additional tests. The Cross Short test was previously disabled by default; it is now enabled by default. In this mode, the mask changes slightly and the length is shifted based on the fault condition. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>