Age | Commit message | Author
2023-08-07 | dmaengine: xilinx: xdma: Fix interrupt vector setting | Miquel Raynal
A couple of hardware registers need to be set to reflect which interrupts have been allocated to the device. Each register is 32 bits wide and can receive four 8-bit values. If the number of interrupts provided is not a multiple of four, the irq_num variable will never be 0 within the while check and the while block will loop forever. There is an easy way to prevent this: just break out of the for loop when we reach "irq_num == 0", which anyway means all interrupts have been processed. Cc: stable@vger.kernel.org Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20230731101442.792514-2-miquel.raynal@bootlin.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
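A minimal sketch of the loop shape being fixed (simplified and hypothetical, not the literal xdma register-programming code):

#include <linux/io.h>
#include <linux/types.h>

/* Four 8-bit vector numbers are packed into each 32-bit register; the
 * inner loop must stop as soon as the last vector has been consumed
 * instead of assuming the interrupt count is a multiple of four. */
static void pack_irq_vectors(u32 __iomem *regs, u32 first_vec, u32 irq_num)
{
    u32 reg_idx = 0;

    while (irq_num) {
        u32 val = 0;
        int i;

        for (i = 0; i < 4; i++) {
            val |= (first_vec++ & 0xff) << (i * 8);
            if (--irq_num == 0)
                break;      /* all vectors written, avoid wrapping below zero */
        }
        writel(val, &regs[reg_idx++]);
    }
}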
2023-08-07 | dmaengine: owl-dma: Modify mismatched function name | Zhang Jianhua
No functional modification involved. drivers/dma/owl-dma.c:208: warning: expecting prototype for struct owl_dma_pchan. Prototype was for struct owl_dma_vchan instead HDRTEST usr/include/sound/asequencer.h Fixes: 47e20577c24d ("dmaengine: Add Actions Semi Owl family S900 DMA driver") Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230722153244.2086949-1-chris.zjh@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07 | dmaengine: idxd: Clear PRS disable flag when disabling IDXD device | Fenghua Yu
Disabling IDXD device doesn't reset Page Request Service (PRS) disable flag to its initial value 0. This may cause user confusion because once PRS is disabled user will see PRS still remains the previous setting (i.e. disabled) via sysfs interface even after the device is disabled. To eliminate user confusion, reset PRS disable flag to ensure that the PRS flag bit reflects correct state after the device is disabled. Additionally, simplify the code by setting wq->flags to 0, which clears all flag bits, including any future additions. Fixes: f2dc327131b5 ("dmaengine: idxd: add per wq PRS disable") Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230712193505.3440752-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07 | dmaengine: pl330: Return DMA_PAUSED when transaction is paused | Ilpo Järvinen
pl330_pause() does not set anything to indicate paused condition which causes pl330_tx_status() to return DMA_IN_PROGRESS. This breaks 8250 DMA flush after the fix in commit 57e9af7831dc ("serial: 8250_dma: Fix DMA Rx rearm race"). The function comment for pl330_pause() claims pause is supported but resume is not which is enough for 8250 DMA flush to work as long as DMA status reports DMA_PAUSED when appropriate. Add PAUSED state for descriptor and mark BUSY descriptors with PAUSED in pl330_pause(). Return DMA_PAUSED from pl330_tx_status() when the descriptor is PAUSED. Reported-by: Richard Tresidder <rtresidd@electromag.com.au> Tested-by: Richard Tresidder <rtresidd@electromag.com.au> Fixes: 88987d2c7534 ("dmaengine: pl330: add DMA_PAUSE feature") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-serial/f8a86ecd-64b1-573f-c2fa-59f541083f1a@electromag.com.au/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20230526105434.14959-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
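A rough sketch of the mechanism (structure and names simplified, not the literal pl330 code):

#include <linux/list.h>
#include <linux/dmaengine.h>

enum desc_status { FREE, PREP, BUSY, PAUSED, DONE };

struct my_desc {
    struct list_head node;
    enum desc_status status;
};

/* device_pause hook: mark in-flight descriptors as paused */
static void chan_pause(struct list_head *work_list)
{
    struct my_desc *desc;

    list_for_each_entry(desc, work_list, node)
        if (desc->status == BUSY)
            desc->status = PAUSED;
}

/* device_tx_status hook: report the paused state to callers such as 8250_dma */
static enum dma_status desc_dma_status(const struct my_desc *desc)
{
    return desc->status == PAUSED ? DMA_PAUSED : DMA_IN_PROGRESS;
}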
2023-08-07 | dmaengine: qcom_hidma: Update codeaurora email domain | Jeffrey Hugo
The codeaurora.org email domain is defunct and will bounce. Update entries to Sinan's kernel.org address which is the address in MAINTAINERS for this component. Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Acked-By: Sinan Kaya <okaya@kernel.org> Link: https://lore.kernel.org/r/20230707195003.6619-1-quic_jhugo@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07 | dmaengine: mcf-edma: Fix a potential un-allocated memory access | Christophe JAILLET
When 'mcf_edma' is allocated, some space is allocated for a flexible array at the end of the struct. 'chans' items are allocated, that is to say 'pdata->dma_channels' of them. Then, this number of items is stored in 'mcf_edma->n_chans'. A few lines later, if 'mcf_edma->n_chans' is 0, then a default value of 64 is set. This ends up with no space allocated by devm_kzalloc() because 'chans' was 0, but 64 items are then read and/or written in unallocated memory. Change the logic to define the default value before allocating the memory, as sketched below. Fixes: e7a3ff92eaf1 ("dmaengine: fsl-edma: add ColdFire mcf5441x edma support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/f55d914407c900828f6fad3ea5fa791a5f17b9a4.1685172449.git.christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org>
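The corrected ordering looks roughly like this (a sketch of the probe() logic, not the literal driver code; the 64-channel fallback is the default mentioned above):

    chan_count = pdata->dma_channels;
    if (!chan_count)
        chan_count = 64;        /* default when platform data gives no count */

    /* size the flexible 'chans' array with the value that will really be used */
    mcf_edma = devm_kzalloc(&pdev->dev,
                            struct_size(mcf_edma, chans, chan_count),
                            GFP_KERNEL);
    if (!mcf_edma)
        return -ENOMEM;

    mcf_edma->n_chans = chan_count;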
2023-08-06 | Merge tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs | Linus Torvalds
Pull vfs fixes from Christian Brauner:
- Fix a wrong check for O_TMPFILE during RESOLVE_CACHED lookup
- Clean up directory iterators and clarify file_needs_f_pos_lock()

* tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: rely on ->iterate_shared to determine f_pos locking
  vfs: get rid of old '->iterate' directory operation
  proc: fix missing conversion to 'iterate_shared'
  open: make RESOLVE_CACHED correctly test for O_TMPFILE
2023-08-06 | ionic: Add missing err handling for queue reconfig | Nitya Sunkad
ionic_start_queues_reconfig returns an error code if txrx_init fails. Handle this error code in the relevant places. This fixes a corner case where the device could get left in a detached state if the CMB reconfig fails and the attempt to clean up the mess also fails. Note that calling netif_device_attach when the netdev is already attached does not lead to unexpected behavior. Change goto name "errout" to "err_out" to maintain consistency across goto statements. Fixes: 40bc471dc714 ("ionic: add tx/rx-push support with device Component Memory Buffers") Fixes: 6f7d6f0fd7a3 ("ionic: pull reset_queues into tx_timeout handler") Signed-off-by: Nitya Sunkad <nitya.sunkad@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | drivers: vxlan: vnifilter: free percpu vni stats on error path | Fedor Pchelkin
In case rhashtable_lookup_insert_fast() fails inside vxlan_vni_add(), the allocated percpu vni stats are not freed on the error path. Introduce vxlan_vni_free() which would work as a nice wrapper to free vxlan_vni_node resources properly. Found by Linux Verification Center (linuxtesting.org). Fixes: 4095e0e1328a ("drivers: vxlan: vnifilter: per vni stats") Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
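A sketch of the wrapper and its use on the error path (field and helper names are assumptions based on the description above):

static void vxlan_vni_free(struct vxlan_vni_node *vninode)
{
    free_percpu(vninode->stats);        /* per-cpu stats allocated at vni add time */
    kfree(vninode);
}

/* in vxlan_vni_add(), roughly: */
    err = rhashtable_lookup_insert_fast(&vg->vni_hash, &vninode->vnode,
                                        vxlan_vni_rht_params);
    if (err) {
        vxlan_vni_free(vninode);        /* previously the percpu stats leaked here */
        return err;
    }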
2023-08-06 | hwmon: (pmbus/bel-pfe) Enable PMBUS_SKIP_STATUS_CHECK for pfe1100 | Tao Ren
Skip status check for both pfe1100 and pfe3000 because the communication error is also observed on pfe1100 devices. Signed-off-by: Tao Ren <rentao.bupt@gmail.com> Fixes: 626bb2f3fb3c ("hwmon: (pmbus) add driver for BEL PFE1100 and PFE3000") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230804221403.28931-1-rentao.bupt@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-08-06 | fs: rely on ->iterate_shared to determine f_pos locking | Christian Brauner
Now that we removed ->iterate we don't need to check for either ->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check for ->iterate_shared instead. This will tell us whether we need to unconditionally take the lock. Not just does it allow us to avoid checking f_inode's mode it also actually clearly shows that we're locking because of readdir. Signed-off-by: Christian Brauner <brauner@kernel.org>
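The resulting check looks roughly like this (a sketch, not necessarily the exact upstream function):

static bool file_needs_f_pos_lock(struct file *file)
{
    /* Only readdir on an ->iterate_shared directory, or a file object
     * visible to more than one opener, needs the unconditional lock. */
    return (file->f_mode & FMODE_ATOMIC_POS) &&
           (file_count(file) > 1 || file->f_op->iterate_shared);
}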
2023-08-06 | vfs: get rid of old '->iterate' directory operation | Linus Torvalds
All users now just use '->iterate_shared()', which only takes the directory inode lock for reading. Filesystems that never got converted to shared mode now instead use a wrapper that drops the lock, re-takes it in write mode, calls the old function, and then downgrades the lock back to read mode. This way the VFS layer and other callers no longer need to care about filesystems that never got converted to the modern era. The filesystems that use the new wrapper are ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf. Honestly, several of them look like they really could just iterate their directories in shared mode and skip the wrapper entirely, but the point of this change is to not change semantics or fix filesystems that haven't been fixed in the last 7+ years, but to finally get rid of the dual iterators. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
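The wrapper behaves roughly as follows (a sketch with simplified names; the exact locking helpers used are an assumption based on the description above):

/* Called with the directory inode held shared by the VFS readdir path. */
static int iterate_exclusively(struct file *file, struct dir_context *ctx,
                               int (*iter)(struct file *, struct dir_context *))
{
    struct inode *inode = file_inode(file);
    int ret;

    /* trade the shared lock for an exclusive one around the old ->iterate() */
    inode_unlock_shared(inode);
    inode_lock(inode);
    ret = iter(file, ctx);
    downgrade_write(&inode->i_rwsem);   /* back to the mode the caller expects */
    return ret;
}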
2023-08-06 | proc: fix missing conversion to 'iterate_shared' | Linus Torvalds
I'm looking at the directory handling due to the discussion about f_pos locking (see commit 797964253d35: "file: reinstate f_pos locking optimization for regular files"), and wanting to clean that up. And one source of ugliness is how we were supposed to move filesystems over to the '->iterate_shared()' function that only takes the inode lock for reading many many years ago, but several filesystems still use the bad old '->iterate()' that takes the inode lock for exclusive access. See commit 6192269444eb ("introduce a parallel variant of ->iterate()") that also added some documentation stating Old method is only used if the new one is absent; eventually it will be removed. Switch while you still can; the old one won't stay. and that was back in April 2016. Here we are, many years later, and the old version is still clearly sadly alive and well. Now, some of those old style iterators are probably just because the filesystem may end up having per-inode mutable data that it uses for iterating a directory, but at least one case is just a mistake. Al switched over most filesystems to use '->iterate_shared()' back when it was introduced. In particular, the /proc filesystem was converted as one of the first ones in commit f50752eaa0b0 ("switch all procfs directories ->iterate_shared()"). But then later one new user of '->iterate()' was then re-introduced by commit 6d9c939dbe4d ("procfs: add smack subdir to attrs"). And that's clearly not what we wanted, since that new case just uses the same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions that other /proc pident directories use, and they are most definitely safe to use with the inode lock held shared. So just fix it. This still leaves a fair number of oddball filesystems using the old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2, overlayfs, and vboxsf), but at least we don't have any remaining in the core filesystems. I'm going to add a wrapper function that just drops the read-lock and takes it as a write lock, so that we can clean up the core vfs layer and make all the ugly 'this filesystem needs exclusive inode locking' be just filesystem-internal warts. I just didn't want to make that conversion when we still had a core user left. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-06 | open: make RESOLVE_CACHED correctly test for O_TMPFILE | Aleksa Sarai
O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old fast-path check for RESOLVE_CACHED would reject all users passing O_DIRECTORY with -EAGAIN, when in fact the intended test was to check for __O_TMPFILE. Cc: stable@vger.kernel.org # v5.12+ Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
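The problem is purely a bitmask one, as the small standalone demo below illustrates (the flag values are made up for the example; the real ones are architecture specific):

#include <stdio.h>

#define X_O_DIRECTORY  0x010000u
#define X___O_TMPFILE  0x400000u
#define X_O_TMPFILE    (X___O_TMPFILE | X_O_DIRECTORY)  /* composite, like the uapi macro */

int main(void)
{
    unsigned int flags = X_O_DIRECTORY;  /* caller only wants a directory open */

    /* old fast-path check: also fires for plain O_DIRECTORY, giving a spurious -EAGAIN */
    printf("old check rejects: %s\n", (flags & X_O_TMPFILE) ? "yes" : "no");

    /* fixed check: tests only the tmpfile bit */
    printf("new check rejects: %s\n", (flags & X___O_TMPFILE) ? "yes" : "no");
    return 0;
}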
2023-08-06 | net: omit ndo_hwtstamp_get() call when possible in dev_set_hwtstamp_phylib() | Vladimir Oltean
Setting dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS is only legal for drivers which were converted to ndo_hwtstamp_get() and ndo_hwtstamp_set(), and it is only there that we call ndo_hwtstamp_set() for a request that otherwise goes to phylib (for stuff like packet traps, which need to be undone if phylib failed, hence the old_cfg logic). The problem is that we end up calling ndo_hwtstamp_get() when we don't need to (even if the SIOCSHWTSTAMP wasn't intended for phylib, or if it was, but the driver didn't set IFF_SEE_ALL_HWTSTAMP_REQUESTS). For those unnecessary conditions, we share a code path with virtual drivers (vlan, macvlan, bonding) where ndo_hwtstamp_get() is implemented as generic_hwtstamp_get_lower(), and may be resolved through generic_hwtstamp_ioctl_lower() if the lower device is unconverted. I.e. this situation: $ ip link add link eno0 name eno0.100 type vlan id 100 $ hwstamp_ctl -i eno0.100 -t 1 We are unprepared to deal with this, because if ndo_hwtstamp_get() is resolved through a legacy ndo_eth_ioctl(SIOCGHWTSTAMP) lower_dev implementation, that needs a non-NULL old_cfg.ifr pointer, and we don't have it. But we don't even need to deal with it either. In the general case, drivers may not even implement SIOCGHWTSTAMP handling, only SIOCSHWTSTAMP, so it makes sense to completely avoid a SIOCGHWTSTAMP call if we can. The solution is to split the single "if" condition into 3 smaller ones, thus separating the decision to call ndo_hwtstamp_get() from the decision to call ndo_hwtstamp_set(). The third "if" condition is identical to the first one, and both are subsets of the second one. Thus, the "cfg" argument of kernel_hwtstamp_config_changed() is always valid. Reported-by: Eric Dumazet <edumazet@google.com> Closes: https://lore.kernel.org/netdev/CANn89iLOspJsvjPj+y8jikg7erXDomWe8sqHMdfL_2LQSFrPAg@mail.gmail.com/ Fixes: fd770e856e22 ("net: remove phy_has_hwtstamp() -> phy_mii_ioctl() decision from converted drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address | Yang Yingliang
Use eth_broadcast_addr() to assign broadcast address instead of memset(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
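For reference, the helper is a drop-in replacement for the open-coded memset (sketch):

#include <linux/etherdevice.h>

static void set_bcast(u8 addr[ETH_ALEN])
{
    /* before: memset(addr, 0xff, ETH_ALEN); */
    eth_broadcast_addr(addr);   /* same effect, self-describing */
}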
2023-08-06 | ibmvnic: remove unused rc variable | Yu Liao
gcc with W=1 reports drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable] ^ This variable is not used so remove it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/ Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | macsec: use DEV_STATS_INC() | Eric Dumazet
syzbot/KCSAN reported data-races in macsec whenever dev->stats fields are updated. It appears all of these updates can happen from multiple cpus. Adopt SMP safe DEV_STATS_INC() to update dev->stats fields. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
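The conversion is mechanical; roughly (a sketch, with the stats field chosen only for illustration):

#include <linux/netdevice.h>

static void count_rx_drop(struct net_device *dev)
{
    /* before: dev->stats.rx_dropped++;  (plain read-modify-write, racy across CPUs) */
    DEV_STATS_INC(dev, rx_dropped);      /* SMP-safe helper from netdevice.h */
}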
2023-08-06 | net: mana: Add page pool for RX buffers | Haiyang Zhang
Add page pool for RX buffers for faster buffer cycle and reduce CPU usage. The standard page pool API is used. With iperf and 128 threads test, this patch improved the throughput by 12-15%, and decreased the IRQ associated CPU's usage from 99-100% to 10-50%. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | Merge branch 'gve-desc' | David S. Miller
Rushil Gupta says:

====================
gve: Add QPL mode for DQO descriptor format

GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Adding this mode to DQO.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | gve: update gve.rst | Rushil Gupta
Add a note about QPL and RDA mode Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | gve: RX path for DQO-QPL | Rushil Gupta
The RX path allocates the QPL page pool at queue creation, and tries to reuse these pages through page recycling. This patch ensures that on refill no non-QPL pages are posted to the device. When the driver is running low on free buffers, an ondemand allocation step kicks in that allocates a non-qpl page for SKB business to free up the QPL page in use. gve_try_recycle_buf was moved to gve_rx_append_frags so that driver does not attempt to mark buffer as used if a non-qpl page was allocated ondemand. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | gve: Tx path for DQO-QPL | Rushil Gupta
Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers. When a packet needs to be transmitted, we break the packet into max GVE_TX_BUF_SIZE_DQO sized chunks and transmit each chunk using a TX descriptor. We allocate the TX buffers from the free list in dqo_tx. We store these TX buffer indices in an array in the pending_packet structure. The TX buffers are returned to the free list in dqo_compl after receiving packet completion or when removing packets from miss completions list. Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | gve: Control path for DQO-QPL | Rushil Gupta
GVE supports QPL ("queue-page-list") mode where all data is communicated through a set of pre-registered pages. Adding this mode to DQO descriptor format. Add checks, abi-changes and device options to support QPL mode for DQO in addition to GQI. Also, use pages-per-qpl supplied by device-option to control the size of the "queue-page-list". Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | net: tls: avoid discarding data on record close | Jakub Kicinski
TLS records end with a 16B tag. For TLS device offload we only need to make space for this tag in the stream, the device will generate and replace it with the actual calculated tag. Long time ago the code would just re-reference the head frag which mostly worked but was suboptimal because it prevented TCP from combining the record into a single skb frag. I'm not sure if it was correct as the first frag may be shorter than the tag. The commit under fixes tried to replace that with using the page frag and if the allocation failed rolling back the data, if record was long enough. It achieves better fragment coalescing but is also buggy. We don't roll back the iterator, so unless we're at the end of send we'll skip the data we designated as tag and start the next record as if the rollback never happened. There's also the possibility that the record was constructed with MSG_MORE and the data came from a different syscall and we already told the user space that we "got it". Allocate a single dummy page and use it as fallback. Found by code inspection, and proven by forcing allocation failures. Fixes: e7b159a48ba6 ("net/tls: remove the record tail optimization") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | Merge branch 'tcp-options-lockless' | David S. Miller
Eric Dumazet says:

====================
tcp: set few options locklessly

This series is avoiding the socket lock for six TCP options. They are not heavily used, but this exercise can give ideas for other parts of TCP/IP stack :)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_DEFER_ACCEPT locklessly | Eric Dumazet
rskq_defer_accept field can be read/written without the need of holding the socket lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_LINGER2 locklessly | Eric Dumazet
tp->linger2 can be set locklessly as long as readers use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_KEEPCNT locklessly | Eric Dumazet
tp->keepalive_probes can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_KEEPINTVL locklessly | Eric Dumazet
tp->keepalive_intvl can be set locklessly, readers are already taking care of this field being potentially set by other threads. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_USER_TIMEOUT locklessly | Eric Dumazet
icsk->icsk_user_timeout can be set locklessly, if all read sides use READ_ONCE(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 | tcp: set TCP_SYNCNT locklessly | Eric Dumazet
icsk->icsk_syn_retries can safely be set without locking the socket. We have to add READ_ONCE() annotations in tcp_fastopen_synack_timer() and tcp_write_timeout(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
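The pattern used across this series looks roughly like this (a sketch; the exact call sites differ per option):

#include <net/tcp.h>

/* setter, reachable from do_tcp_setsockopt() without lock_sock() */
static int set_syncnt(struct sock *sk, int val)
{
    if (val < 1 || val > MAX_TCP_SYNCNT)
        return -EINVAL;
    WRITE_ONCE(inet_csk(sk)->icsk_syn_retries, val);
    return 0;
}

/* any reader (e.g. the SYN retransmit timers) pairs with READ_ONCE() */
static u8 get_syncnt(const struct sock *sk)
{
    return READ_ONCE(inet_csk(sk)->icsk_syn_retries);
}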
2023-08-05 | Merge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux | Linus Torvalds
Pull rust fixes from Miguel Ojeda:
- Allocator: prevent mis-aligned allocation
- Types: delete 'ForeignOwnable::borrow_mut'. A sound replacement is planned for the merge window
- Build: fix bindgen error with UBSAN_BOUNDS_STRICT

* tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux:
  rust: fix bindgen build error with UBSAN_BOUNDS_STRICT
  rust: delete `ForeignOwnable::borrow_mut`
  rust: allocator: Prevent mis-aligned allocation
2023-08-05 | ksmbd: fix wrong next length validation of ea buffer in smb2_set_ea() | Namjae Jeon
There are multiple smb2_ea_info buffers in a FILE_FULL_EA_INFORMATION request from the client. ksmbd finds the next smb2_ea_info using ->NextEntryOffset of the current smb2_ea_info. ksmbd needs to validate the buffer length before accessing the next EA, and it should check the buffer length using buf_len, not the next variable; next is the start offset of the current EA obtained from the previous EA. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21598 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-05 | ksmbd: validate command request size | Long Li
In commit 2b9b8f3b68ed ("ksmbd: validate command payload size"), the request size is checked only for the SMB2_OPLOCK_BREAK_HE command; the request size of the other commands is not checked, which is not what was intended. Fix it by adding a request size check for the other commands as well. Cc: stable@vger.kernel.org Fixes: 2b9b8f3b68ed ("ksmbd: validate command payload size") Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Long Li <leo.lilong@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-05 | Merge tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata | Linus Torvalds
Pull ata fix from Damien Le Moal:
- Prevent the scsi disk driver from issuing a START STOP UNIT command for ATA devices during system resume as this causes various issues reported by multiple users.

* tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata,scsi: do not issue START STOP UNIT on resume
2023-08-05 | zram: take device and not only bvec offset into account | Christoph Hellwig
Commit af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request") changed the bio iteration in zram to rely on the implicit capping to page boundaries in bio_for_each_segment. But it failed to care for the fact zram not only care about the page alignment of the bio payload, but also the page alignment into the device. For buffered I/O and swap those are the same, but for direct I/O or kernel internal I/O like XFS log buffer writes they can differ. Fix this by open coding bio_for_each_segment and limiting the bvec len so that it never crosses over a page alignment boundary in the device in addition to the payload boundary already taken care of by bio_iter_iovec. Cc: stable@vger.kernel.org Fixes: af8b04c63708 ("zram: simplify bvec iteration in __zram_make_request") Reported-by: Dusty Mabe <dusty@dustymabe.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org> Link: https://lore.kernel.org/r/20230805055537.147835-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
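The idea, roughly (a sketch of the iteration shape, not the literal zram code): clamp every segment so that it never spans a device-side page boundary before handling it.

    while (bio->bi_iter.bi_size) {
        struct bio_vec bv = bio_iter_iovec(bio, bio->bi_iter);
        u64 dev_off = bio->bi_iter.bi_sector << SECTOR_SHIFT;
        u32 room = PAGE_SIZE - offset_in_page(dev_off);

        /* the payload may be page aligned while the device offset is not */
        bv.bv_len = min_t(u32, bv.bv_len, room);

        /* ... handle bv against the zram page containing dev_off ... */

        bio_advance_iter_single(bio, &bio->bi_iter, bv.bv_len);
    }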
2023-08-05 | Merge tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull smb client fix from Steve French:
- Fix DFS interlink problem (different namespace)

* tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix dfs link mount against w2k8
2023-08-05 | Merge tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux | Linus Torvalds
Pull powerpc fixes from Michael Ellerman:
- Fix vmemmap altmap boundary check which could cause memory hotunplug failure
- Create a dummy stackframe to fix ftrace stack unwind
- Fix secondary thread bringup for Book3E ELFv2 kernels
- Use early_ioremap/unmap() in via_calibrate_decr()

Thanks to Aneesh Kumar K.V, Benjamin Gray, Christophe Leroy, David Hildenbrand, and Naveen N Rao.

* tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powermac: Use early_* IO variants in via_calibrate_decr()
  powerpc/64e: Fix secondary thread bringup for ELFv2 kernels
  powerpc/ftrace: Create a dummy stackframe to fix stack unwind
  powerpc/mm/altmap: Fix altmap boundary check
2023-08-05 | Merge tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux | Linus Torvalds
Pull parisc architecture fixes from Helge Deller:
- early fixmap preallocation to fix boot failures on kernel >= 6.4
- remove DMA leftover code in parport_gsc
- drop old comments and code style fixes

* tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: unaligned: Add required spaces after ','
  parport: gsc: remove DMA leftover code
  parisc: pci-dma: remove unused and dead EISA code and comment
  parisc/mm: preallocate fixmap page tables at init
2023-08-05 | Merge tag 'icc-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus | Greg Kroah-Hartman
Georgi writes:

interconnect fixes for v6.5-rc

This contains a fix for a potential issue on some Qualcomm SoCs where bit-masks should have been used to configure the Bus Clock Manager hardware, instead of bandwidth units.
- interconnect: qcom: Add support for mask-based BCMs
- interconnect: qcom: sm8450: add enable_mask for bcm nodes
- interconnect: qcom: sm8550: add enable_mask for bcm nodes
- interconnect: qcom: sa8775p: add enable_mask for bcm nodes

Signed-off-by: Georgi Djakov <djakov@kernel.org>

* tag 'icc-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
  interconnect: qcom: sa8775p: add enable_mask for bcm nodes
  interconnect: qcom: sm8550: add enable_mask for bcm nodes
  interconnect: qcom: sm8450: add enable_mask for bcm nodes
  interconnect: qcom: Add support for mask-based BCMs
2023-08-04 | Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux | Linus Torvalds
Pull clk fixes from Stephen Boyd:
"A few clk driver fixes for some SoC clk drivers:
- Change a usleep() to udelay() to avoid scheduling while atomic in the Amlogic PLL code
- Revert a patch to the Mediatek MT8183 driver that caused an out-of-bounds write
- Return the right error value when devm_of_iomap() fails in imx93_clocks_probe()
- Constrain the Kconfig for the fixed mmio clk so that it depends on HAS_IOMEM and can't be compiled on architectures such as s390"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
  clk: imx93: Propagate correct error in imx93_clocks_probe()
  clk: mediatek: mt8183: Add back SSPM related clocks
  clk: meson: change usleep_range() to udelay() for atomic context
2023-08-04 | Merge tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next | Jakub Kicinski
Kalle Valo says:

====================
wireless-next patches for v6.6

The first pull request for v6.6 and only driver patches this time. Nothing special really standing out, it has been quiet most likely due to vacations.

Major changes:

rtl8xxxu
- enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU

mwifiex
- allow moving to a different namespace

mt76
- preparation for mt7925 support
- mt7981 support

ath12k
- Extremely High Throughput (EHT) PHY support for Wi-Fi 7
====================

* tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (172 commits)
  wifi: rtw89: return failure if needed firmware elements are not recognized
  wifi: rtw89: add to parse firmware elements of BB and RF tables
  wifi: rtw89: introduce infrastructure of firmware elements
  wifi: rtw89: add firmware suit for BB MCU 0/1
  wifi: rtw89: add firmware parser for v1 format
  wifi: rtw89: introduce v1 format of firmware header
  wifi: rtw89: support firmware log with formatted text
  wifi: rtw89: recognize log format from firmware file
  wifi: ath12k: avoid deadlock by change ieee80211_queue_work for regd_update_work
  wifi: ath12k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
  wifi: ath12k: relax list iteration in ath12k_mac_vif_unref()
  wifi: ath12k: configure puncturing bitmap
  wifi: ath12k: parse WMI service ready ext2 event
  wifi: ath12k: add MLO header in peer association
  wifi: ath12k: peer assoc for 320 MHz
  wifi: ath12k: add WMI support for EHT peer
  wifi: ath12k: prepare EHT peer assoc parameters
  wifi: ath12k: add EHT PHY modes
  wifi: ath12k: propagate EHT capabilities to userspace
  wifi: ath12k: WMI support to process EHT capabilities
  ...
====================

Link: https://lore.kernel.org/r/87msz7j942.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 | Merge branch 'tcp-disable-header-prediction-for-md5' | Jakub Kicinski
Kuniyuki Iwashima says:

====================
tcp: Disable header prediction for MD5.

The 1st patch disables header prediction for MD5 flow and the 2nd patch updates the stale comment in tcp_parse_options().
====================

Link: https://lore.kernel.org/r/20230803224552.69398-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 | tcp: Update stale comment for MD5 in tcp_parse_options(). | Kuniyuki Iwashima
Since commit 9ea88a153001 ("tcp: md5: check md5 signature without socket lock"), the MD5 option is checked in tcp_v[46]_rcv(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 | tcp: Disable header prediction for MD5 flow. | Kuniyuki Iwashima
TCP socket saves the minimum required header length in tcp_header_len of struct tcp_sock, and later the value is used in __tcp_fast_path_on() to generate a part of TCP header in tcp_sock(sk)->pred_flags. In tcp_rcv_established(), if the incoming packet has the same pattern with pred_flags, we enter the fast path and skip full option parsing. The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it again later in tcp_rcv_established() unless other options exist. We add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in two paths to avoid the slow path. For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in tcp_create_openreq_child() after 3WHS. On the other hand, we do it in tcp_connect_init() for active open connections. However, the value is overwritten while processing SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process(). These two cases will have the wrong value in pred_flags and never go into the fast path. We could update tcp_header_len in tcp_rcv_synsent_state_process(), but a test with slightly modified netperf which uses MD5 for each flow shows that the slow path is actually a bit faster than the fast path. On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem) $ for i in {1..10}; do ./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256; done Avg of 10 * 36e68eadd303 : 10.376 Gbps * all fast path : 10.374 Gbps (patch v2, See Link) * all slow path : 10.394 Gbps The header prediction is not worth adding complexity for MD5, so let's disable it for MD5. Link: https://lore.kernel.org/netdev/20230803042214.38309-1-kuniyu@amazon.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803224552.69398-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
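For context, the prediction word that tcp_header_len feeds into is built roughly like this (a sketch of __tcp_fast_path_on(); the exact bit layout is as recalled, not quoted):

static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
{
    /* the expected header length lands in the data-offset bits, combined
     * with the ACK flag and the advertised window; incoming headers that
     * match this word take the fast path and skip full option parsing. */
    tp->pred_flags = htonl((tp->tcp_header_len << 26) |
                           ntohl(TCP_FLAG_ACK) |
                           snd_wnd);
}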
2023-08-04 | dccp: fix data-race around dp->dccps_mss_cache | Eric Dumazet
dccp_sendmsg() reads dp->dccps_mss_cache before locking the socket. Same thing in do_dccp_getsockopt(). Add READ_ONCE()/WRITE_ONCE() annotations, and change dccp_sendmsg() to check again dccps_mss_cache after socket is locked. Fixes: 7c657876b63c ("[DCCP]: Initial implementation") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230803163021.2958262-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
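The annotation pattern, roughly (a sketch; the recheck under the socket lock is the part specific to dccp_sendmsg()):

/* writer side, under the socket lock:
 *     WRITE_ONCE(dp->dccps_mss_cache, new_mss);
 */
static int check_size(struct sock *sk, struct dccp_sock *dp, size_t len)
{
    /* lockless peek before lock_sock(); pairs with the writer's WRITE_ONCE() */
    if (len > READ_ONCE(dp->dccps_mss_cache))
        return -EMSGSIZE;

    lock_sock(sk);
    /* re-check: the value may have changed before we got the lock */
    if (len > dp->dccps_mss_cache) {
        release_sock(sk);
        return -EMSGSIZE;
    }
    return 0;   /* caller continues with the lock held */
}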
2023-08-04 | Merge branch 'mptcp-more-fixes-for-v6-5' | Jakub Kicinski
Matthieu Baerts says:

====================
mptcp: more fixes for v6.5

Here is a new batch of fixes related to MPTCP for v6.5 and older.

Patches 1 and 2 fix issues with the MPTCP Join selftest when manually launched with the '-i' parameter to use the 'ip mptcp' tool instead of the dedicated one (pm_nl_ctl). The issues have been there since v5.18. Thank you Andrea for your first contributions to MPTCP code in the upstream kernel!

Patch 3 avoids corrupting the data stream when trying to reset connections that have fallen back to TCP. This can happen from v6.1.

Patch 4 fixes a race when doing a disconnect() and an accept() in parallel on a listener socket. The issue only happens in rare cases if the user is really unlucky, because of a fix that landed in v6.3 but was backported up to v6.1.
====================

Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-0-6671b1ab11cc@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 | mptcp: fix disconnect vs accept race | Paolo Abeni
Despite commit 0ad529d9fd2b ("mptcp: fix possible divide by zero in recvmsg()"), the mptcp protocol is still prone to a race between disconnect() (or shutdown) and accept. The root cause is that the mentioned commit checks the msk-level flag, but mptcp_stream_accept() does not acquire the msk-level lock, as it can rely directly on the first subflow lock. As reported by Christoph, that can lead to a race where an msk socket is accepted after mptcp_subflow_queue_clean() releases the listener socket lock and just before it takes destructive actions, leading to the following splat:

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 5a4ca067 P4D 5a4ca067 PUD 37d4c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 10955 Comm: syz-executor.5 Not tainted 6.5.0-rc1-gdc7b257ee5dd #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:mptcp_stream_accept+0x1ee/0x2f0 include/net/inet_sock.h:330
Code: 0a 09 00 48 8b 1b 4c 39 e3 74 07 e8 bc 7c 7f fe eb a1 e8 b5 7c 7f fe 4c 8b 6c 24 08 eb 05 e8 a9 7c 7f fe 49 8b 85 d8 09 00 00 <0f> b6 40 12 88 44 24 07 0f b6 6c 24 07 bf 07 00 00 00 89 ee e8 89
RSP: 0018:ffffc90000d07dc0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888037e8d020 RCX: ffff88803b093300
RDX: 0000000000000000 RSI: ffffffff833822c5 RDI: ffffffff8333896a
RBP: 0000607f82031520 R08: ffff88803b093300 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000003e83 R12: ffff888037e8d020
R13: ffff888037e8c680 R14: ffff888009af7900 R15: ffff888009af6880
FS:  00007fc26d708640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 0000000066bc5001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_accept+0x1ae/0x260 net/socket.c:1872
 __sys_accept4+0x9b/0x110 net/socket.c:1913
 __do_sys_accept4 net/socket.c:1954 [inline]
 __se_sys_accept4 net/socket.c:1951 [inline]
 __x64_sys_accept4+0x20/0x30 net/socket.c:1951
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Address the issue by temporarily removing the pending request sockets from the accept queue, so that a racing accept() can't touch them. After depleting the msk, the ssks still exist as plain TCP sockets; re-insert them into the accept queue, so that later inet_csk_listen_stop() will complete the tcp socket disposal. Fixes: 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/423 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-4-6671b1ab11cc@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 | mptcp: avoid bogus reset on fallback close | Paolo Abeni
Since the blamed commit, the MPTCP protocol unconditionally sends TCP resets on all the subflows on disconnect(). That fits full-blown MPTCP sockets - to implement the fastclose mechanism - but causes unexpected corruption of the data stream, caught as sporadic self-tests failures. Fixes: d21f83485518 ("mptcp: use fastclose on more edge scenarios") Cc: stable@vger.kernel.org Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/419 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-3-6671b1ab11cc@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>