summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-08dmaengine: at_hdmac: Fix concurrency over the active listTudor Ambarus
The tasklet (atc_advance_work()) did not held the channel lock when retrieving the first active descriptor, causing concurrency problems if issue_pending() was called in between. If issue_pending() was called exactly after the lock was released in the tasklet (atc_advance_work()), atc_chain_complete() could complete a descriptor for which the controller has not yet raised an interrupt. Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-11-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Free the memset buf without holding the chan lockTudor Ambarus
There's no need to hold the channel lock when freeing the memset buf, as the operation has already completed. Free the memset buf without holding the channel lock. Fixes: 4d112426c344 ("dmaengine: hdmac: Add memset capabilities") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-10-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Fix concurrency over descriptorTudor Ambarus
The descriptor was added to the free_list before calling the callback, which could result in reissuing of the same descriptor and calling of a single callback for both. Move the decriptor to the free list after the callback is invoked. Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-9-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Fix concurrency problems by removing atc_complete_all()Tudor Ambarus
atc_complete_all() had concurrency bugs, thus remove it: 1/ atc_complete_all() in its entirety was buggy, as when the atchan->queue list (the one that contains descriptors that are not yet issued to the hardware) contained descriptors, it fired just the first from the atchan->queue, but moved all the desc from atchan->queue to atchan->active_list and considered them all as fired. This could result in calling the completion of a descriptor that was not yet issued to the hardware. 2/ when in tasklet at atc_advance_work() time, atchan->active_list was queried without holding the lock of the chan. This can result in atchan->active_list concurrency problems between the tasklet and issue_pending(). Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-8-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Protect atchan->status with the channel lockTudor Ambarus
Now that the complete callback call was removed from device_terminate_all(), we can protect the atchan->status with the channel lock. The atomic bitops on atchan->status do not substitute proper locking on the status, as one could still modify the status after the lock was dropped in atc_terminate_all() but before the atomic bitops were executed. Fixes: 078a6506141a ("dmaengine: at_hdmac: Fix deadlocks") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-7-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Do not call the complete callback on device_terminate_allTudor Ambarus
The method was wrong because it violated the dmaengine API. For aborted transfers the complete callback should not be called. Fix the behavior and do not call the complete callback on device_terminate_all. Fixes: 808347f6a317 ("dmaengine: at_hdmac: add DMA slave transfers") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-6-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Fix premature completion of desc in issue_pendingTudor Ambarus
Multiple calls to atc_issue_pending() could result in a premature completion of a descriptor from the atchan->active list, as the method always completed the first active descriptor from the list. Instead, issue_pending() should just take the first transaction descriptor from the pending queue, move it to active_list and start the transfer. Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-5-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Start transfer for cyclic channels in issue_pendingTudor Ambarus
Cyclic channels must too call issue_pending in order to start a transfer. Start the transfer in issue_pending regardless of the type of channel. This wrongly worked before, because in the past the transfer was started at tx_submit level when only a desc in the transfer list. Fixes: 53830cc75974 ("dmaengine: at_hdmac: add cyclic DMA operation support") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-4-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Don't start transactions at tx_submit levelTudor Ambarus
tx_submit is supposed to push the current transaction descriptor to a pending queue, waiting for issue_pending() to be called. issue_pending() must start the transfer, not tx_submit(), thus remove atc_dostart() from atc_tx_submit(). Clients of at_xdmac that assume that tx_submit() starts the transfer must be updated and call dma_async_issue_pending() if they miss to call it. The vdbg print was moved to after the lock is released. It is desirable to do the prints without the lock held if possible, and because the if statement disappears there's no reason why to do the print while holding the lock. Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Reported-by: Peter Rosin <peda@axentia.se> Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/13c6c9a2-6db5-c3bf-349b-4c127ad3496a@axentia.se/ Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-3-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: at_hdmac: Fix at_lli struct definitionTudor Ambarus
Those hardware registers are all of 32 bits, while dma_addr_t ca be of type u64 or u32 depending on CONFIG_ARCH_DMA_ADDR_T_64BIT. Force u32 to comply with what the hardware expects. Fixes: dc78baa2b90b ("dmaengine: at_hdmac: new driver for the Atmel AHB DMA Controller") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Cc: stable@vger.kernel.org Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20221025090306.297886-1-tudor.ambarus@microchip.com Link: https://lore.kernel.org/r/20221025090306.297886-2-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: stm32-dma: fix potential race between pause and resumeAmelie Delaunay
When disabling dma channel, a TCF flag is set and as TCIE is enabled, an interrupt is raised. On a busy system, the interrupt may have latency and the user can ask for dmaengine_resume while stm32-dma driver has not yet managed the complete pause (backup of registers to restore state in resume). To avoid such a case, instead of waiting the interrupt to backup the registers, do it just after disabling the channel and discard Transfer Complete interrupt in case the channel is paused. Fixes: 099a9a94be0e ("dmaengine: stm32-dma: add device_pause/device_resume support") Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Link: https://lore.kernel.org/r/20221024083611.132588-1-amelie.delaunay@foss.st.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: ti: k3-udma-glue: fix memory leak when register device failYang Yingliang
If device_register() fails, it should call put_device() to give up reference, the name allocated in dev_set_name() can be freed in callback function kobject_cleanup(). Fixes: 5b65781d06ea ("dmaengine: ti: k3-udma-glue: Add support for K3 PKTDMA") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com> Link: https://lore.kernel.org/r/20221020062827.2914148-1-yangyingliang@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: mv_xor_v2: Fix a resource leak in mv_xor_v2_remove()Christophe JAILLET
A clk_prepare_enable() call in the probe is not balanced by a corresponding clk_disable_unprepare() in the remove function. Add the missing call. Fixes: 3cd2c313f1d6 ("dmaengine: mv_xor_v2: Fix clock resource by adding a register clock") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/e9e3837a680c9bd2438e4db2b83270c6c052d005.1666640987.git.christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: apple-admac: Fix grabbing of channels in of_xlateMartin Povišer
The of_xlate callback is supposed to return the channel after already having 'grabbed' it for private use, so fill that in. Fixes: b127315d9a78 ("dmaengine: apple-admac: Add Apple ADMAC driver") Signed-off-by: Martin Povišer <povik+lin@cutebit.org> Link: https://lore.kernel.org/r/20221019132324.8585-1-povik+lin@cutebit.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: idxd: fix RO device state error after been disabled/resetFengqian Gao
When IDXD is not configurable, that means its WQ, engine, and group configurations cannot be changed. But it can be disabled and its state should be set as disabled regardless it's configurable or not. Fix this by setting device state IDXD_DEV_DISABLED for read-only device as well in idxd_device_clear_state(). Fixes: cf4ac3fef338 ("dmaengine: idxd: fix lockdep warning on device driver removal") Signed-off-by: Fengqian Gao <fengqian.gao@intel.com> Reviewed-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930032835.2290-1-fengqian.gao@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: idxd: Fix max batch size for Intel IAAXiaochen Shen
>From Intel IAA spec [1], Intel IAA does not support batch processing. Two batch related default values for IAA are incorrect in current code: (1) The max batch size of device is set during device initialization, that indicates batch is supported. It should be always 0 on IAA. (2) The max batch size of work queue is set to WQ_DEFAULT_MAX_BATCH (32) as the default value regardless of Intel DSA or IAA device during work queue setup and cleanup. It should be always 0 on IAA. Fix the issues by setting the max batch size of device and max batch size of work queue to 0 on IAA device, that means batch is not supported. [1]: https://cdrdv2.intel.com/v1/dl/getContent/721858 Fixes: 23084545dbb0 ("dmaengine: idxd: set max_xfer and max_batch for RO device") Fixes: 92452a72ebdf ("dmaengine: idxd: set defaults for wq configs") Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20220930201528.18621-2-xiaochen.shen@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-08dmaengine: pxa_dma: use platform_get_irq_optionalDoug Brown
The first IRQ is required, but IRQs 1 through (nb_phy_chans - 1) are optional, because on some platforms (e.g. PXA168) there is a single IRQ shared between all channels. This change inhibits a flood of "IRQ index # not found" messages at startup. Tested on a PXA168-based device. Fixes: 7723f4c5ecdb ("driver core: platform: Add an error message to platform_get_irq*()") Signed-off-by: Doug Brown <doug@schmorgal.com> Link: https://lore.kernel.org/r/20220906000709.52705-1-doug@schmorgal.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-11-07Merge branch 'sctp-fix-a-null-pointer-dereference-in-sctp_sched_dequeue_common'Jakub Kicinski
Xin Long says: ==================== sctp: fix a NULL pointer dereference in sctp_sched_dequeue_common This issue was triggered with SCTP_PR_SCTP_PRIO in sctp, and caused by not checking and fixing stream->out_curr after removing a chunk from this stream. Patch 1 removes an unnecessary check and makes the real fix easier to add in Patch 2. ==================== Link: https://lore.kernel.org/r/cover.1667598261.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07sctp: clear out_curr if all frag chunks of current msg are prunedXin Long
A crash was reported by Zhen Chen: list_del corruption, ffffa035ddf01c18->next is NULL WARNING: CPU: 1 PID: 250682 at lib/list_debug.c:49 __list_del_entry_valid+0x59/0xe0 RIP: 0010:__list_del_entry_valid+0x59/0xe0 Call Trace: sctp_sched_dequeue_common+0x17/0x70 [sctp] sctp_sched_fcfs_dequeue+0x37/0x50 [sctp] sctp_outq_flush_data+0x85/0x360 [sctp] sctp_outq_uncork+0x77/0xa0 [sctp] sctp_cmd_interpreter.constprop.0+0x164/0x1450 [sctp] sctp_side_effects+0x37/0xe0 [sctp] sctp_do_sm+0xd0/0x230 [sctp] sctp_primitive_SEND+0x2f/0x40 [sctp] sctp_sendmsg_to_asoc+0x3fa/0x5c0 [sctp] sctp_sendmsg+0x3d5/0x440 [sctp] sock_sendmsg+0x5b/0x70 and in sctp_sched_fcfs_dequeue() it dequeued a chunk from stream out_curr outq while this outq was empty. Normally stream->out_curr must be set to NULL once all frag chunks of current msg are dequeued, as we can see in sctp_sched_dequeue_done(). However, in sctp_prsctp_prune_unsent() as it is not a proper dequeue, sctp_sched_dequeue_done() is not called to do this. This patch is to fix it by simply setting out_curr to NULL when the last frag chunk of current msg is dequeued from out_curr stream in sctp_prsctp_prune_unsent(). Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: Zhen Chen <chenzhen126@huawei.com> Tested-by: Caowangbao <caowangbao@huawei.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07sctp: remove the unnecessary sinfo_stream check in sctp_prsctp_prune_unsentXin Long
Since commit 5bbbbe32a431 ("sctp: introduce stream scheduler foundations"), sctp_stream_outq_migrate() has been called in sctp_stream_init/update to removes those chunks to streams higher than the new max. There is no longer need to do such check in sctp_prsctp_prune_unsent(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07tipc: fix the msg->req tlv len check in tipc_nl_compat_name_table_dump_headerXin Long
This is a follow-up for commit 974cb0e3e7c9 ("tipc: fix uninit-value in tipc_nl_compat_name_table_dump") where it should have type casted sizeof(..) to int to work when TLV_GET_DATA_LEN() returns a negative value. syzbot reported a call trace because of it: BUG: KMSAN: uninit-value in ... tipc_nl_compat_name_table_dump+0x841/0xea0 net/tipc/netlink_compat.c:934 __tipc_nl_compat_dumpit+0xab2/0x1320 net/tipc/netlink_compat.c:238 tipc_nl_compat_dumpit+0x991/0xb50 net/tipc/netlink_compat.c:321 tipc_nl_compat_recv+0xb6e/0x1640 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline] genl_rcv_msg+0x103f/0x1260 net/netlink/genetlink.c:792 netlink_rcv_skb+0x3a5/0x6c0 net/netlink/af_netlink.c:2501 genl_rcv+0x3c/0x50 net/netlink/genetlink.c:803 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0xf3b/0x1270 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x1288/0x1440 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] Reported-by: syzbot+e5dbaaa238680ce206ea@syzkaller.appspotmail.com Fixes: 974cb0e3e7c9 ("tipc: fix uninit-value in tipc_nl_compat_name_table_dump") Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/ccd6a7ea801b15aec092c3b532a883b4c5708695.1667594933.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBCBart Van Assche
From ZBC-1: - RC BASIS = 0: The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a contiguous range of zones that are not sequential write required zones starting with the first zone. - RC BASIS = 1: The RETURNED LOGICAL BLOCK ADDRESS field indicates the LBA of the last logical block on the logical unit. The current scsi_debug READ CAPACITY response does not comply with the above if there are one or more sequential write required zones. SCSI initiators need a way to retrieve the largest valid LBA from SCSI devices. Reporting the largest valid LBA if there are one or more sequential zones requires to set the RC BASIS field in the READ CAPACITY response to one. Hence this patch. Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221102193248.3177608-1-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-07net: broadcom: Fix BCMGENET KconfigYueHaibing
While BCMGENET select BROADCOM_PHY as y, but PTP_1588_CLOCK_OPTIONAL is m, kconfig warning and build errors: WARNING: unmet direct dependencies detected for BROADCOM_PHY Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m] Selected by [y]: - BCMGENET [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_BROADCOM [=y] && HAS_IOMEM [=y] && ARCH_BCM2835 [=y] drivers/net/phy/broadcom.o: In function `bcm54xx_suspend': broadcom.c:(.text+0x6ac): undefined reference to `bcm_ptp_stop' drivers/net/phy/broadcom.o: In function `bcm54xx_phy_probe': broadcom.c:(.text+0x784): undefined reference to `bcm_ptp_probe' drivers/net/phy/broadcom.o: In function `bcm54xx_config_init': broadcom.c:(.text+0xd4c): undefined reference to `bcm_ptp_config_init' Fixes: 99addbe31f55 ("net: broadcom: Select BROADCOM_PHY for BCMGENET") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Florian Fainelli <f.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20221105090245.8508-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07octeontx2-pf: fix build error when CONFIG_OCTEONTX2_PF=yYang Yingliang
If CONFIG_MACSEC=m and CONFIG_OCTEONTX2_PF=y, it leads a build error: ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.o: in function `otx2_pfaf_mbox_up_handler': otx2_pf.c:(.text+0x181c): undefined reference to `cn10k_handle_mcs_event' ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.o: in function `otx2_probe': otx2_pf.c:(.text+0x437e): undefined reference to `cn10k_mcs_init' ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.o: in function `otx2_remove': otx2_pf.c:(.text+0x5031): undefined reference to `cn10k_mcs_free' ld: drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.o: in function `otx2_mbox_up_handler_mcs_intr_notify': otx2_pf.c:(.text+0x5f11): undefined reference to `cn10k_handle_mcs_event' Make CONFIG_OCTEONTX2_PF depends on CONFIG_MACSEC to fix it. Because it has empty stub functions of cn10k, CONFIG_OCTEONTX2_PF can be enabled if CONFIG_MACSEC is disabled Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221105063442.2013981-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07dt-bindings: net: tsnep: Fix typo on generic nvmem propertyMiquel Raynal
While working on the nvmem description I figured out this file had the "nvmem-cell-names" property name misspelled. Fix the typo, as "nvmem-cells-names" has never existed. Fixes: 603094b2cdb7 ("dt-bindings: net: Add tsnep Ethernet controller") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20221104162147.1288230-1-miquel.raynal@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08scsi: scsi_transport_sas: Fix error handling in sas_phy_add()Yang Yingliang
If transport_add_device() fails in sas_phy_add(), the kernel will crash trying to delete the device in transport_remove_device() called from sas_remove_host(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 61 PID: 42829 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc1+ #173 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_phy_delete+0x30/0x60 [scsi_transport_sas] do_sas_phy_delete+0x6c/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x40/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] hisi_sas_remove+0x40/0x68 [hisi_sas_main] hisi_sas_v2_remove+0x20/0x30 [hisi_sas_v2_hw] platform_remove+0x2c/0x60 Fix this by checking and handling return value of transport_add_device() in sas_phy_add(). Fixes: c7ebbbce366c ("[SCSI] SAS transport class") Suggested-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221107124828.115557-1-yangyingliang@huawei.com Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-07net: stmmac: dwmac-meson8b: fix meson8b_devm_clk_prepare_enable()Rasmus Villemoes
There are two problems with meson8b_devm_clk_prepare_enable(), introduced in commit a54dc4a49045 ("net: stmmac: dwmac-meson8b: Make the clock enabling code re-usable"): - It doesn't pass the clk argument, but instead always the rgmii_tx_clk of the device. - It silently ignores the return value of devm_add_action_or_reset(). The former didn't become an actual bug until another user showed up in the next commit 9308c47640d5 ("net: stmmac: dwmac-meson8b: add support for the RX delay configuration"). The latter means the callers could end up with the clock not actually prepared/enabled. Fixes: a54dc4a49045 ("net: stmmac: dwmac-meson8b: Make the clock enabling code re-usable") Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://lore.kernel.org/r/20221104083004.2212520-1-linux@rasmusvillemoes.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07bpf: Add explicit cast to 'void *' for __BPF_DISPATCHER_UPDATE()Nathan Chancellor
When building with clang: kernel/bpf/dispatcher.c:126:33: error: pointer type mismatch ('void *' and 'unsigned int (*)(const void *, const struct bpf_insn *, bpf_func_t)' (aka 'unsigned int (*)(const void *, const struct bpf_insn *, unsigned int (*)(const void *, const struct bpf_insn *))')) [-Werror,-Wpointer-type-mismatch] __BPF_DISPATCHER_UPDATE(d, new ?: &bpf_dispatcher_nop_func); ~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/bpf.h:1045:54: note: expanded from macro '__BPF_DISPATCHER_UPDATE' __static_call_update((_d)->sc_key, (_d)->sc_tramp, (_new)) ^~~~ 1 error generated. The warning is pointing out that the type of new ('void *') and &bpf_dispatcher_nop_func are not compatible, which could have side effects coming out of a conditional operator due to promotion rules. Add the explicit cast to 'void *' to make it clear that this is expected, as __BPF_DISPATCHER_UPDATE() expands to a call to __static_call_update(), which expects a 'void *' as its final argument. Fixes: c86df29d11df ("bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)") Link: https://github.com/ClangBuiltLinux/linux/issues/1755 Reported-by: kernel test robot <lkp@intel.com> Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221107170711.42409-1-nathan@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-11-07fs/userfaultfd: Fix maple tree iterator in userfaultfd_unregister()Liam Howlett
When iterating the VMAs, the maple state needs to be invalidated if the tree is modified by a split or merge to ensure the maple tree node contained in the maple state is still valid. These invalidations were missed, so add them to the paths which alter the tree. Reported-by: syzbot+0d2014e4da2ccced5b41@syzkaller.appspotmail.com Fixes: 69dbe6daf104 (userfaultfd: use maple tree iterator to iterate VMAs) Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-11-07scsi: ibmvfc: Avoid path failures during live migrationBrian King
Fix an issue reported when performing a live migration when multipath is configured with a short fast fail timeout of 5 seconds and also to have no_path_retry set to fail. In this scenario, all paths would go into the devloss state while the ibmvfc driver went through discovery to log back in. On a loaded system, the discovery might take longer than 5 seconds, which was resulting in all paths being marked failed, which then resulted in a read only filesystem. This patch changes the migration code in ibmvfc to avoid deleting rports at all in this scenario, so we avoid losing all paths. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20221026181356.148517-1-brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-07Input: soc_button_array - add Acer Switch V 10 to dmi_use_low_level_irq[]Hans de Goede
Like on the Acer Switch 10 SW5-012, the Acer Switch V 10 SW5-017's _LID method messes with home- and power-button GPIO IRQ settings, causing an IRQ storm. Add a quirk entry for the Acer Switch V 10 to the dmi_use_low_level_irq[] DMI quirk list, to use low-level IRQs on this model, fixing the IRQ storm. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20221106215320.67109-2-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-11-07Input: soc_button_array - add use_low_level_irq module parameterHans de Goede
It seems that the Windows drivers for the ACPI0011 soc_button_array device use low level triggered IRQs rather then using edge triggering. Some ACPI tables depend on this, directly poking the GPIO controller's registers to clear the trigger type when closing a laptop's/2-in-1's lid and re-instating the trigger when opening the lid again. Linux sets the edge/level on which to trigger to both low+high since it is using edge type IRQs, the ACPI tables then ends up also setting the bit for level IRQs and since both low and high level have been selected by Linux we get an IRQ storm leading to soft lockups. As a workaround for this the soc_button_array already contains a DMI quirk table with device models known to have this issue. Add a module parameter for this so that users can easily test if their device is affected too and so that they can use the module parameter as a workaround. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20221106215320.67109-1-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-11-07Input: iforce - invert valid length check when fetching device IDsTetsuo Handa
syzbot is reporting uninitialized value at iforce_init_device() [1], for commit 6ac0aec6b0a6 ("Input: iforce - allow callers supply data buffer when fetching device IDs") is checking that valid length is shorter than bytes to read. Since iforce_get_id_packet() stores valid length when returning 0, the caller needs to check that valid length is longer than or equals to bytes to read. Reported-by: syzbot <syzbot+4dd880c1184280378821@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 6ac0aec6b0a6 ("Input: iforce - allow callers supply data buffer when fetching device IDs") Link: https://lore.kernel.org/r/531fb432-7396-ad37-ecba-3e42e7f56d5c@I-love.SAKURA.ne.jp Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-11-07Merge tag 'platform-drivers-x86-v6.1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "The most important fixes here are a set of fixes for the ACPI backlight detection refactor which landed in 6.1. These fix regressions reported on some laptop models by making acpi_video_backlight_use_native() always return true for now, which in essence undoes some of the changes. I plan to take another shot at having only 1 /sys/class/backlight class device per panel with 6.2, with modified detection heuristics to avoid the (known) regressions. Highlights: - ACPI: video: Fix regressions from 6.1 backlight refactor by making acpi_video_backlight_use_native() always return true for now - Misc other bugfixes and HW id additions" * tag 'platform-drivers-x86-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: p2sb: Don't fail if unknown CPU is found platform/x86/intel/hid: Add some ACPI device IDs platform/x86/intel/pmt: Sapphire Rapids PMT errata fix platform/x86: hp_wmi: Fix rfkill causing soft blocked wifi platform/x86: touchscreen_dmi: Add info for the RCA Cambio W101 v2 2-in-1 platform/x86: ideapad-laptop: Disable touchpad_switch ACPI: video: Add backlight=native DMI quirk for Dell G15 5515 ACPI: video: Make acpi_video_backlight_use_native() always return true ACPI: video: Improve Chromebook checks
2022-11-07mm, slab: remove duplicate kernel-doc comment for ksize()Vlastimil Babka
Akira reports: > "make htmldocs" reports duplicate C declaration of ksize() as follows: > /linux/Documentation/core-api/mm-api:43: ./mm/slab_common.c:1428: WARNING: Duplicate C declaration, also defined at core-api/mm-api:212. > Declaration is '.. c:function:: size_t ksize (const void *objp)'. > This is due to the kernel-doc comment for ksize() declaration added in > include/linux/slab.h by commit 05a940656e1e ("slab: Introduce > kmalloc_size_roundup()"). There is an older kernel-doc comment for ksize() definition in mm/slab_common.c, which is not only duplicated, but also contradicts the new one - the additional storage discovered by ksize() should not be used by callers anymore. Delete the old kernel-doc. Reported-by: Akira Yokosawa <akiyks@gmail.com> Link: https://lore.kernel.org/all/d33440f6-40cf-9747-3340-e54ffaf7afb8@gmail.com/ Fixes: 05a940656e1e ("slab: Introduce kmalloc_size_roundup()") Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-11-07mtd: onenand: omap2: add dependency on GPMCKrzysztof Kozlowski
OMAP2 OneNAND driver uses gpmc_omap_onenand_set_timings() provided by OMAP_GPMC driver, so the latter cannot be module if OneNAND driver is built-in: /usr/bin/arm-linux-gnueabi-ld: drivers/mtd/nand/onenand/onenand_omap2.o: in function `omap2_onenand_probe': onenand_omap2.c:(.text+0x520): undefined reference to `gpmc_omap_onenand_set_timings' The OMAP_GPMC is also a runtime dependency. Reported-by: kernel test robot <lkp@intel.com> Fixes: 854fd9209b20 ("memory: omap-gpmc: Allow building as a module") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221107091520.127053-1-krzysztof.kozlowski@linaro.org
2022-11-07mtd: rawnand: placate "$VARIABLE is used uninitialized" warningsAdam Borowski
The compiler is not smart enough to notice that it's impossible for them to be actually used uninitialized. Which exact variables trip here varies depending on random surrounding code; none triggered in 6.1-rc1 but 6.1-rc2 fails on three of these five, despite variables declared in the very same line having identical flow. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221024092026.42123-1-kilobyte@angband.pl
2022-11-07mtd: rawnand: qcom: handle ret from parse with codeword_fixupChristian Marangi
With use_codeword_fixup enabled, any return from mtd_device_parse_register gets overwritten. Aside from the clear bug, this is also problematic as a parser can EPROBE_DEFER and because this is not correctly handled, the nand is never rescanned later in the bootup process. An example of this problem is when smem requires additional time to be probed and nandc use qcomsmempart as parser. Parser will return EPROBE_DEFER but in the current code this ret gets overwritten by qcom_nand_host_parse_boot_partitions and qcom_nand_host_init_and_register return 0. Correctly handle the return code from mtd_device_parse_register so that any error from this function is not ignored. Fixes: 862bdedd7f4b ("mtd: nand: raw: qcom_nandc: add support for unprotected spare data pages") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221021165304.19991-1-ansuelsmth@gmail.com
2022-11-07drm/panfrost: Remove type name from internal struct againSteven Price
Commit 72655fb942c1 ("drm/panfrost: replace endian-specific types with native ones") accidentally reverted part of the parent commit 7228d9d79248 ("drm/panfrost: Remove type name from internal structs") leading to the situation that the Panfrost UAPI header still doesn't compile correctly in C++. Revert the accidental revert and pass me a brown paper bag. Reported-by: Alyssa Rosenzweig <alyssa@collabora.com> Fixes: 72655fb942c1 ("drm/panfrost: replace endian-specific types with native ones") Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221103114036.1581854-1-steven.price@arm.com
2022-11-07pinctrl: rockchip: list all pins in a possible mux route for PX30Quentin Schulz
The mux routes are incomplete for the PX30. This was discovered because we had a HW design using cif-clkoutm1 with the correct pinmux in the Device Tree but the clock would still not work. There are actually two muxing required: the pin muxing (performed by the usual Device Tree pinctrl nodes) and the "function" muxing (m0 vs m1; performed by the mux routing inside the driver). The pin muxing was correct but the function muxing was not. This adds the missing pins and their configuration for the mux routes that are already specified in the driver. Note that there are some "conflicts": it is possible *in Device Tree* to (attempt to) mux the pins for e.g. clkoutm1 and clkinm0 at the same time but this is actually not possible in hardware (because both share the same bit for the function muxing). Since it is an impossible hardware design, it is not deemed necessary to prevent the user from attempting to "misconfigure" the pins/functions. Fixes: 87065ca9b8e5 ("pinctrl: rockchip: Add pinctrl support for PX30") Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20221017-upstream-px30-cif-clkoutm1-v1-0-4ea1389237f7@theobroma-systems.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-11-07ASoC: sof_es8336: reduce pop noise on speakerZhu Ning
The Speaker GPIO needs to be turned on slightly behind the codec turned on. It also need to be turned off slightly before the codec turned down. Current code uses delay in DAPM_EVENT to do it but the mdelay delays the DAPM itself and thus has no effect. A delayed_work is added to turn on the speaker. The Speaker is turned off in .trigger since trigger is called slightly before the DAPM events. Signed-off-by: Zhu Ning <zhuning@everest-semi.com> ------------ v1: cancel delayed work while disabling speaker. Link: https://lore.kernel.org/r/20221028020456.90286-1-zhuning0077@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-07ASoC: SOF: topology: No need to assign core ID if token parsing failedPeter Ujfalusi
Move the return value check before attempting to assign the core ID to the swidget since we are going to fail the sof_widget_ready() and free up swidget anyways. Fixes: 909dadf21aae ("ASoC: SOF: topology: Make DAI widget parsing IPC agnostic") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/20221107090433.5146-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-07ASoC: soc-utils: Remove __exit for snd_soc_util_exit()Chen Zhongjin
snd_soc_util_exit() is called in __init snd_soc_init() for cleanup. Remove the __exit annotation for it to fix the build warning: WARNING: modpost: sound/soc/snd-soc-core.o: section mismatch in reference: init_module (section: .init.text) -> snd_soc_util_exit (section: .exit.text) Fixes: 6ec27c53886c ("ASoC: core: Fix use-after-free in snd_soc_exit()") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221031134031.256511-1-chenzhongjin@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-07btrfs: zoned: fix locking imbalance on scrubJohannes Thumshirn
If we're doing device replace on a zoned filesystem and discover in scrub_enumerate_chunks() that we don't have to copy the block group it is unlocked before it gets skipped. But as the block group hasn't yet been locked before it leads to a locking imbalance. To fix this simply remove the unlock. This was uncovered by fstests' testcase btrfs/163. Fixes: 9283b9e09a6d ("btrfs: remove lock protection for BLOCK_GROUP_FLAG_TO_COPY") Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07btrfs: zoned: initialize device's zone info for seedingJohannes Thumshirn
When performing seeding on a zoned filesystem it is necessary to initialize each zoned device's btrfs_zoned_device_info structure, otherwise mounting the filesystem will cause a NULL pointer dereference. This was uncovered by fstests' testcase btrfs/163. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07btrfs: zoned: clone zoned device info when cloning a deviceJohannes Thumshirn
When cloning a btrfs_device, we're not cloning the associated btrfs_zoned_device_info structure of the device in case of a zoned filesystem. Later on this leads to a NULL pointer dereference when accessing the device's zone_info for instance when setting a zone as active. This was uncovered by fstests' testcase btrfs/161. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07Revert "btrfs: scrub: use larger block size for data extent scrub"Qu Wenruo
This reverts commit 786672e9e1a39a231806313e3c445c236588ceef. [BUG] Since commit 786672e9e1a3 ("btrfs: scrub: use larger block size for data extent scrub"), btrfs scrub no longer reports errors if the corruption is not in the first sector of a STRIPE_LEN. The following script can expose the problem: mkfs.btrfs -f $dev mount $dev $mnt xfs_io -f -c "pwrite -S 0xff 0 8k" $mnt/foobar umount $mnt # 13631488 is the logical bytenr of above 8K extent btrfs-map-logical -l 13631488 -b 4096 $dev mirror 1 logical 13631488 physical 13631488 device /dev/test/scratch1 # Corrupt the 2nd sector of that extent xfs_io -f -c "pwrite -S 0x00 13635584 4k" $dev mount $dev $mnt btrfs scrub start -B $mnt scrub done for 54e63f9f-0c30-4c84-a33b-5c56014629b7 Scrub started: Mon Nov 7 07:18:27 2022 Status: finished Duration: 0:00:00 Total to scrub: 536.00MiB Rate: 0.00B/s Error summary: no errors found <<< [CAUSE] That offending commit enlarges the data extent scrub size from sector size to BTRFS_STRIPE_LEN, to avoid extra scrub_block to be allocated. But unfortunately the data extent scrub is still heavily relying on the fact that there is only one scrub_sector per scrub_block. Thus it will only check the first sector, and ignoring the remaining sectors. Furthermore the error reporting is not able to handle multiple sectors either. [FIX] For now just revert the offending commit. The consequence is just extra memory usage during scrub. We will need a proper change to make the remaining data scrub path to handle multiple sectors before we enlarging the data scrub size. Reported-by: Li Zhang <zhanglikernel@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07btrfs: don't print stack trace when transaction is aborted due to ENOMEMDavid Sterba
Add ENOMEM among the error codes that don't print stack trace on transaction abort. We've got several reports from syzbot that detects stacks as errors but caused by limiting memory. As this is an artificial condition we don't need to know where exactly the error happens, the abort and error cleanup will continue like e.g. for EIO. As the transaction aborts code needs to be inline in a lot of code, the implementation cases about minimal bloat. The error codes are in a separate function and the WARN uses the condition directly. This increases the code size by 571 bytes on release build. Alternatives considered: add -ENOMEM among the errors, this increases size by 2340 bytes, various attempts to combine the WARN and helper calls, increase by 700 or more bytes. Example syzbot reports (error -12): - https://syzkaller.appspot.com/bug?extid=5244d35be7f589cf093e - https://syzkaller.appspot.com/bug?extid=9c37714c07194d816417 Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07btrfs: selftests: fix wrong error check in btrfs_free_dummy_root()Zhang Xiaoxu
The btrfs_alloc_dummy_root() uses ERR_PTR as the error return value rather than NULL, if error happened, there will be a NULL pointer dereference: BUG: KASAN: null-ptr-deref in btrfs_free_dummy_root+0x21/0x50 [btrfs] Read of size 8 at addr 000000000000002c by task insmod/258926 CPU: 2 PID: 258926 Comm: insmod Tainted: G W 6.1.0-rc2+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 kasan_report+0xb7/0x140 kasan_check_range+0x145/0x1a0 btrfs_free_dummy_root+0x21/0x50 [btrfs] btrfs_test_free_space_cache+0x1a8c/0x1add [btrfs] btrfs_run_sanity_tests+0x65/0x80 [btrfs] init_btrfs_fs+0xec/0x154 [btrfs] do_one_initcall+0x87/0x2a0 do_init_module+0xdf/0x320 load_module+0x3006/0x3390 __do_sys_finit_module+0x113/0x1b0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: aaedb55bc08f ("Btrfs: add tests for btrfs_get_extent") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07btrfs: fix match incorrectly in dev_args_match_deviceLiu Shixin
syzkaller found a failed assertion: assertion failed: (args->devid != (u64)-1) || args->missing, in fs/btrfs/volumes.c:6921 This can be triggered when we set devid to (u64)-1 by ioctl. In this case, the match of devid will be skipped and the match of device may succeed incorrectly. Patch 562d7b1512f7 introduced this function which is used to match device. This function contains two matching scenarios, we can distinguish them by checking the value of args->missing rather than check whether args->devid and args->uuid is default value. Reported-by: syzbot+031687116258450f9853@syzkaller.appspotmail.com Fixes: 562d7b1512f7 ("btrfs: handle device lookup with btrfs_dev_lookup_args") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: David Sterba <dsterba@suse.com>