linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-03-16	btrfs: subpage: make readahead work properly	Qu Wenruo
	In readahead infrastructure, we are using a lot of hard coded PAGE_SHIFT while we're not doing anything specific to PAGE_SIZE. One of the most affected part is the radix tree operation of btrfs_fs_info::reada_tree. If using PAGE_SHIFT, subpage metadata readahead is broken and does no help reading metadata ahead. Fix the problem by using btrfs_fs_info::sectorsize_bits so that readahead could work for subpage. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-16	btrfs: subpage: fix wild pointer access during metadata read failure	Qu Wenruo
	[BUG] When running fstests for btrfs subpage read-write test, it has a very high chance to crash at generic/475 with the following stack: BTRFS warning (device dm-8): direct IO failed ino 510 rw 1,34817 sector 0xcdf0 len 94208 err no 10 Unable to handle kernel paging request at virtual address ffff80001157e7c0 CPU: 2 PID: 687125 Comm: kworker/u12:4 Tainted: G WC 5.12.0-rc2-custom+ #5 Hardware name: Khadas VIM3 (DT) Workqueue: btrfs-endio-meta btrfs_work_helper [btrfs] pc : queued_spin_lock_slowpath+0x1a0/0x390 lr : do_raw_spin_lock+0xc4/0x11c Call trace: queued_spin_lock_slowpath+0x1a0/0x390 _raw_spin_lock+0x68/0x84 btree_readahead_hook+0x38/0xc0 [btrfs] end_bio_extent_readpage+0x504/0x5f4 [btrfs] bio_endio+0x170/0x1a4 end_workqueue_fn+0x3c/0x60 [btrfs] btrfs_work_helper+0x1b0/0x1b4 [btrfs] process_one_work+0x22c/0x430 worker_thread+0x70/0x3a0 kthread+0x13c/0x140 ret_from_fork+0x10/0x30 Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827) [CAUSE] In end_bio_extent_readpage(), if we hit an error during read, we will handle the error differently for data and metadata. For data we queue a repair, while for metadata, we record the error and let the caller choose what to do. But the code is still using page->private to grab extent buffer, which no longer points to extent buffer for subpage metadata pages. Thus this wild pointer access leads to above crash. [FIX] Introduce a helper, find_extent_buffer_readpage(), to grab extent buffer. The difference against find_extent_buffer_nospinlock() is: - Also handles regular sectorsize == PAGE_SIZE case - No extent buffer refs increase/decrease As extent buffer under IO must have non-zero refs, so this is safe Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-16	gpiolib: Assign fwnode to parent's if no primary one provided	Andy Shevchenko
	In case when the properties are supplied in the secondary fwnode (for example, built-in device properties) the fwnode pointer left unassigned. This makes unable to retrieve them. Assign fwnode to parent's if no primary one provided. Fixes: 7cba1a4d5e16 ("gpiolib: generalize devprop_gpiochip_set_names() for device properties") Fixes: 2afa97e9868f ("gpiolib: Read "gpio-line-names" from a firmware node") Reported-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Tested-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-03-16	zonefs: Fix O_APPEND async write handling	Damien Le Moal
	zonefs updates the size of a sequential zone file inode only on completion of direct writes. When executing asynchronous append writes (with a file open with O_APPEND or using RWF_APPEND), the use of the current inode size in generic_write_checks() to set an iocb offset thus leads to unaligned write if an application issues an append write operation with another write already being executed. Fix this problem by introducing zonefs_write_checks() as a modified version of generic_write_checks() using the file inode wp_offset for an append write iocb offset. Also introduce zonefs_write_check_limits() to replace generic_write_check_limits() call. This zonefs special helper makes sure that the maximum file limit used is the maximum size of the file being accessed. Since zonefs_write_checks() already truncates the iov_iter, the calls to iov_iter_truncate() in zonefs_file_dio_write() and zonefs_file_buffered_write() are removed. Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Cc: <stable@vger.kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-03-16	zonefs: prevent use of seq files as swap file	Damien Le Moal
	The sequential write constraint of sequential zone file prevent their use as swap files. Only allow conventional zone files to be used as swap files. Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Cc: <stable@vger.kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-03-16	ALSA: hda/realtek: fix mute/micmute LEDs for HP 440 G8	Jeremy Szu
	The HP EliteBook 840 G8 Notebook PC is using ALC236 codec which is using 0x02 to control mute LED and 0x01 to control micmute LED. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210316074626.79895-1-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-03-16	can: m_can: m_can_rx_peripheral(): fix RX being blocked by errors	Torin Cooper-Bennun
	For M_CAN peripherals, m_can_rx_handler() was called with quota = 1, which caused any error handling to block RX from taking place until the next time the IRQ handler is called. This had been observed to cause RX to be blocked indefinitely in some cases. This is fixed by calling m_can_rx_handler with a sensibly high quota. Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Link: https://lore.kernel.org/r/20210303144350.4093750-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: m_can: m_can_do_rx_poll(): fix extraneous msg loss warning	Torin Cooper-Bennun
	Message loss from RX FIFO 0 is already handled in m_can_handle_lost_msg(), with netdev output included. Removing this warning also improves driver performance under heavy load, where m_can_do_rx_poll() may be called many times before this interrupt is cleared, causing this message to be output many times (thanks Mariusz Madej for this report). Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Link: https://lore.kernel.org/r/20210303103151.3760532-1-torin@maxiluxsystems.com Reported-by: Mariusz Madej <mariusz.madej@xtrack.com> Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: c_can: move runtime PM enable/disable to c_can_platform	Tong Zhang
	Currently doing modprobe c_can_pci will make the kernel complain: Unbalanced pm_runtime_enable! this is caused by pm_runtime_enable() called before pm is initialized. This fix is similar to 227619c3ff7c, move those pm_enable/disable code to c_can_platform. Fixes: 4cdd34b26826 ("can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller") Link: http://lore.kernel.org/r/20210302025542.987600-1-ztong0001@gmail.com Signed-off-by: Tong Zhang <ztong0001@gmail.com> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: c_can_pci: c_can_pci_remove(): fix use-after-free	Tong Zhang
	There is a UAF in c_can_pci_remove(). dev is released by free_c_can_dev() and is used by pci_iounmap(pdev, priv->base) later. To fix this issue, save the mmio address before releasing dev. Fixes: 5b92da0443c2 ("c_can_pci: generic module for C_CAN/D_CAN on PCI") Link: https://lore.kernel.org/r/20210301024512.539039-1-ztong0001@gmail.com Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: kvaser_usb: Add support for USBcan Pro 4xHS	Jimmy Assarsson
	Add support for Kvaser USBcan Pro 4xHS. Link: https://lore.kernel.org/r/20210309091724.31262-2-jimmyassarsson@gmail.com Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: kvaser_pciefd: Always disable bus load reporting	Jimmy Assarsson
	Under certain circumstances, when switching from Kvaser's linuxcan driver (kvpciefd) to the SocketCAN driver (kvaser_pciefd), the bus load reporting is not disabled. This is flooding the kernel log with prints like: [3485.574677] kvaser_pciefd 0000:02:00.0: Received unexpected packet type 0x00000009 Always put the controller in the expected state, instead of assuming that bus load reporting is inactive. Note: If bus load reporting is enabled when the driver is loaded, you will still get a number of bus load packages (and printouts), before it is disabled. Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices") Link: https://lore.kernel.org/r/20210309091724.31262-1-jimmyassarsson@gmail.com Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: flexcan: flexcan_chip_freeze(): fix chip freeze for missing bitrate	Angelo Dureghello
	For cases when flexcan is built-in, bitrate is still not set at registering. So flexcan_chip_freeze() generates: [ 1.860000] * ZERO DIVIDE * FORMAT=4 [ 1.860000] Current process id is 1 [ 1.860000] BAD KERNEL TRAP: 00000000 [ 1.860000] PC: [<402e70c8>] flexcan_chip_freeze+0x1a/0xa8 To allow chip freeze, using an hardcoded timeout when bitrate is still not set. Fixes: ec15e27cc890 ("can: flexcan: enable RX FIFO after FRZ/HALT valid") Link: https://lore.kernel.org/r/20210315231510.650593-1-angelo@kernel-space.org Signed-off-by: Angelo Dureghello <angelo@kernel-space.org> [mkl: use if instead of ? operator] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: peak_usb: add forgotten supported devices	Stephane Grosjean
	Since the peak_usb driver also supports the CAN-USB interfaces "PCAN-USB X6" and "PCAN-Chip USB" from PEAK-System GmbH, this patch adds their names to the list of explicitly supported devices. Fixes: ea8b65b596d7 ("can: usb: Add support of PCAN-Chip USB stamp module") Fixes: f00b534ded60 ("can: peak: Add support for PCAN-USB X6 USB interface") Link: https://lore.kernel.org/r/20210309082128.23125-3-s.grosjean@peak-system.com Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: isotp: TX-path: ensure that CAN frame flags are initialized	Marc Kleine-Budde
	The previous patch ensures that the TX flags (struct can_isotp_ll_options::tx_flags) are 0 for classic CAN frames or a user configured value for CAN-FD frames. This patch sets the CAN frames flags unconditionally to the ISO-TP TX flags, so that they are initialized to a proper value. Otherwise when running "candump -x" on a classical CAN ISO-TP stream shows wrongly set "B" and "E" flags. \| $ candump any,0:0,#FFFFFFFF -extA \| [...] \| can0 TX B E 713 [8] 2B 0A 0B 0C 0D 0E 0F 00 \| can0 TX B E 713 [8] 2C 01 02 03 04 05 06 07 \| can0 TX B E 713 [8] 2D 08 09 0A 0B 0C 0D 0E \| can0 TX B E 713 [8] 2E 0F 00 01 02 03 04 05 Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://lore.kernel.org/r/20210218215434.1708249-2-mkl@pengutronix.de Cc: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: isotp: isotp_setsockopt(): only allow to set low level TX flags for CAN-FD	Marc Kleine-Budde
	CAN-FD frames have struct canfd_frame::flags, while classic CAN frames don't. This patch refuses to set TX flags (struct can_isotp_ll_options::tx_flags) on non CAN-FD isotp sockets. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://lore.kernel.org/r/20210218215434.1708249-2-mkl@pengutronix.de Cc: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	can: dev: Move device back to init netns on owning netns delete	Martin Willi
	When a non-initial netns is destroyed, the usual policy is to delete all virtual network interfaces contained, but move physical interfaces back to the initial netns. This keeps the physical interface visible on the system. CAN devices are somewhat special, as they define rtnl_link_ops even if they are physical devices. If a CAN interface is moved into a non-initial netns, destroying that netns lets the interface vanish instead of moving it back to the initial netns. default_device_exit() skips CAN interfaces due to having rtnl_link_ops set. Reproducer: ip netns add foo ip link set can0 netns foo ip netns delete foo WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60 CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1 Workqueue: netns cleanup_net [<c010e700>] (unwind_backtrace) from [<c010a1d8>] (show_stack+0x10/0x14) [<c010a1d8>] (show_stack) from [<c086dc10>] (dump_stack+0x94/0xa8) [<c086dc10>] (dump_stack) from [<c086b938>] (__warn+0xb8/0x114) [<c086b938>] (__warn) from [<c086ba10>] (warn_slowpath_fmt+0x7c/0xac) [<c086ba10>] (warn_slowpath_fmt) from [<c0629f20>] (ops_exit_list+0x38/0x60) [<c0629f20>] (ops_exit_list) from [<c062a5c4>] (cleanup_net+0x230/0x380) [<c062a5c4>] (cleanup_net) from [<c0142c20>] (process_one_work+0x1d8/0x438) [<c0142c20>] (process_one_work) from [<c0142ee4>] (worker_thread+0x64/0x5a8) [<c0142ee4>] (worker_thread) from [<c0148a98>] (kthread+0x148/0x14c) [<c0148a98>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c) To properly restore physical CAN devices to the initial netns on owning netns exit, introduce a flag on rtnl_link_ops that can be set by drivers. For CAN devices setting this flag, default_device_exit() considers them non-virtual, applying the usual namespace move. The issue was introduced in the commit mentioned below, as at that time CAN devices did not have a dellink() operation. Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.") Link: https://lore.kernel.org/r/20210302122423.872326-1-martin@strongswan.org Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16	Merge tag 'usb-v5.12-rc4' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus Peter writes: It fixed one incorrect value issue for cdns ssp driver * tag 'usb-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb: usb: cdnsp: Fixes incorrect value in ISOC TRB
2021-03-16	ALSA: hda/realtek: fix mute/micmute LEDs for HP 840 G8	Jeremy Szu
	The HP EliteBook 840 G8 Notebook PC is using ALC285 codec which is using 0x04 to control mute LED and 0x01 to control micmute LED. Therefore, add a quirk to make it works. Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210316065452.75659-1-jeremy.szu@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-03-15	scsi: lpfc: Fix some error codes in debugfs	Dan Carpenter
	If copy_from_user() or kstrtoull() fail then the correct behavior is to return a negative error code. Link: https://lore.kernel.org/r/YEsbU/UxYypVrC7/@mwanda Fixes: f9bb2da11db8 ("[SCSI] lpfc 8.3.27: T10 additions for SLI4") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-15	scsi: qla2xxx: Fix broken #endif placement	Alexey Dobriyan
	Only half of the file is under include guard because terminating #endif is placed too early. Link: https://lore.kernel.org/r/YE4snvoW1SuwcXAn@localhost.localdomain Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-15	scsi: st: Fix a use after free in st_open()	Lv Yunlong
	In st_open(), if STp->in_use is true, STp will be freed by scsi_tape_put(). However, STp is still used by DEBC_printk() after. It is better to DEBC_printk() before scsi_tape_put(). Link: https://lore.kernel.org/r/20210311064636.10522-1-lyl2019@mail.ustc.edu.cn Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-15	scsi: myrs: Fix a double free in myrs_cleanup()	Lv Yunlong
	In myrs_cleanup(), cs->mmio_base will be freed twice by iounmap(). Link: https://lore.kernel.org/r/20210311063005.9963-1-lyl2019@mail.ustc.edu.cn Fixes: 77266186397c ("scsi: myrs: Add Mylex RAID controller (SCSI interface)") Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-15	scsi: ibmvfc: Free channel_setup_buf during device tear down	Tyrel Datwyler
	The buffer for negotiating channel setup is DMA allocated at device probe time. However, the remove path fails to free this allocation which will prevent the hypervisor from releasing the virtual device in the case of a hotplug remove. Fix this issue by freeing the buffer allocation in ibmvfc_free_mem(). Link: https://lore.kernel.org/r/20210311012212.428068-1-tyreld@linux.ibm.com Fixes: e95eef3fc0bc ("scsi: ibmvfc: Implement channel enquiry and setup commands") Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-16	drm: rcar-du: Use drmm_encoder_alloc() to manage encoder	Kieran Bingham
	The encoder allocation was converted to a DRM managed resource at the same time as the addition of a new helper drmm_encoder_alloc() which simplifies the same process. Convert the custom drm managed resource allocation of the encoder with the helper to simplify the implementation, and prevent hitting a WARN_ON() due to the handling the drm_encoder_init() call directly without registering a .destroy() function op. Fixes: f5f16725edbc ("drm: rcar-du: Use DRM-managed allocation for encoders") Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2021-03-15	mptcp: fix ADD_ADDR HMAC in case port is specified	Davide Caratti
	Currently, Linux computes the HMAC contained in ADD_ADDR sub-option using the Address Id and the IP Address, and hardcodes a destination port equal to zero. This is not ok for ADD_ADDR with port: ensure to account for the endpoint port when computing the HMAC, in compliance with RFC8684 §3.4.1. Fixes: 22fb85ffaefb ("mptcp: add port support for ADD_ADDR suboption writing") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	Merge tag 'afs-fixes-20210315' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: - Fix an oops in AFS that can be triggered by accessing one of the afs.yfs.* xattrs against an OpenAFS server - for instance by commands like "cp -a"[1], "rsync -X" or getfattr[2]. These try and copy all of the xattrs. cp and rsync should pay attention to the list in /etc/xattr.conf, but cp doesn't on Ubuntu and rsync doesn't seem to on Ubuntu or Fedora. xattr.conf has been modified upstream[3], and a new version has just been cut that includes it. I've logged a bug against rsync for the problem there[4]. - Stop listing "afs." xattrs[5][6][7], but particularly ACL ones[8] so that they don't confuse cp and rsync. This removes them from the list returned by listxattr(), but they're still available to get/set. Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003501.html [2] Link: https://git.savannah.nongnu.org/cgit/attr.git/commit/?id=74da517cc655a82ded715dea7245ce88ebc91b98 [3] Link: https://github.com/WayneD/rsync/issues/163 [4] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003516.html [5] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003524.html [6] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003565.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003568.html [7] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003570.html [8] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003571.html # v2 tag 'afs-fixes-20210315' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Stop listxattr() from listing "afs.*" attributes afs: Fix accessing YFS xattrs on a non-YFS server
2021-03-15	tcp: relookup sock for RST+ACK packets handled by obsolete req sock	Alexander Ovechkin
	Currently tcp_check_req can be called with obsolete req socket for which big socket have been already created (because of CPU race or early demux assigning req socket to multiple packets in gro batch). Commit e0f9759f530bf789e984 ("tcp: try to keep packet if SYN_RCV race is lost") added retry in case when tcp_check_req is called for PSH\|ACK packet. But if client sends RST+ACK immediatly after connection being established (it is performing healthcheck, for example) retry does not occur. In that case tcp_check_req tries to close req socket, leaving big socket active. Fixes: e0f9759f530 ("tcp: try to keep packet if SYN_RCV race is lost") Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> Reported-by: Oleg Senin <olegsenin@yandex-team.ru> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	tipc: better validate user input in tipc_nl_retrieve_key()	Eric Dumazet
	Before calling tipc_aead_key_size(ptr), we need to ensure we have enough data to dereference ptr->keylen. We probably also want to make sure tipc_aead_key_size() wont overflow with malicious ptr->keylen values. Syzbot reported: BUG: KMSAN: uninit-value in __tipc_nl_node_set_key net/tipc/node.c:2971 [inline] BUG: KMSAN: uninit-value in tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023 CPU: 0 PID: 21060 Comm: syz-executor.5 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:120 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197 __tipc_nl_node_set_key net/tipc/node.c:2971 [inline] tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x1319/0x1610 net/netlink/genetlink.c:800 netlink_rcv_skb+0x6fa/0x810 net/netlink/af_netlink.c:2494 genl_rcv+0x63/0x80 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x11d6/0x14a0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x1740/0x1840 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345 ___sys_sendmsg net/socket.c:2399 [inline] __sys_sendmsg+0x714/0x830 net/socket.c:2432 __compat_sys_sendmsg net/compat.c:347 [inline] __do_compat_sys_sendmsg net/compat.c:354 [inline] __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline] __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c RIP: 0023:0xf7f60549 Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 RSP: 002b:00000000f555a5fc EFLAGS: 00000296 ORIG_RAX: 0000000000000172 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000200 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline] kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104 kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76 slab_alloc_node mm/slub.c:2907 [inline] __kmalloc_node_track_caller+0xa37/0x1430 mm/slub.c:4527 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2f8/0xb30 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1099 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline] netlink_sendmsg+0xdbc/0x1840 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345 ___sys_sendmsg net/socket.c:2399 [inline] __sys_sendmsg+0x714/0x830 net/socket.c:2432 __compat_sys_sendmsg net/compat.c:347 [inline] __do_compat_sys_sendmsg net/compat.c:354 [inline] __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351 __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351 do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline] __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141 do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166 do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c Fixes: e1f32190cf7d ("tipc: add support for AEAD key setting via netlink") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tuong Lien <tuong.t.lien@dektech.com.au> Cc: Jon Maloy <jmaloy@redhat.com> Cc: Ying Xue <ying.xue@windriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	net: phylink: Fix phylink_err() function name error in phylink_major_config	Ong Boon Leong
	if pl->mac_ops->mac_finish() failed, phylink_err should use "mac_finish" instead of "mac_prepare". Fixes: b7ad14c2fe2d4 ("net: phylink: re-implement interface configuration with PCS") Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	ALSA: hda/realtek: apply pin quirk for XiaomiNotebook Pro	Xiaoliang Yu
	Built-in microphone and combojack on Xiaomi Notebook Pro (1d72:1701) needs to be fixed, the existing quirk for Dell works well on that machine. Signed-off-by: Xiaoliang Yu <yxl_22@outlook.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/OS0P286MB02749B9E13920E6899902CD8EE6C9@OS0P286MB0274.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-03-15	net: hdlc_x25: Prevent racing between "x25_close" and "x25_xmit"/"x25_rx"	Xie He
	"x25_close" is called by "hdlc_close" in "hdlc.c", which is called by hardware drivers' "ndo_stop" function. "x25_xmit" is called by "hdlc_start_xmit" in "hdlc.c", which is hardware drivers' "ndo_start_xmit" function. "x25_rx" is called by "hdlc_rcv" in "hdlc.c", which receives HDLC frames from "net/core/dev.c". "x25_close" races with "x25_xmit" and "x25_rx" because their callers race. However, we need to ensure that the LAPB APIs called in "x25_xmit" and "x25_rx" are called before "lapb_unregister" is called in "x25_close". This patch adds locking to ensure when "x25_xmit" and "x25_rx" are doing their work, "lapb_unregister" is not yet called in "x25_close". Reasons for not solving the racing between "x25_close" and "x25_xmit" by calling "netif_tx_disable" in "x25_close": 1. We still need to solve the racing between "x25_close" and "x25_rx"; 2. The design of the HDLC subsystem assumes the HDLC hardware drivers have full control over the TX queue, and the HDLC protocol drivers (like this driver) have no control. Controlling the queue here in the protocol driver may interfere with hardware drivers' control of the queue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15	s390/pci: fix leak of PCI device structure	Niklas Schnelle
	In commit 05bc1be6db4b2 ("s390/pci: create zPCI bus") we removed the pci_dev_put() call matching the earlier pci_get_slot() done as part of __zpci_event_availability(). This was based on the wrong understanding that the device_put() done as part of pci_destroy_device() would counter the pci_get_slot() when it only counters the initial reference. This same understanding and existing bad example also lead to not doing a pci_dev_put() in zpci_remove_device(). Since releasing the PCI devices, unlike releasing the PCI slot, does not print any debug message for testing I added one in pci_release_dev(). This revealed that we are indeed leaking the PCI device on PCI hotunplug. Further testing also revealed another missing pci_dev_put() in disable_slot(). Fix this by adding the missing pci_dev_put() in disable_slot() and fix zpci_remove_device() with the correct pci_dev_put() calls. Also instead of calling pci_get_slot() in __zpci_event_availability() to determine if a PCI device is registered and then doing the same again in zpci_remove_device() do this once in zpci_remove_device() which makes sure that the pdev in __zpci_event_availability() is only used for the result of pci_scan_single_device() which does not need a reference count decremnt as its ownership goes to the PCI bus. Also move the check if zdev->zbus->bus is set into zpci_remove_device() since it may be that we're removing a device with devfn != 0 which never had a PCI bus. So we can still set the pdev->error_state to indicate that the device is not usable anymore, add a flag to set the error state. Fixes: 05bc1be6db4b2 ("s390/pci: create zPCI bus") Cc: <stable@vger.kernel.org> # 5.8+: e1bff843cde6 s390/pci: remove superfluous zdev->zbus check Cc: <stable@vger.kernel.org> # 5.8+: ba764dd703fe s390/pci: refactor zpci_create_device() Cc: <stable@vger.kernel.org> # 5.8+ Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-03-15	s390/vtime: fix increased steal time accounting	Gerald Schaefer
	Commit 152e9b8676c6e ("s390/vtime: steal time exponential moving average") inadvertently changed the input value for account_steal_time() from "cputime_to_nsecs(steal)" to just "steal", resulting in broken increased steal time accounting. Fix this by changing it back to "cputime_to_nsecs(steal)". Fixes: 152e9b8676c6e ("s390/vtime: steal time exponential moving average") Cc: <stable@vger.kernel.org> # 5.1 Reported-by: Sabine Forkel <sabine.forkel@de.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-03-15	s390/cpumf: disable preemption when accessing per-cpu variable	Thomas Richter
	The following BUG message was triggered repeatedly when complete counter sets are extracted from the CPUMF: BUG: using smp_processor_id() in preemptible [00000000] code: psvc-readsets/7759 caller is cf_diag_needspace+0x2c/0x100 CPU: 7 PID: 7759 Comm: psvc-readsets Not tainted 5.12.0 Hardware name: IBM 3906 M03 703 (LPAR) Call Trace: [<00000000c7043f78>] show_stack+0x90/0xf8 [<00000000c705776a>] dump_stack+0xba/0x108 [<00000000c705d91c>] check_preemption_disabled+0xec/0xf0 [<00000000c63eb1c4>] cf_diag_needspace+0x2c/0x100 [<00000000c63ecbcc>] cf_diag_ioctl_start+0x10c/0x240 [<00000000c63ece9a>] cf_diag_ioctl+0x19a/0x238 [<00000000c675f3f4>] __s390x_sys_ioctl+0xc4/0x100 [<00000000c63ca762>] do_syscall+0x82/0xd0 [<00000000c705bdd8>] __do_syscall+0xc0/0xd8 [<00000000c706d532>] system_call+0x72/0x98 2 locks held by psvc-readsets/7759: #0: 00000000c75a57c0 (cpu_hotplug_lock){++++}-{0:0}, at: cf_diag_ioctl+0x44/0x238 #1: 00000000c75a3078 (cf_diag_ctrset_mutex){+.+.}-{3:3}, at: cf_diag_ioctl+0x54/0x238 This issue is a missing get_cpu_ptr/put_cpu_ptr pair in function cf_diag_needspace. Add it. Fixes: cf6acb8bdb1d ("s390/cpumf: Add support for complete counter set extraction") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-03-15	drm/amd/display: Copy over soc values before bounding box creation	Sung Lee
	[Why] With certain fclock overclocks, state 1 may be chosen as the closest clock level. This may result in this state being empty if not populated beforehand, resulting in black screens and screen corruption. [How] Copy over all soc states to clock_limits before bounding box creation to avoid any cases with empty states. Fixes: f2459c52c84449 ("drm/amd/display: Add Bounding Box State for Low DF PState but High Voltage State") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1514 Signed-off-by: Sung Lee <sung.lee@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-15	netfilter: ctnetlink: fix dump of the expect mask attribute	Florian Westphal
	Before this change, the mask is never included in the netlink message, so "conntrack -E expect" always prints 0.0.0.0. In older kernels the l3num callback struct was passed as argument, based on tuple->src.l3num. After the l3num indirection got removed, the call chain is based on m.src.l3num, but this value is 0xffff. Init l3num to the correct value. Fixes: f957be9d349a3 ("netfilter: conntrack: remove ctnetlink callbacks from l3 protocol trackers") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-15	netfilter: x_tables: Use correct memory barriers.	Mark Tomlinson
	When a new table value was assigned, it was followed by a write memory barrier. This ensured that all writes before this point would complete before any writes after this point. However, to determine whether the rules are unused, the sequence counter is read. To ensure that all writes have been done before these reads, a full memory barrier is needed, not just a write memory barrier. The same argument applies when incrementing the counter, before the rules are read. Changing to using smp_mb() instead of smp_wmb() fixes the kernel panic reported in cc00bcaa5899 (which is still present), while still maintaining the same speed of replacing tables. The smb_mb() barriers potentially slow the packet path, however testing has shown no measurable change in performance on a 4-core MIPS64 platform. Fixes: 7f5c6d4f665b ("netfilter: get rid of atomic ops in fast path") Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-15	Revert "netfilter: x_tables: Switch synchronization to RCU"	Mark Tomlinson
	This reverts commit cc00bcaa589914096edef7fb87ca5cee4a166b5c. This (and the preceding) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Prior to using RCU a script calling "iptables" approx. 200 times was taking 1.16s. With RCU this increased to 11.59s. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-15	Revert "netfilter: x_tables: Update remaining dereference to RCU"	Mark Tomlinson
	This reverts commit 443d6e86f821a165fae3fc3fc13086d27ac140b1. This (and the following) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-15	afs: Stop listxattr() from listing "afs.*" attributes	David Howells
	afs_listxattr() lists all the available special afs xattrs (i.e. those in the "afs." space), no matter what type of server we're dealing with. But OpenAFS servers, for example, cannot deal with some of the extra-capable attributes that AuriStor (YFS) servers provide. Unfortunately, the presence of the afs.yfs. attributes causes errors[1] for anything that tries to read them if the server is of the wrong type. Fix the problem by removing afs_listxattr() so that none of the special xattrs are listed (AFS doesn't support xattrs). It does mean, however, that getfattr won't list them, though they can still be accessed with getxattr() and setxattr(). This can be tested with something like: getfattr -d -m "." /afs/example.com/path/to/file With this change, none of the afs. attributes should be visible. Changes: ver #2: - Hide all of the afs.* xattrs, not just the ACL ones. Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003502.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003567.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003573.html # v2
2021-03-15	afs: Fix accessing YFS xattrs on a non-YFS server	David Howells
	If someone attempts to access YFS-related xattrs (e.g. afs.yfs.acl) on a file on a non-YFS AFS server (such as OpenAFS), then the kernel will jump to a NULL function pointer because the afs_fetch_acl_operation descriptor doesn't point to a function for issuing an operation on a non-YFS server[1]. Fix this by making afs_wait_for_operation() check that the issue_afs_rpc method is set before jumping to it and setting -ENOTSUPP if not. This fix also covers other potential operations that also only exist on YFS servers. afs_xattr_get/set_yfs() then need to translate -ENOTSUPP to -ENODATA as the former error is internal to the kernel. The bug shows up as an oops like the following: BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [...] Call Trace: afs_wait_for_operation+0x83/0x1b0 [kafs] afs_xattr_get_yfs+0xe6/0x270 [kafs] __vfs_getxattr+0x59/0x80 vfs_getxattr+0x11c/0x140 getxattr+0x181/0x250 ? __check_object_size+0x13f/0x150 ? __fput+0x16d/0x250 __x64_sys_fgetxattr+0x64/0xb0 do_syscall_64+0x49/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fb120a9defe This was triggered with "cp -a" which attempts to copy xattrs, including afs ones, but is easier to reproduce with getfattr, e.g.: getfattr -d -m ".*" /afs/openafs.org/ Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1] Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003566.html # v1 Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003572.html # v2
2021-03-15	selftests/bpf: Set gopt opt_class to 0 if get tunnel opt failed	Hangbin Liu
	When fixing the bpf test_tunnel.sh geneve failure. I only fixed the IPv4 part but forgot the IPv6 issue. Similar with the IPv4 fixes 557c223b643a ("selftests/bpf: No need to drop the packet when there is no geneve opt"), when there is no tunnel option and bpf_skb_get_tunnel_opt() returns error, there is no need to drop the packets and break all geneve rx traffic. Just set opt_class to 0 and keep returning TC_ACT_OK at the end. Fixes: 557c223b643a ("selftests/bpf: No need to drop the packet when there is no geneve opt") Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: William Tu <u9012063@gmail.com> Link: https://lore.kernel.org/bpf/20210309032214.2112438-1-liuhangbin@gmail.com
2021-03-15	btrfs: zoned: fix linked list corruption after log root tree allocation failure	Filipe Manana
	When using a zoned filesystem, while syncing the log, if we fail to allocate the root node for the log root tree, we are not removing the log context we allocated on stack from the list of log contexts of the log root tree. This means after the return from btrfs_sync_log() we get a corrupted linked list. Fix this by allocating the node before adding our stack allocated context to the list of log contexts of the log root tree. Fixes: 3ddebf27fcd3a9 ("btrfs: zoned: reorder log node allocation on zoned filesystem") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15	btrfs: fix qgroup data rsv leak caused by falloc failure	Qu Wenruo
	[BUG] When running fsstress with only falloc workload, and a very low qgroup limit set, we can get qgroup data rsv leak at unmount time. BTRFS warning (device dm-0): qgroup 0/5 has unreleased space, type 0 rsv 20480 BTRFS error (device dm-0): qgroup reserved space leaked The minimal reproducer looks like: #!/bin/bash dev=/dev/test/test mnt="/mnt/btrfs" fsstress=~/xfstests-dev/ltp/fsstress runtime=8 workload() { umount $dev &> /dev/null umount $mnt &> /dev/null mkfs.btrfs -f $dev > /dev/null mount $dev $mnt btrfs quota en $mnt btrfs quota rescan -w $mnt btrfs qgroup limit 16m 0/5 $mnt $fsstress -w -z -f creat=10 -f fallocate=10 -p 2 -n 100 \ -d $mnt -v > /tmp/fsstress umount $mnt if dmesg \| grep leak ; then echo "!!! FAILED !!!" exit 1 fi } for (( i=0; i < $runtime; i++)); do echo "=== $i/$runtime===" workload done Normally it would fail before round 4. [CAUSE] In function insert_prealloc_file_extent(), we first call btrfs_qgroup_release_data() to know how many bytes are reserved for qgroup data rsv. Then use that @qgroup_released number to continue our work. But after we call btrfs_qgroup_release_data(), we should either queue @qgroup_released to delayed ref or free them manually in error path. Unfortunately, we lack the error handling to free the released bytes, leaking qgroup data rsv. All the error handling function outside won't help at all, as we have released the range, meaning in inode io tree, the EXTENT_QGROUP_RESERVED bit is already cleared, thus all btrfs_qgroup_free_data() call won't free any data rsv. [FIX] Add free_qgroup tag to manually free the released qgroup data rsv. Reported-by: Nikolay Borisov <nborisov@suse.com> Reported-by: David Sterba <dsterba@suse.cz> Fixes: 9729f10a608f ("btrfs: inode: move qgroup reserved space release to the callers of insert_reserved_file_extent()") CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15	btrfs: track qgroup released data in own variable in insert_prealloc_file_extent	Qu Wenruo
	There is a piece of weird code in insert_prealloc_file_extent(), which looks like: ret = btrfs_qgroup_release_data(inode, file_offset, len); if (ret < 0) return ERR_PTR(ret); if (trans) { ret = insert_reserved_file_extent(trans, inode, file_offset, &stack_fi, true, ret); ... } extent_info.is_new_extent = true; extent_info.qgroup_reserved = ret; ... Note how the variable @ret is abused here, and if anyone is adding code just after btrfs_qgroup_release_data() call, it's super easy to overwrite the @ret and cause tons of qgroup related bugs. Fix such abuse by introducing new variable @qgroup_released, so that we won't reuse the existing variable @ret. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15	btrfs: fix wrong offset to zero out range beyond i_size	Qu Wenruo
	[BUG] The test generic/091 fails , with the following output: fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W mapped writes DISABLED Seed set to 1 main: filesystem does not support fallocate mode FALLOC_FL_COLLAPSE_RANGE, disabling! main: filesystem does not support fallocate mode FALLOC_FL_INSERT_RANGE, disabling! skipping zero size read truncating to largest ever: 0xe400 copying to largest ever: 0x1f400 cloning to largest ever: 0x70000 cloning to largest ever: 0x77000 fallocating to largest ever: 0x7a120 Mapped Read: non-zero data past EOF (0x3a7ff) page offset 0x800 is 0xf2e1 <<< ... [CAUSE] In commit c28ea613fafa ("btrfs: subpage: fix the false data csum mismatch error") end_bio_extent_readpage() changes to only zero the range inside the bvec for incoming subpage support. But that commit is using incorrect offset to calculate the start. For subpage, we can have a case that the whole bvec is beyond isize, thus we need to calculate the correct offset. But the offending commit is using @end (bvec end), other than @start (bvec start) to calculate the start offset. This means, we only zero the last byte of the bvec, not from the isize. This stupid bug makes the range beyond isize is not properly zeroed, and failed above test. [FIX] Use correct @start to calculate the range start. Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: c28ea613fafa ("btrfs: subpage: fix the false data csum mismatch error") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15	xfs: also reject BULKSTAT_SINGLE in a mount user namespace	Christoph Hellwig
	BULKSTAT_SINGLE exposed the ondisk uids/gids just like bulkstat, and can be called on any inode, including ones not visible in the current mount. Fixes: f736d93d76d3 ("xfs: support idmapped mounts") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-03-15	docs: ABI: Fix the spelling oustanding to outstanding in the file sysfs-fs-xfs	Bhaskar Chowdhury
	s/oustanding/outstanding/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-03-15	xfs: force log and push AIL to clear pinned inodes when aborting mount	Darrick J. Wong
	If we allocate quota inodes in the process of mounting a filesystem but then decide to abort the mount, it's possible that the quota inodes are sitting around pinned by the log. Now that inode reclaim relies on the AIL to flush inodes, we have to force the log and push the AIL in between releasing the quota inodes and kicking off reclaim to tear down all the incore inodes. Do this by extracting the bits we need from the unmount path and reusing them. As an added bonus, failed writes during a failed mount will not retry forever now. This was originally found during a fuzz test of metadata directories (xfs/1546), but the actual symptom was that reclaim hung up on the quota inodes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>