linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-12-09	uio: uio_dmem_genirq: finalize conversion of probe to devm_ handlers	Alexandru Ardelean
	This moves move pm_runtime_disable on a devm_add_action_or_reset() handler. And with the use of the devm_uio_register_device() function, the remove hook is no longer required. Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20201120075625.12272-2-alexandru.ardelean@analog.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	uio: uio_dmem_genirq: convert simple allocations to device-managed	Alexandru Ardelean
	This change converts the simple allocations in the driver to used device-managed allocation functions. This removes the error path entirely in the probe function, and reduces some code in the remove function. Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20201120075625.12272-1-alexandru.ardelean@analog.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	uio/uio_pci_generic: remove unneeded pci_set_drvdata()	Alexandru Ardelean
	The pci_get_drvdata() was moved during commit ef84928cff58 ("uio/uio_pci_generic: use device-managed function equivalents"). Storing a private object with pci_set_drvdata() doesn't make sense since that change, since there is no more pci_get_drvdata() call in the driver to retrieve the information. This change removes it. Fixes: ef84928cff58 ("uio/uio_pci_generic: use device-managed function equivalents") Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20201123143447.16829-1-alexandru.ardelean@analog.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	uio: pruss: use devm_clk_get() for clk init	Alexandru Ardelean
	This change uses devm_clk_get() to obtain a reference to the clock. It has the benefit that clk_put() is no longer required, and cleans up the exit & error path. Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20201119145059.48326-1-alexandru.ardelean@analog.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	slimbus: qcom-ngd-ctrl: fix SSR dependencies	Srinivas Kandagatla
	NGD should depend on QCOM_RPROC_COMMON instead of selecting it, as this will be selected by respective remoteproc driver. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20201201093843.20126-1-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	altera-stapl: remove the unreached switch case	Kaixu Xia
	The value of the variable status must be one of the 0, -EIO and -EILSEQ, so the switch case -ENODATA is unreached. Remove it. Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Link: https://lore.kernel.org/r/1605284071-6901-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	fsi: Aspeed: Add mutex to protect HW access	Eddie James
	There is nothing to prevent multiple commands being executed simultaneously. Add a mutex to prevent this. Fixes: 606397d67f41 ("fsi: Add ast2600 master driver") Reviewed-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20201120004929.185239-1-joel@jms.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	bus: fsl-mc: simplify DPRC version check	Ioana Ciornei
	Because the minimum supported DPRC version is 6.0, there is no need to check for incompatible 6.x versions lower to the minimum one. Just remove the second half of the check to simplify the logic. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20201123164839.1668409-1-ciorneiioana@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	bus: fsl-mc: fix error return code in fsl_mc_object_allocate()	Zhang Changzhong
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 197f4d6a4a00 ("staging: fsl-mc: fsl-mc object allocator driver") Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1607068967-31991-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	bus: fsl-mc: added missing fields to dprc_rsp_get_obj_region structure	Laurentiu Tudor
	'type' and 'flags' fields were missing from dprc_rsp_get_obj_region structure therefore the MC Bus driver was not receiving proper flags from MC like DPRC_REGION_CACHEABLE. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Cristian Sovaiala <cristian.sovaiala@freescale.com> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Link: https://lore.kernel.org/r/20201124111200.1391-1-laurentiu.tudor@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	bus: fsl-mc: make sure MC firmware is up and running	Laurentiu Tudor
	Some bootloaders might pause the MC firmware before starting the kernel to ensure that MC will not cause faults as soon as SMMU probes due to no configuration being in place for the firmware. Make sure that MC is resumed at probe time as its SMMU setup should be done by now. Also included, a comment fix on how PL and BMT bits are packed in the StreamID. Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Link: https://lore.kernel.org/r/20201105153050.19662-2-laurentiu.tudor@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	bus: fsl-mc: add back accidentally dropped error check	Laurentiu Tudor
	A previous patch accidentally dropped an error check, so add it back. Fixes: aef85b56c3c1 ("bus: fsl-mc: MC control registers are not always available") Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Link: https://lore.kernel.org/r/20201105153050.19662-1-laurentiu.tudor@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	serial: imx: Remove unneeded of_device_get_match_data() NULL check	Fabio Estevam
	Since 5.10-rc1 i.MX is a devicetree-only platform and the NULL check on of_device_get_match_data() is no longer needed. This check was only needed when this driver supported both DT and non-DT platforms. Remove the unneeded of_device_get_match_data() NULL check. Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20201126124643.3371-1-festevam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	soc: fix comment for freeing soc_dev_attr	Krzysztof Kozlowski
	The soc_dev_attr is stored soc_dev->attr during soc_device_register() so it could be used till the cleanup call: soc_device_unregister(). Therefore this memory should not be freed prior, but after unregistering soc device. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20201207185952.261697-1-krzk@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	MAINTAINERS: Mark SPMI as maintained	Stephen Boyd
	I can do more than just review patches here. The plan is to pick up patches from the list and shuttle them up to gregkh. The korg tree will be used to hold the pending patches. Move the list away from linux-arm-msm to just be linux-kernel as SPMI isn't msm specific anymore. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <linux-arm-msm@vger.kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20201207214204.1284946-1-sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	staging: olpc_dcon: Do not call platform_device_unregister() in dcon_probe()	Jing Xiangfeng
	In dcon_probe(), when platform_device_add() failes to add the device, it jumps to call platform_device_unregister() to remove the device, which is unnecessary. So use platform_device_put() instead. Fixes: 53c43c5ca133 ("Revert "Staging: olpc_dcon: Remove obsolete driver"") Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Link: https://lore.kernel.org/r/20201120074932.31871-1-jingxiangfeng@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	staging: most: Fix spelling mistake "tranceiver" -> "transceiver"	Colin Ian King
	There is a spelling mistake in the Kconfig help text. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201126224602.13878-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	vme: switch from 'pci_' to 'dma_' API	Christophe JAILLET
	The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'ca91cx42_alloc_consistent()' and 'tsi148_alloc_consistent()' GFP_KERNEL can be used because both functions are called only from 'vme_alloc_consistent()' (vme.c). This function is only called from the 'vme_user_probe()' probe function and no lock is taken in the between. When memory is allocated in 'ca91cx42_crcsr_init()' and 'tsi148_crcsr_init()' GFP_KERNEL can be used because both functions are called only from their corresponding probe function and no lock is taken in the between. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201206071352.21949-1-christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: pci_endpoint_test: fix return value of error branch	Xiongfeng Wang
	We return 'err' in the error branch, but this variable may be set as zero before. Fix it by setting 'err' as a negative value before we goto the error label. Fixes: e03327122e2c ("pci_endpoint_test: Add 2 ioctl commands") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Link: https://lore.kernel.org/r/1605790158-6780-1-git-send-email-wangxiongfeng2@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: genwqe: Use dma_set_mask_and_coherent to simplify code	Christophe JAILLET
	'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by an equivalent 'dma_set_mask_and_coherent()' which is much less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201201210147.7917-1-christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: rtsx: rts5249 support runtime PM	Ricky Wu
	rtsx_pcr: add callback functions to support runtime PM add delay_work to put device to D3 after idle over 10 sec rts5249: add extra init flow for rtd3 and set rtd3_en from config setting rtsx_pci_sdmmc: child device support autosuspend Signed-off-by: Ricky Wu <ricky_wu@realtek.com> Link: https://lore.kernel.org/r/20201202065857.19412-1-ricky_wu@realtek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: rtsx: modify and fix init_hw function	Ricky Wu
	changed rtsx_pci_disable_aspm() to rtsx_disable_aspm() do not access ASPM configuration directly changed pcie_capability_write_word() to _clear_and_set_word() make sure only change PCI_EXP_LNKCTL bit8 make sure ASPM disable after extra_init_hw() Signed-off-by: Ricky Wu <ricky_wu@realtek.com> Link: https://lore.kernel.org/r/20201202063228.18319-1-ricky_wu@realtek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: rtsx: modify en/disable aspm function	Ricky Wu
	enable/disable device ASPM function: changed write ASPM configuration directly to use write register Signed-off-by: Ricky Wu <ricky_wu@realtek.com> Link: https://lore.kernel.org/r/20201202063124.18262-1-ricky_wu@realtek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: vmw_vmci: fix kernel info-leak by initializing dbells in ↵	Anant Thazhemadam
	vmci_ctx_get_chkpt_doorbells() A kernel-infoleak was reported by syzbot, which was caused because dbells was left uninitialized. Using kzalloc() instead of kmalloc() fixes this issue. Reported-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Tested-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Link: https://lore.kernel.org/r/20201122224534.333471-1-anant.thazhemadam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc/sgi-xp: Replace in_interrupt() usage	Thomas Gleixner
	The usage of in_interrupt() in xpc_partition_disengaged() is clearly intended to avoid canceling the timeout timer when the function is invoked from the timer callback. While in_interrupt() is deprecated and ill defined as it does not provide what the name suggests it catches the intended case. Add an argument to xpc_partition_disengaged() which is true if called from timer and otherwise false. Use del_timer_sync() instead of del_singleshot_timer_sync() which is the same thing. Note: This does not prevent reentrancy into the function as the function has no concurrency control and timer callback and regular task context callers can happen concurrently on different CPUs or the timer can interrupt the task context before it is able to cancel it. While the only driver which is providing the arch_xpc_ops callbacks (xpc_uv) seems not to have a reentrancy problem and the only negative effect would be a double dev_info() entry in dmesg, the whole mechanism is conceptually broken. But that's not subject of this cleanup endeavour and left as an exercise to the folks who might have interest to make that code fully correct. [bigeasy: Add the argument, use del_timer_sync().] Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Cliff Whickman <cpw@sgi.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <robinmholt@gmail.com> Cc: Steve Wahl <steve.wahl@hpe.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20201119103151.ppo45mj53ulbxjx4@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	misc: isl29003: Fix typo for get/set mode	Zhou Xingxing
	The operation of get/set mode was same with get/set resolution. It is a typo absolutely. This patch updates these bits operated by get/set mode. Signed-off-by: Zhou Xingxing <zhou_x1@hoperun.com> Link: https://lore.kernel.org/r/1607334545-2091-1-git-send-email-zhou_x1@hoperun.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: platform: use bus_type functions	Uwe Kleine-König
	This works towards the goal mentioned in 2006 in commit 594c8281f905 ("[PATCH] Add bus_type probe, remove, shutdown methods."). The functions are moved to where the other bus_type functions are defined and renamed to match the already established naming scheme. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20201119124611.2573057-3-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: platform: change logic implementing platform_driver_probe	Uwe Kleine-König
	Instead of overwriting the core driver's probe function handle probing devices for drivers loaded by platform_driver_probe() in the platform driver probe function. The intended goal is to not have to change the probe function to simplify converting the platform bus to use bus functions. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20201119124611.2573057-2-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: platform: reorder functions	Uwe Kleine-König
	This way all callbacks and structures used to initialize platform_bus_type are defined just before platform_bus_type and in the same order. Also move platform_drv_probe_fail just before it's only user. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20201119124611.2573057-1-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: make driver_probe_device() static	Julian Wiedmann
	It's only used inside drivers/base/dd.c Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20201123111938.18968-1-jwi@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: Fix a couple of typos	Thierry Reding
	These were just some minor typos that have crept in recently and are easily fixed. Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20201127104630.1839171-1-thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	driver core: Reorder devices on successful probe	Thierry Reding
	Device drivers usually depend on the fact that the devices that they control are suspended in the same order that they were probed in. In most cases this is already guaranteed via deferred probe. However, there's one case where this can still break: if a device is instantiated before a dependency (for example if it appears before the dependency in device tree) but gets probed only after the dependency is probed. Instantiation order would cause the dependency to get probed later, in which case probe of the original device would be deferred and the suspend/resume queue would get reordered properly. However, if the dependency is provided by a built-in driver and the device depending on that driver is controlled by a loadable module, which may only get loaded after the root filesystem has become available, we can be faced with a situation where the probe order ends up being different from the suspend/resume order. One example where this happens is on Tegra186, where the ACONNECT is listed very early in device tree (sorted by unit-address) and depends on BPMP (listed very late because it has no unit-address) for power domains and clocks/resets. If the ACONNECT driver is built-in, there is no problem because it will be probed before BPMP, causing a probe deferral and that in turn reorders the suspend/resume queue. However, if built as a module, it will end up being probed after BPMP, and therefore not result in a probe deferral, and therefore the suspend/resume queue will stay in the instantiation order. This in turn causes problems because ACONNECT will be resumed before BPMP, which will result in a hang because the ACONNECT's power domain cannot be powered on as long as the BPMP is still suspended. Fix this by always reordering devices on successful probe. This ensures that the suspend/resume queue is always in probe order and hence meets the natural expectations of drivers vs. their dependencies. Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Acked-by: Rafael. J. Wysocki <rafael@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20201203175756.1405564-1-thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09	btrfs: scrub: allow scrub to work with subpage sectorsize	Qu Wenruo
	Since btrfs scrub is utilizing its own infrastructure to submit read/write, scrub is independent from all other routines. This brings one very neat feature, allow us to read 4K data into offset 0 of a 64K page. So is the writeback routine. This makes scrub on subpage sector size much easier to implement, and thanks to previous commits which just changed the implementation to always do scrub based on sector size, now scrub can handle subpage filesystem without any problem. This patch will just remove the restriction on (sectorsize != PAGE_SIZE), to make scrub finally work on subpage filesystems. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: scrub: support subpage data scrub	Qu Wenruo
	Btrfs scrub is more flexible than buffered data write path, as we can read an unaligned subpage data into page offset 0. This ability makes subpage support much easier, we just need to check each scrub_page::page_len and ensure we only calculate hash for [0, page_len) of a page. There is a small thing to notice: for subpage case, we still do sector by sector scrub. This means we will submit a read bio for each sector to scrub, resulting in the same amount of read bios, just like on the 4K page systems. This behavior can be considered as a good thing, if we want everything to be the same as 4K page systems. But this also means, we're wasting the possibility to submit larger bio using 64K page size. This is another problem to consider in the future. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: scrub: support subpage tree block scrub	Qu Wenruo
	To support subpage tree block scrub, scrub_checksum_tree_block() only needs to learn 2 new tricks: - Follow sector size Now scrub_page only represents one sector, we need to follow it properly. - Run checksum on all sectors Since scrub_page only represents one sector, we need to run checksum on all sectors, not only (nodesize >> PAGE_SIZE). Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: scrub: always allocate one full page for one sector for RAID56	Qu Wenruo
	For scrub_pages() and scrub_pages_for_parity(), we currently allocate one scrub_page structure for one page. This is fine if we only read/write one sector one time. But for cases like scrubbing RAID56, we need to read/write the full stripe, which is in 64K size for now. For subpage size, we will submit the read in just one page, which is normally a good thing, but for RAID56 case, it only expects to see one sector, not the full stripe in its endio function. This could lead to wrong parity checksum for RAID56 on subpage. To make the existing code work well for subpage case, here we take a shortcut by always allocating a full page for one sector. This should provide the base to make RAID56 work for subpage case. The cost is pretty obvious now, for one RAID56 stripe now we always need 16 pages. For support subpage situation (64K page size, 4K sector size), this means we need full one megabyte to scrub just one RAID56 stripe. And for data scrub, each 4K sector will also need one 64K page. This is mostly just a workaround, the proper fix for this is a much larger project, using scrub_block to replace scrub_page, and allow scrub_block to handle multi pages, csums, and csum_bitmap to avoid allocating one page for each sector. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: scrub: reduce width of extent_len/stripe_len from 64 to 32 bits	Qu Wenruo
	Btrfs on-disk format chose to use u64 for almost everything, but there are a other restrictions that won't let us use more than u32 for things like extent length (the maximum length is 128MiB for non-hole extents), or stripe length (we have device number limit). This means if we don't have extra handling to convert u64 to u32, we will always have some questionable operations like "u32 = u64 >> sectorsize_bits" in the code. This patch will try to address the problem by reducing the width for the following members/parameters: - scrub_parity::stripe_len - @len of scrub_pages() - @extent_len of scrub_remap_extent() - @len of scrub_parity_mark_sectors_error() - @len of scrub_parity_mark_sectors_data() - @len of scrub_extent() - @len of scrub_pages_for_parity() - @len of scrub_extent_for_parity() For members extracted from on-disk structure, like map->stripe_len, they will be kept as is. Since that modification would require on-disk format change. There will be cases like "u32 = u64 - u64" or "u32 = u64", for such call sites, extra ASSERT() is added to be extra safe for debug builds. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: refactor btrfs_lookup_bio_sums to handle out-of-order bvecs	Qu Wenruo
	Refactor btrfs_lookup_bio_sums() by: - Remove the @file_offset parameter There are two factors making the @file_offset parameter useless: * For csum lookup in csum tree, file offset makes no sense We only need disk_bytenr, which is unrelated to file_offset * page_offset (file offset) of each bvec is not contiguous. Pages can be added to the same bio as long as their on-disk bytenr is contiguous, meaning we could have pages at different file offsets in the same bio. Thus passing file_offset makes no sense any more. The only user of file_offset is for data reloc inode, we will use a new function, search_file_offset_in_bio(), to handle it. - Extract the csum tree lookup into search_csum_tree() The new function will handle the csum search in csum tree. The return value is the same as btrfs_find_ordered_sum(), returning the number of found sectors which have checksum. - Change how we do the main loop The only needed info from bio is: * the on-disk bytenr * the length After extracting the above info, we can do the search without bio at all, which makes the main loop much simpler: for (cur_disk_bytenr = orig_disk_bytenr; cur_disk_bytenr < orig_disk_bytenr + orig_len; cur_disk_bytenr += count * sectorsize) { /* Lookup csum tree / count = search_csum_tree(fs_info, path, cur_disk_bytenr, search_len, csum_dst); if (!count) { / Csum hole handling */ } } - Use single variable as the source to calculate all other offsets Instead of all different type of variables, we use only one main variable, cur_disk_bytenr, which represents the current disk bytenr. All involved values can be calculated from that variable, and all those variable will only be visible in the inner loop. The above refactoring makes btrfs_lookup_bio_sums() way more robust than it used to be, especially related to the file offset lookup. Now file_offset lookup is only related to data reloc inode, otherwise we don't need to bother file_offset at all. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: remove btrfs_find_ordered_sum call from btrfs_lookup_bio_sums	Qu Wenruo
	The function btrfs_lookup_bio_sums() is only called for read bios. While btrfs_find_ordered_sum() is to search ordered extent sums, which is only for write path. This means to read a page we either: - Submit read bio if it's not uptodate This means we only need to search csum tree for checksums. - The page is already uptodate It can be marked uptodate for previous read, or being marked dirty. As we always mark page uptodate for dirty page. In that case, we don't need to submit read bio at all, thus no need to search any checksums. Remove the btrfs_find_ordered_sum() call in btrfs_lookup_bio_sums(). And since btrfs_lookup_bio_sums() is the only caller for btrfs_find_ordered_sum(), also remove the implementation. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors	Qu Wenruo
	To support sectorsize < PAGE_SIZE case, we need to take extra care of extent buffer accessors. Since sectorsize is smaller than PAGE_SIZE, one page can contain multiple tree blocks, we must use eb->start to determine the real offset to read/write for extent buffer accessors. This patch introduces two helpers to do this: - get_eb_page_index() This is to calculate the index to access extent_buffer::pages. It's just a simple wrapper around "start >> PAGE_SHIFT". For sectorsize == PAGE_SIZE case, nothing is changed. For sectorsize < PAGE_SIZE case, we always get index as 0, and the existing page shift also works. - get_eb_offset_in_page() This is to calculate the offset to access extent_buffer::pages. This needs to take extent_buffer::start into consideration. For sectorsize == PAGE_SIZE case, extent_buffer::start is always aligned to PAGE_SIZE, thus adding extent_buffer::start to offset_in_page() won't change the result. For sectorsize < PAGE_SIZE case, adding extent_buffer::start gives us the correct offset to access. This patch will touch the following parts to cover all extent buffer accessors: - BTRFS_SETGET_HEADER_FUNCS() - read_extent_buffer() - read_extent_buffer_to_user() - memcmp_extent_buffer() - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer_full() - copy_extent_buffer() - memcpy_extent_buffer() - memmove_extent_buffer() - btrfs_get_token_##bits() - btrfs_get_##bits() - btrfs_set_token_##bits() - btrfs_set_##bits() - generic_bin_search() Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: update num_extent_pages to support subpage sized extent buffer	Qu Wenruo
	For subpage sized extent buffer, we have ensured no extent buffer will cross page boundary, thus we would only need one page for any extent buffer. Update function num_extent_pages to handle such case. Now num_extent_pages() returns 1 for subpage sized extent buffer. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: don't allow tree block to cross page boundary for subpage support	Qu Wenruo
	As a preparation for subpage sector size support (allowing filesystem with sector size smaller than page size to be mounted) if the sector size is smaller than page size, we don't allow tree block to be read if it crosses 64K(*) boundary. The 64K is selected because: - we are only going to support 64K page size for subpage for now - 64K is also the maximum supported node size This ensures that tree blocks are always contained in one page for a system with 64K page size, which can greatly simplify the handling. Otherwise we would have to do complex multi-page handling of tree blocks. Currently there is no way to create such tree blocks. In kernel we have avoided such tree blocks allocation even on 4K page size, as it can lead to RAID56 stripe scrubbing. While btrfs-progs have fixed its chunk allocator since 2016 for convert, and has extra checks to do the same behavior as the kernel. Just add such graceful checks in case of an ancient filesystem. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: calculate inline extent buffer page size based on page size	Qu Wenruo
	Btrfs only support 64K as maximum node size, thus for 4K page system, we would have at most 16 pages for one extent buffer. For a system using 64K page size, we would really have just one page. While we always use 16 pages for extent_buffer::pages, this means for systems using 64K pages, we are wasting memory for 15 page pointers which will never be used. Calculate the array size based on page size and the node size maximum. - for systems using 4K page size, it will stay 16 pages - for systems using 64K page size, it will be 1 page Move the definition of BTRFS_MAX_METADATA_BLOCKSIZE to btrfs_tree.h, to avoid circular inclusion of ctree.h. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: factor out btree page submission code to a helper	Qu Wenruo
	In btree_write_cache_pages() we have a btree page submission routine buried deeply in a nested loop. This patch will extract that part of code into a helper function, submit_eb_page(), to do the same work. Since submit_eb_page() now can return >0 for successful extent buffer submission, remove the "ASSERT(ret <= 0);" line. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: make btrfs_verify_data_csum follow sector size	Qu Wenruo
	Currently btrfs_verify_data_csum() just passes the whole page to check_data_csum(), which is fine since we only support sectorsize == PAGE_SIZE. To support subpage, we need to properly honor per-sector checksum verification, just like what we did in dio read path. This patch will do the csum verification in a for loop, starts with pg_off == start - page_offset(page), with sectorsize increase for each loop. For sectorsize == PAGE_SIZE case, the pg_off will always be 0, and we will only loop once. For subpage case, we do the iterate over each sector and if we found any error, we return error. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: pass bio_offset to check_data_csum() directly	Qu Wenruo
	Parameter icsum for check_data_csum() is a little hard to understand. So is the phy_offset for btrfs_verify_data_csum(). Both parameters are calculated values for csum lookup. Instead of some calculated value, just pass bio_offset and let the final and only user, check_data_csum(), calculate whatever it needs. Since we are here, also make the bio_offset parameter and some related variables to be u32 (unsigned int). As bio size is limited by its bi_size, which is unsigned int, and has extra size limit check during various bio operations. Thus we are ensured that bio_offset won't overflow u32. Thus for all involved functions, not only rename the parameter from @phy_offset to @bio_offset, but also reduce its width to u32, so we won't have suspicious "u32 = u64 >> sector_bits;" lines anymore. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: rename bio_offset of extent_submit_bio_start_t to dio_file_offset	Qu Wenruo
	The parameter bio_offset of extent_submit_bio_start_t is very confusing. If it's really bio_offset (offset to bio), then it should be u32. But in fact, it's only utilized by dio read, and that member is used as file offset, which must be u64. Rename it to dio_file_offset since the only user uses it as file offset, and add comment for who is using it. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: fix lockdep warning when creating free space tree	Boris Burkov
	A lock dependency loop exists between the root tree lock, the extent tree lock, and the free space tree lock. The root tree lock depends on the free space tree lock because btrfs_create_tree holds the new tree's lock while adding it to the root tree. The extent tree lock depends on the root tree lock because during umount, we write out space cache v1, which writes inodes in the root tree, which results in holding the root tree lock while doing a lookup in the extent tree. Finally, the free space tree depends on the extent tree because populate_free_space_tree holds a locked path in the extent tree and then does a lookup in the free space tree to add the new item. The simplest of the three to break is the one during tree creation: we unlock the leaf before inserting the tree node into the root tree, which fixes the lockdep warning. [30.480136] ====================================================== [30.480830] WARNING: possible circular locking dependency detected [30.481457] 5.9.0-rc8+ #76 Not tainted [30.481897] ------------------------------------------------------ [30.482500] mount/520 is trying to acquire lock: [30.483064] ffff9babebe03908 (btrfs-free-space-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 [30.484054] but task is already holding lock: [30.484637] ffff9babebe24468 (btrfs-extent-01#2){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 [30.485581] which lock already depends on the new lock. [30.486397] the existing dependency chain (in reverse order) is: [30.487205] -> #2 (btrfs-extent-01#2){++++}-{3:3}: [30.487825] down_read_nested+0x43/0x150 [30.488306] __btrfs_tree_read_lock+0x39/0x180 [30.488868] __btrfs_read_lock_root_node+0x3a/0x50 [30.489477] btrfs_search_slot+0x464/0x9b0 [30.490009] check_committed_ref+0x59/0x1d0 [30.490603] btrfs_cross_ref_exist+0x65/0xb0 [30.491108] run_delalloc_nocow+0x405/0x930 [30.491651] btrfs_run_delalloc_range+0x60/0x6b0 [30.492203] writepage_delalloc+0xd4/0x150 [30.492688] __extent_writepage+0x18d/0x3a0 [30.493199] extent_write_cache_pages+0x2af/0x450 [30.493743] extent_writepages+0x34/0x70 [30.494231] do_writepages+0x31/0xd0 [30.494642] __filemap_fdatawrite_range+0xad/0xe0 [30.495194] btrfs_fdatawrite_range+0x1b/0x50 [30.495677] __btrfs_write_out_cache+0x40d/0x460 [30.496227] btrfs_write_out_cache+0x8b/0x110 [30.496716] btrfs_start_dirty_block_groups+0x211/0x4e0 [30.497317] btrfs_commit_transaction+0xc0/0xba0 [30.497861] sync_filesystem+0x71/0x90 [30.498303] btrfs_remount+0x81/0x433 [30.498767] reconfigure_super+0x9f/0x210 [30.499261] path_mount+0x9d1/0xa30 [30.499722] do_mount+0x55/0x70 [30.500158] __x64_sys_mount+0xc4/0xe0 [30.500616] do_syscall_64+0x33/0x40 [30.501091] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [30.501629] -> #1 (btrfs-root-00){++++}-{3:3}: [30.502241] down_read_nested+0x43/0x150 [30.502727] __btrfs_tree_read_lock+0x39/0x180 [30.503291] __btrfs_read_lock_root_node+0x3a/0x50 [30.503903] btrfs_search_slot+0x464/0x9b0 [30.504405] btrfs_insert_empty_items+0x60/0xa0 [30.504973] btrfs_insert_item+0x60/0xd0 [30.505412] btrfs_create_tree+0x1b6/0x210 [30.505913] btrfs_create_free_space_tree+0x54/0x110 [30.506460] btrfs_mount_rw+0x15d/0x20f [30.506937] btrfs_remount+0x356/0x433 [30.507369] reconfigure_super+0x9f/0x210 [30.507868] path_mount+0x9d1/0xa30 [30.508264] do_mount+0x55/0x70 [30.508668] __x64_sys_mount+0xc4/0xe0 [30.509186] do_syscall_64+0x33/0x40 [30.509652] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [30.510271] -> #0 (btrfs-free-space-00){++++}-{3:3}: [30.510972] __lock_acquire+0x11ad/0x1b60 [30.511432] lock_acquire+0xa2/0x360 [30.511917] down_read_nested+0x43/0x150 [30.512383] __btrfs_tree_read_lock+0x39/0x180 [30.512947] __btrfs_read_lock_root_node+0x3a/0x50 [30.513455] btrfs_search_slot+0x464/0x9b0 [30.513947] search_free_space_info+0x45/0x90 [30.514465] __add_to_free_space_tree+0x92/0x39d [30.515010] btrfs_create_free_space_tree.cold.22+0x1ee/0x45d [30.515639] btrfs_mount_rw+0x15d/0x20f [30.516142] btrfs_remount+0x356/0x433 [30.516538] reconfigure_super+0x9f/0x210 [30.517065] path_mount+0x9d1/0xa30 [30.517438] do_mount+0x55/0x70 [30.517824] __x64_sys_mount+0xc4/0xe0 [30.518293] do_syscall_64+0x33/0x40 [30.518776] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [30.519335] other info that might help us debug this: [30.520210] Chain exists of: btrfs-free-space-00 --> btrfs-root-00 --> btrfs-extent-01#2 [30.521407] Possible unsafe locking scenario: [30.522037] CPU0 CPU1 [30.522456] ---- ---- [30.522941] lock(btrfs-extent-01#2); [30.523311] lock(btrfs-root-00); [30.523952] lock(btrfs-extent-01#2); [30.524620] lock(btrfs-free-space-00); [30.525068] * DEADLOCK * [30.525669] 5 locks held by mount/520: [30.526116] #0: ffff9babebc520e0 (&type->s_umount_key#37){+.+.}-{3:3}, at: path_mount+0x7ef/0xa30 [30.527056] #1: ffff9babebc52640 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x3d5/0x5c0 [30.527960] #2: ffff9babeae8f2e8 (&cache->free_space_lock#2){+.+.}-{3:3}, at: btrfs_create_free_space_tree.cold.22+0x101/0x45d [30.529118] #3: ffff9babebe24468 (btrfs-extent-01#2){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 [30.530113] #4: ffff9babebd52eb8 (btrfs-extent-00){++++}-{3:3}, at: btrfs_try_tree_read_lock+0x16/0x100 [30.531124] stack backtrace: [30.531528] CPU: 0 PID: 520 Comm: mount Not tainted 5.9.0-rc8+ #76 [30.532166] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-4.module_el8.1.0+248+298dec18 04/01/2014 [30.533215] Call Trace: [30.533452] dump_stack+0x8d/0xc0 [30.533797] check_noncircular+0x13c/0x150 [30.534233] __lock_acquire+0x11ad/0x1b60 [30.534667] lock_acquire+0xa2/0x360 [30.535063] ? __btrfs_tree_read_lock+0x39/0x180 [30.535525] down_read_nested+0x43/0x150 [30.535939] ? __btrfs_tree_read_lock+0x39/0x180 [30.536400] __btrfs_tree_read_lock+0x39/0x180 [30.536862] __btrfs_read_lock_root_node+0x3a/0x50 [30.537304] btrfs_search_slot+0x464/0x9b0 [30.537713] ? trace_hardirqs_on+0x1c/0xf0 [30.538148] search_free_space_info+0x45/0x90 [30.538572] __add_to_free_space_tree+0x92/0x39d [30.539071] ? printk+0x48/0x4a [30.539367] btrfs_create_free_space_tree.cold.22+0x1ee/0x45d [30.539972] btrfs_mount_rw+0x15d/0x20f [30.540350] btrfs_remount+0x356/0x433 [30.540773] ? shrink_dcache_sb+0xd9/0x100 [30.541203] reconfigure_super+0x9f/0x210 [30.541642] path_mount+0x9d1/0xa30 [30.542040] do_mount+0x55/0x70 [30.542366] __x64_sys_mount+0xc4/0xe0 [30.542822] do_syscall_64+0x33/0x40 [30.543197] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [30.543691] RIP: 0033:0x7f109f7ab93a [30.546042] RSP: 002b:00007ffc47c4f858 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [30.546770] RAX: ffffffffffffffda RBX: 00007f109f8cf264 RCX: 00007f109f7ab93a [30.547485] RDX: 0000557e6fc10770 RSI: 0000557e6fc19cf0 RDI: 0000557e6fc19cd0 [30.548185] RBP: 0000557e6fc10520 R08: 0000557e6fc18e30 R09: 0000557e6fc18cb0 [30.548911] R10: 0000000000200020 R11: 0000000000000246 R12: 0000000000000000 [30.549606] R13: 0000557e6fc19cd0 R14: 0000557e6fc10770 R15: 0000557e6fc10520 Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: skip space_cache v1 setup when not using it	Boris Burkov
	If we are not using space cache v1, we should not create the free space object or free space inodes. This comes up when we delete the existing free space objects/inodes when migrating to v2, only to see them get recreated for every dirtied block group. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09	btrfs: remove free space items when disabling space cache v1	Boris Burkov
	When the filesystem transitions from space cache v1 to v2 or to nospace_cache, it removes the old cached data, but does not remove the FREE_SPACE items nor the free space inodes they point to. This doesn't cause any issues besides being a bit inefficient, since these items no longer do anything useful. To fix it, when we are mounting, and plan to disable the space cache, destroy each block group's free space item and free space inode. The code to remove the items is lifted from the existing use case of removing the block group, with a light adaptation to handle whether or not we have already looked up the free space inode. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>