linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2021-10-26	skmsg: Extract and reuse sk_msg_is_readable()	Cong Wang
	tcp_bpf_sock_is_readable() is pretty much generic, we can extract it and reuse it for non-TCP sockets. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-3-xiyou.wangcong@gmail.com
2021-10-26	net: Rename ->stream_memory_read to ->sock_is_readable	Cong Wang
	The proto ops ->stream_memory_read() is currently only used by TCP to check whether psock queue is empty or not. We need to rename it before reusing it for non-TCP protocols, and adjust the exsiting users accordingly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
2021-10-26	tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function	Liu Jian
	With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or multiple cores) on a single socket 'sk' we could get the following flow. msgA, sk msgB, sk ----------- --------------- tcp_bpf_sendmsg() lock(sk) psock = sk->psock tcp_bpf_sendmsg() lock(sk) ... blocking tcp_bpf_send_verdict if (psock->eval == NONE) psock->eval = sk_psock_msg_verdict .. < handle SK_REDIRECT case > release_sock(sk) < lock dropped so grab here > ret = tcp_bpf_sendmsg_redir psock = sk->psock tcp_bpf_send_verdict lock_sock(sk) ... blocking on B if (psock->eval == NONE) <- boom. psock->eval will have msgA state The problem here is we dropped the lock on msgA and grabbed it with msgB. Now we have old state in psock and importantly psock->eval has not been cleared. So msgB will run whatever action was done on A and the verdict program may never see it. Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211012052019.184398-1-liujian56@huawei.com
2021-10-26	x86: Fix __get_wchan() for !STACKTRACE	Peter Zijlstra
	Use asm/unwind.h to implement wchan, since we cannot always rely on STACKTRACE=y. Fixes: bc9bbb81730e ("x86: Fix get_wchan() to support the ORC unwinder") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20211022152104.137058575@infradead.org
2021-10-26	spi: spi-rpc-if: Check return value of rpcif_sw_init()	Lad Prabhakar
	rpcif_sw_init() can fail so make sure we check the return value of it and on error exit rpcif_spi_probe() callback with error code. Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20211025205631.21151-4-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26	spi: tegra210-quad: Put device into suspend on driver removal	Dmitry Osipenko
	pm_runtime_disable() cancels all pending power requests, while they should be completed for the Tegra SPI driver. Otherwise SPI clock won't be disabled ever again because clk refcount will become unbalanced. Enforce runtime PM suspension to put device into expected state before driver is unbound and device's RPM state is reset by driver's core. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/20211023225951.14253-2-digetx@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26	spi: tegra20-slink: Put device into suspend on driver removal	Dmitry Osipenko
	pm_runtime_disable() cancels all pending power requests, while they should be completed for the Tegra SPI driver. Otherwise SPI clock won't be disabled ever again because clk refcount will become unbalanced. Enforce runtime PM suspension to put device into expected state before driver is unbound and device's RPM state is reset by driver's core. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/20211023225951.14253-1-digetx@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26	spi: bcm-qspi: Fix missing clk_disable_unprepare() on error in bcm_qspi_probe()	Yang Yingliang
	Fix the missing clk_disable_unprepare() before return from bcm_qspi_probe() in the error handling case. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20211018073413.2029081-1-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26	Merge series "ASoC: cs42l42: Fixes to power-down" from Richard Fitzgerald ↵	Mark Brown
	<rf@opensource.cirrus.com>: Driver probe and remove were inconsistent in what they did to power-down and neither did all steps. In addition to that, neither function prevented the interrupt handler from running during and after power-down. Richard Fitzgerald (2): ASoC: cs42l42: Reset and power-down on remove() and failed probe() ASoC: cs42l42: free_irq() before powering-down on probe() fail sound/soc/codecs/cs42l42.c \| 43 +++++++++++++++++++++++++++++++------------ 1 file changed, 31 insertions(+), 12 deletions(-) -- 2.11.0
2021-10-26	Merge series "Update Lpass digital codec macro drivers" from Srinivasa Rao ↵	Mark Brown
	Mandadapu <srivasam@codeaurora.org>: This patch set is to add support for lpass sc7280 based targets. Upadate compatible name and change of bulk clock voting to optional clock voting in digital codecs va, rx, tx macro drivers. Changes Since V3: -- Removed fixes tag. -- Change signedoff by sequence. Changes Since V2: -- Add Tx macro deafults for lpass sc7280 Changes Since V1: -- Removed individual clock voting and used bulk clock optional. -- Removed volatile changes and fixed default values. -- Typo errors. Srinivasa Rao Mandadapu (5): ASoC: qcom: Add compatible names in va,wsa,rx,tx codec drivers for sc7280 ASoC: qcom: dt-bindings: Add compatible names for lpass sc7280 digital codecs ASoC: codecs: tx-macro: Enable tx top soundwire mic clock ASoC: codecs: tx-macro: Update tx default values ASoC: codecs: Change bulk clock voting to optional voting in digital codecs .../bindings/sound/qcom,lpass-rx-macro.yaml \| 4 +++- .../bindings/sound/qcom,lpass-tx-macro.yaml \| 4 +++- .../bindings/sound/qcom,lpass-va-macro.yaml \| 4 +++- .../bindings/sound/qcom,lpass-wsa-macro.yaml \| 4 +++- sound/soc/codecs/lpass-rx-macro.c \| 3 ++- sound/soc/codecs/lpass-tx-macro.c \| 25 +++++++++++++++++++--- sound/soc/codecs/lpass-va-macro.c \| 3 ++- sound/soc/codecs/lpass-wsa-macro.c \| 1 + 8 files changed, 39 insertions(+), 9 deletions(-) -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
2021-10-26	Merge series "ASoC: qcom: Add AudioReach support" from Srinivas Kandagatla ↵	Mark Brown
	<srinivas.kandagatla@linaro.org>: Hi Mark, This version is a respin of v10 fixing a build error in 12/17 patch. QCOM SoC relevant non-audio patches in this series has been merged into the Qualcomm drivers-for-5.16 tree, as this series depends those patches an immutable tag is available at: https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git tags/20210927135559.738-6-srinivas.kandagatla@linaro.org This patchset adds ASoC driver support to configure signal processing framework ("AudioReach") which is integral part of Qualcomm next generation audio SDK and will be deployed on upcoming Qualcomm chipsets. It makes use of ASoC Topology to load graphs on to the DSP which is then managed by APM (Audio Processing Manager) service to prepare/start/stop. Here is simplified high-level block diagram of AudioReach: ___________________________________________________________ \| CPU (Application Processor) \| \| +---------+ +---------+ +----------+ \| \| \| q6apm \| \| q6apm \| \| q6apm \| \| \| \| dais \| <------> \| \| <-----> \|lpass-dais\| \| \| +---------+ +---------+ +----------+ \| \| ^ ^ \| \| \| \| +---------+ \| \| +---------+ v +---------->\|topology \| \| \| \| q6prm \| +---------+ \| \| \| \| \| \|<-------->\| GPR \| +---------+ \| \| +---------+ +---------+ \| \| ^ ^ \| \| \| \| \| \| +----------+ \| \| \| \| q6prm \| \| \| \| \|lpass-clks\| \| \| \| +----------+ \| \| \|____________________________\|______________________________\| \| \| RPMSG (IPC over GLINK) ____________________________\|______________________________ \| \| \| \| +-----------------------+ \| \| \| \| \| \| v v q6 (Audio DSP) \| \|+-----+ +----------------------------------+ \| \|\| PRM \| \| APM (Audio Processing Manager) \| \| \|+-----+ \| . Graph Management \| \| \| \| . Command Handing \| \| \| \| . Event Management \| \| \| \| ... \| \| \| +----------------------------------+ \| \| ^ \| \|____________________________\|______________________________\| \| \| LPASS AIF ____________________________\|______________________________ \| \| Audio I/O \| \| v \| \| +--------------------------------------------------+ \| \| \| Audio devices \| \| \| \| CODEC \| HDMI-TX \| PCM \| SLIMBUS \| I2S \|MI2S \|...\| \| \| \| \| \| \| +--------------------------------------------------+ \| \|___________________________________________________________\| AudioReach has constructs of sub-graph, container and modules. Each sub-graph can have N containers and each Container can have N Modules and connections between them can be linear or non-linear. An audio function can be realized with one or many connected sub-graphs. There are also control/event paths between modules that can be wired up while building graph to achieve various control mechanism between modules. These concepts of Sub-Graph, Containers and Modules are represented in ASoC topology. Here is simple I2S graph with a Write Shared Memory and a Volume control module within a single Subgraph (1) with one Container (1) and 5 modules. ____________________________________________________________ \| Sub-Graph [1] \| \| _______________________________________________________ \| \| \| Container [1] \| \| \| \| [WR_SH] -> [PCM DEC] -> [PCM CONV] -> [VOL]-> [I2S-EP]\| \| \| \|_______________________________________________________\| \| \|____________________________________________________________\| For now this graph is split into two subgraphs to achieve dpcm like below: ________________________________________________ _________________ \| Sub-Graph [1] \| \| Sub-Graph [2] \| \| ____________________________________________ \| \| _____________ \| \| \| Container [1] \| \| \| \|Container [2]\| \| \| \| [WR_SH] -> [PCM DEC] -> [PCM CONV] -> [VOL]\| \| \| \| [I2S-EP] \| \| \| \|____________________________________________\| \| \| \|_____________\| \| \|________________________________________________\| \|_________________\| _________________ \| Sub-Graph [3] \| \| _____________ \| \| \|Container [3]\| \| \| \| [DMA-EP] \| \| \| \|_____________\| \| \|_________________\| This patchset adds very minimal support for AudioReach which includes supporting sub-graphs containing CODEC DMA ports and simple PCM Decoder/Encoder and Logger Modules. Additional capabilities will be built over time to expose features offered by AudioReach. This patchset is Tested on SM8250 SoC based Qualcomm Robotics Platform RB5 and SM9250 MTP with WSA881X Smart Speaker Amplifiers, DMICs connected via VA Macro and WCD938x Codec connected via TX and RX Macro and HDMI audio via I2S. First 10 Patches are mostly reorganization existing Old QDSP Audio Framework code and bindings so that we could reuse them on AudioReach. ASoC topology graphs for DragonBoard RB5 and SM8250 MTP are available at https://git.linaro.org/people/srinivas.kandagatla/audioreach-topology.git/ and Qualcomm AudioReach DSP headers are available at: https://source.codeaurora.org/quic/la/platform/vendor/opensource/arspf-headers Note: There is one false positive warning in this patchset: audioreach.c:80:45: warning: array of flexible structures Thanks, srini Changes since v10: - fix build error during arm64 defconfig build reported by Mark in 12/17 patch for audioreach_tplg_init symbol Srinivas Kandagatla (17): ASoC: dt-bindings: move LPASS dai related bindings out of q6afe ASoC: dt-bindings: move LPASS clocks related bindings out of q6afe ASoC: dt-bindings: rename q6afe.h to q6dsp-lpass-ports.h ASoC: qdsp6: q6afe-dai: move lpass audio ports to common file ASoC: qdsp6: q6afe-clocks: move audio-clocks to common file ASoC: dt-bindings: q6dsp: add q6apm-lpass-dai compatible ASoC: dt-bindings: lpass-clocks: add q6prm clocks compatible ASoC: dt-bindings: add q6apm digital audio stream bindings ASoC: qdsp6: audioreach: add basic pkt alloc support ASoC: qdsp6: audioreach: add q6apm support ASoC: qdsp6: audioreach: add module configuration command helpers ASoC: qdsp6: audioreach: add Kconfig and Makefile ASoC: qdsp6: audioreach: add topology support ASoC: qdsp6: audioreach: add q6apm-dai support ASoC: qdsp6: audioreach: add q6apm lpass dai support ASoC: qdsp6: audioreach: add q6prm support ASoC: qdsp6: audioreach: add support for q6prm-clocks .../devicetree/bindings/sound/qcom,q6afe.txt \| 181 --- .../bindings/sound/qcom,q6apm-dai.yaml \| 53 + .../sound/qcom,q6dsp-lpass-clocks.yaml \| 77 ++ .../sound/qcom,q6dsp-lpass-ports.yaml \| 205 +++ include/dt-bindings/sound/qcom,q6afe.h \| 203 +-- .../sound/qcom,q6dsp-lpass-ports.h \| 208 +++ include/uapi/sound/snd_ar_tokens.h \| 208 +++ sound/soc/qcom/Kconfig \| 22 + sound/soc/qcom/qdsp6/Makefile \| 11 +- sound/soc/qcom/qdsp6/audioreach.c \| 1130 +++++++++++++++++ sound/soc/qcom/qdsp6/audioreach.h \| 726 +++++++++++ sound/soc/qcom/qdsp6/q6afe-clocks.c \| 187 +-- sound/soc/qcom/qdsp6/q6afe-dai.c \| 687 +--------- sound/soc/qcom/qdsp6/q6apm-dai.c \| 416 ++++++ sound/soc/qcom/qdsp6/q6apm-lpass-dais.c \| 260 ++++ sound/soc/qcom/qdsp6/q6apm.c \| 822 ++++++++++++ sound/soc/qcom/qdsp6/q6apm.h \| 152 +++ sound/soc/qcom/qdsp6/q6dsp-lpass-clocks.c \| 186 +++ sound/soc/qcom/qdsp6/q6dsp-lpass-clocks.h \| 30 + sound/soc/qcom/qdsp6/q6dsp-lpass-ports.c \| 627 +++++++++ sound/soc/qcom/qdsp6/q6dsp-lpass-ports.h \| 22 + sound/soc/qcom/qdsp6/q6prm-clocks.c \| 85 ++ sound/soc/qcom/qdsp6/q6prm.c \| 202 +++ sound/soc/qcom/qdsp6/q6prm.h \| 78 ++ sound/soc/qcom/qdsp6/topology.c \| 1113 ++++++++++++++++ 25 files changed, 6664 insertions(+), 1227 deletions(-) create mode 100644 Documentation/devicetree/bindings/sound/qcom,q6apm-dai.yaml create mode 100644 Documentation/devicetree/bindings/sound/qcom,q6dsp-lpass-clocks.yaml create mode 100644 Documentation/devicetree/bindings/sound/qcom,q6dsp-lpass-ports.yaml create mode 100644 include/dt-bindings/sound/qcom,q6dsp-lpass-ports.h create mode 100644 include/uapi/sound/snd_ar_tokens.h create mode 100644 sound/soc/qcom/qdsp6/audioreach.c create mode 100644 sound/soc/qcom/qdsp6/audioreach.h create mode 100644 sound/soc/qcom/qdsp6/q6apm-dai.c create mode 100644 sound/soc/qcom/qdsp6/q6apm-lpass-dais.c create mode 100644 sound/soc/qcom/qdsp6/q6apm.c create mode 100644 sound/soc/qcom/qdsp6/q6apm.h create mode 100644 sound/soc/qcom/qdsp6/q6dsp-lpass-clocks.c create mode 100644 sound/soc/qcom/qdsp6/q6dsp-lpass-clocks.h create mode 100644 sound/soc/qcom/qdsp6/q6dsp-lpass-ports.c create mode 100644 sound/soc/qcom/qdsp6/q6dsp-lpass-ports.h create mode 100644 sound/soc/qcom/qdsp6/q6prm-clocks.c create mode 100644 sound/soc/qcom/qdsp6/q6prm.c create mode 100644 sound/soc/qcom/qdsp6/q6prm.h create mode 100644 sound/soc/qcom/qdsp6/topology.c -- 2.21.0
2021-10-26	drm: panel-orientation-quirks: Add quirk for GPD Win3	Mario
	Fixes screen orientation for GPD Win 3 handheld gaming console. Signed-off-by: Mario Risoldi <awxkrnl@gmail.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211026112737.9181-1-awxkrnl@gmail.com
2021-10-26	watchdog: Fix OMAP watchdog early handling	Walter Stoll
	TI's implementation does not service the watchdog even if the kernel command line parameter omap_wdt.early_enable is set to 1. This patch fixes the issue. Signed-off-by: Walter Stoll <walter.stoll@duagon.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/88a8fe5229cd68fa0f1fd22f5d66666c1b7057a0.camel@duagon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26	watchdog: ixp4xx_wdt: Fix address space warning	Guenter Roeck
	sparse reports the following address space warning. drivers/watchdog/ixp4xx_wdt.c:122:20: sparse: incorrect type in assignment (different address spaces) drivers/watchdog/ixp4xx_wdt.c:122:20: sparse: expected void [noderef] __iomem base drivers/watchdog/ixp4xx_wdt.c:122:20: sparse: got void platform_data Add a typecast to solve the problem. Fixes: 21a0a29d16c6 ("watchdog: ixp4xx: Rewrite driver to use core") Cc: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210911042925.556889-1-linux@roeck-us.net Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26	watchdog: sbsa: drop unneeded MODULE_ALIAS	Krzysztof Kozlowski
	The MODULE_DEVICE_TABLE already creates proper alias for platform driver. Having another MODULE_ALIAS causes the alias to be duplicated. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210917092024.19323-1-krzysztof.kozlowski@canonical.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26	watchdog: sbsa: only use 32-bit accessors	Jamie Iles
	SBSA says of the generic watchdog: All registers are 32 bits in size and should be accessed using 32-bit reads and writes. If an access size other than 32 bits is used then the results are IMPLEMENTATION DEFINED. and for qemu, the implementation will only allow 32-bit accesses resulting in a synchronous external abort when configuring the watchdog. Use lo_hi_* accessors rather than a readq/writeq. Fixes: abd3ac7902fb ("watchdog: sbsa: Support architecture version 1") Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20210903112101.493552-1-quic_jiles@quicinc.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26	Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"	Guenter Roeck
	This reverts commit cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on second timeout") and commit aec42642d91f ("watchdog: iTCO_wdt: Fix detection of SMI-off case") since those patches cause a regression on certain boards (https://bugzilla.kernel.org/show_bug.cgi?id=213809). While this revert may result in some boards to only reset after twice the configured timeout value, that is still better than a watchdog reset after half the configured value. Fixes: cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on second timeout") Fixes: aec42642d91f ("watchdog: iTCO_wdt: Fix detection of SMI-off case") Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Mantas Mikulėnas <grawity@gmail.com> Reported-by: Javier S. Pedro <debbugs@javispedro.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20211008003302.1461733-1-linux@roeck-us.net Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2021-10-26	qcom: spm: allow compile-testing	Arnd Bergmann
	ARM_QCOM_SPM_CPUIDLE can be selected when compile-testing on other architectures, but this causes a Kconfig warning for QCOM_SPM: WARNING: unmet direct dependencies detected for QCOM_SPM Depends on [n]: ARCH_QCOM [=n] Selected by [y]: - ARM_QCOM_SPM_CPUIDLE [=y] && CPU_IDLE [=y] && (ARM [=y] \|\| ARM64) && (ARCH_QCOM [=n] \|\| COMPILE_TEST [=y]) && !ARM64 && MMU [=y] Make it possible to also compile-test this one, which can be done now that v5.15-rc5 lets you select QCOM_SCM everywhere. Fixes: a871be6b8eee ("cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver") Fixes: 498ba2a8a275 ("cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration") Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-26	pinctrl: tegra: Fix warnings and error	Prathamesh Shete
	Fix warnings are errors caused by commit a42c7d95d29e ("pinctrl: tegra: Use correct offset for pin group"). Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-26	tty: rpmsg: Define tty name via constant string literal	Andy Shevchenko
	Driver uses already twice the same string literal. Define it in one place, so every user will have this name consistent. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211025135148.53944-5-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	tty: rpmsg: Add pr_fmt() to prefix messages	Andy Shevchenko
	Make all messages to be prefixed in a unified way. Add pr_fmt() to achieve this. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211025135148.53944-4-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	tty: rpmsg: Use dev_err_probe() in ->probe()	Andy Shevchenko
	It's fine to use dev_err_probe() in ->probe() even if we know it won't be deferred. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211025135148.53944-3-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	tty: rpmsg: Unify variable used to keep an error code	Andy Shevchenko
	In some ret is used, in the other err. Let's unify it across the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211025135148.53944-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	tty: rpmsg: Assign returned id to a local variable	Andy Shevchenko
	Instead of putting garbage in the data structure, assign allocated id or an error code to a temporary variable. This makes code cleaner. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211025135148.53944-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: stm32: push DMA RX data before suspending	Erwan Le Ray
	Data may be stored in DMA RX buffer, when suspending. The data needs to be pushed to the upper layer. We can't rely on the timeout IRQ (RTOR) that can't be triggered into low power state. So safely clear DMA request (DMAR), force the DMA reception routines to push RX buffer content, before disabling RX DMA. This way, handover to pio mode is safe. Only call tty_flip_buffer_push() when there is RX data to handle. Move the locking outside of stm32_usart_receive_chars() to prevent a race condition, when disabling DMA request upon suspend / pm_runtime_suspend. Data may be received under IRQ and pushed before stm32_usart_receive_chars() has pushed older data from DMA rx_buf upon suspend. The sequence in suspend routine needs proper locking to avoid this. Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Signed-off-by: Erwan Le Ray <erwan.leray@foss.st.com> Link: https://lore.kernel.org/r/20211025134229.8456-4-erwan.leray@foss.st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: stm32: terminate / restart DMA transfer at suspend / resume	Erwan Le Ray
	DMA prevents the system to suspend when an UART RX wake-up source is using DMA. DMA can't suspend while DMA channels are still active. Terminate DMA transfer at suspend, and restart a new DMA transfer at resume. Create stm32_usart_start_rx_dma_cyclic function to factorize dma RX initialization. Move RX DMA code related to wakeup into stm32_usart_serial_en_wakeup() routine to ease further improvements on wakeup from low power modes. Don't enable/disable wakeup on uninitialized port. There may be data residue in the RX FIFO while suspending. Flush it at suspend time. Receiver timeout interrupt won't trigger later in low power mode, so call stm32_usart_receive_chars() in case there's data to handle. Signed-off-by: Valentin Caron <valentin.caron@foss.st.com> Signed-off-by: Erwan Le Ray <erwan.leray@foss.st.com> Link: https://lore.kernel.org/r/20211025134229.8456-3-erwan.leray@foss.st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: stm32: rework RX dma initialization and release	Erwan Le Ray
	The RX DMA channel is kept active forever (from the probe). That prevents going to low power mode when it is used. This change moves the DMA configuration and enabling procedures to startup routine to allow transition to low power mode. The DMA disabling procedure is implemented in stop_rx routine as this ops has to stop characters reception, and DMA transation in shutdown. Clean useless dma_async_tx_descriptor initialization to NULL value. Signed-off-by: Valentin Caron <valentin.caron@foss.st.com> Signed-off-by: Erwan Le Ray <erwan.leray@foss.st.com> Link: https://lore.kernel.org/r/20211025134229.8456-2-erwan.leray@foss.st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	net/mlx5: remove the recent devlink params	Jakub Kicinski
	revert commit 46ae40b94d88 ("net/mlx5: Let user configure io_eq_size param") revert commit a6cb08daa3b4 ("net/mlx5: Let user configure event_eq_size param") revert commit 554604061979 ("net/mlx5: Let user configure max_macs param") The EQE parameters are applicable to more drivers, they should be configured via standard API, probably ethtool. Example of another driver needing something similar: https://lore.kernel.org/all/1633454136-14679-3-git-send-email-sbhatta@marvell.com/ The last param for "max_macs" is probably fine but the documentation is severely lacking. The meaning and implications for changing the param need to be stated. Link: https://lore.kernel.org/r/20211026152939.3125950-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26	serial: 8250_pci: Remove empty stub pci_quatech_exit()	Andy Shevchenko
	The ->exit() callback is checked for presence anyway, no need to have an empty stub. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211026133452.61657-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: 8250_pci: Replace custom pci_match_id() implementation	Andy Shevchenko
	Replace pci_quatech_amcc() with generic pci_match_id(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211026133452.61657-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: xilinx_uartps: Fix race condition causing stuck TX	Anssi Hannula
	xilinx_uartps .start_tx() clears TXEMPTY when enabling TXEMPTY to avoid any previous TXEVENT event asserting the UART interrupt. This clear operation is done immediately after filling the TX FIFO. However, if the bytes inserted by cdns_uart_handle_tx() are consumed by the UART before the TXEMPTY is cleared, the clear operation eats the new TXEMPTY event as well, causing cdns_uart_isr() to never receive the TXEMPTY event. If there are bytes still queued in circbuf, TX will get stuck as they will never get transferred to FIFO (unless new bytes are queued to circbuf in which case .start_tx() is called again). While the racy missed TXEMPTY occurs fairly often with short data sequences (e.g. write 1 byte), in those cases circbuf is usually empty so no action on TXEMPTY would have been needed anyway. On the other hand, longer data sequences make the race much more unlikely as UART takes longer to consume the TX FIFO. Therefore it is rare for this race to cause visible issues in general. Fix the race by clearing the TXEMPTY bit in ISR before filling the FIFO. The TXEMPTY bit in ISR will only get asserted at the exact moment the TX FIFO becomes empty, so clearing the bit before filling FIFO does not cause an extra immediate assertion even if the FIFO is initially empty. This is hard to reproduce directly on a normal system, but inserting e.g. udelay(200) after cdns_uart_handle_tx(port), setting 4000000 baud, and then running "dd if=/dev/zero bs=128 of=/dev/ttyPS0 count=50" reliably reproduces the issue on my ZynqMP test system unless this fix is applied. Fixes: 85baf542d54e ("tty: xuartps: support 64 byte FIFO size") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Link: https://lore.kernel.org/r/20211026102741.2910441-1-anssi.hannula@bitwise.fi Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	serial: sunzilog: Mark sunzilog_putchar() __maybe_unused	Geert Uytterhoeven
	If CONSOLE_POLL=n, CONFIG_SERIAL_SUNZILOG_CONSOLE=n, and CONFIG_SERIO=m: drivers/tty/serial/sunzilog.c:1128:13: error: ‘sunzilog_putchar’ defined but not used [-Werror=unused-function] 1128 \| static void sunzilog_putchar(struct uart_port *port, int ch) \| ^~~~~~~~~~~~~~~~ Fix this by marking sunzilog_putchar() __maybe_unused. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20211026080426.2444756-1-geert@linux-m68k.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	device property: Drop redundant NULL checks	Andy Shevchenko
	In cases when functions are called via fwnode operations, we already know that this is software node we are dealing with, hence no need to check if it's NULL, it can't be, Reported-by: YE Chengfeng <cyeaa@connect.ust.hk> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211026162954.89811-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	USB: iowarrior: fix control-message timeouts	Johan Hovold
	USB control-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Use the common control-message timeout define for the five-second timeout and drop the driver-specific one. Fixes: 946b960d13c1 ("USB: add driver for iowarrior devices.") Cc: stable@vger.kernel.org # 2.6.21 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211025115159.4954-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	Documentation: USB: fix example bulk-message timeout	Johan Hovold
	USB bulk-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Use a fixed five-second timeout in the "Writing USB Device Drivers" example. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211025115159.4954-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	most: fix control-message timeouts	Johan Hovold
	USB control-message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Use the common control-message timeout defines for the five-second timeouts. Fixes: 97a6f772f36b ("drivers: most: add USB adapter driver") Cc: stable@vger.kernel.org # 5.9 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211025115811.5410-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	comedi: vmk80xx: fix bulk and interrupt message timeouts	Johan Hovold
	USB bulk and interrupt message timeouts are specified in milliseconds and should specifically not vary with CONFIG_HZ. Note that the bulk-out transfer timeout was set to the endpoint bInterval value, which should be ignored for bulk endpoints and is typically set to zero. This meant that a failing bulk-out transfer would never time out. Assume that the 10 second timeout used for all other transfers is more than enough also for the bulk-out endpoint. Fixes: 985cafccbf9b ("Staging: Comedi: vmk80xx: Add k8061 support") Fixes: 951348b37738 ("staging: comedi: vmk80xx: wait for URBs to complete") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-6-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	comedi: vmk80xx: fix bulk-buffer overflow	Johan Hovold
	The driver is using endpoint-sized buffers but must not assume that the tx and rx buffers are of equal size or a malicious device could overflow the slab-allocated receive buffer when doing bulk transfers. Fixes: 985cafccbf9b ("Staging: Comedi: vmk80xx: Add k8061 support") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-5-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	comedi: vmk80xx: fix transfer-buffer overflows	Johan Hovold
	The driver uses endpoint-sized USB transfer buffers but up until recently had no sanity checks on the sizes. Commit e1f13c879a7c ("staging: comedi: check validity of wMaxPacketSize of usb endpoints found") inadvertently fixed NULL-pointer dereferences when accessing the transfer buffers in case a malicious device has a zero wMaxPacketSize. Make sure to allocate buffers large enough to handle also the other accesses that are done without a size check (e.g. byte 18 in vmk80xx_cnt_insn_read() for the VMK8061_MODEL) to avoid writing beyond the buffers, for example, when doing descriptor fuzzing. The original driver was for a low-speed device with 8-byte buffers. Support was later added for a device that uses bulk transfers and is presumably a full-speed device with a maximum 64-byte wMaxPacketSize. Fixes: 985cafccbf9b ("Staging: Comedi: vmk80xx: Add k8061 support") Cc: stable@vger.kernel.org # 2.6.31 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20211025114532.4599-4-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-26	btrfs: fix comment about sector sizes supported in 64K systems	Anand Jain
	Commit 95ea0486b20e ("btrfs: allow read-write for 4K sectorsize on 64K page size systems") added write support for 4K sectorsize on a 64K systems. Fix the now stale comments. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: update device path inode time instead of bd_inode	Josef Bacik
	Christoph pointed out that I'm updating bdev->bd_inode for the device time when we remove block devices from a btrfs file system, however this isn't actually exposed to anything. The inode we want to update is the one that's associated with the path to the device, usually on devtmpfs, so that blkid notices the difference. We still don't want to do the blkdev_open, so use kern_path() to get the path to the given device and do the update time on that inode. Fixes: 8f96a5bfa150 ("btrfs: update the bdev time directly when closing") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	fs: export an inode_update_time helper	Josef Bacik
	If you already have an inode and need to update the time on the inode there is no way to do this properly. Export this helper to allow file systems to update time on the inode so the appropriate handler is called, either ->update_time or generic_update_time. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: fix deadlock when defragging transparent huge pages	Omar Sandoval
	Attempting to defragment a Btrfs file containing a transparent huge page immediately deadlocks with the following stack trace: #0 context_switch (kernel/sched/core.c:4940:2) #1 __schedule (kernel/sched/core.c:6287:8) #2 schedule (kernel/sched/core.c:6366:3) #3 io_schedule (kernel/sched/core.c:8389:2) #4 wait_on_page_bit_common (mm/filemap.c:1356:4) #5 __lock_page (mm/filemap.c:1648:2) #6 lock_page (./include/linux/pagemap.h:625:3) #7 pagecache_get_page (mm/filemap.c:1910:4) #8 find_or_create_page (./include/linux/pagemap.h:420:9) #9 defrag_prepare_one_page (fs/btrfs/ioctl.c:1068:9) #10 defrag_one_range (fs/btrfs/ioctl.c:1326:14) #11 defrag_one_cluster (fs/btrfs/ioctl.c:1421:9) #12 btrfs_defrag_file (fs/btrfs/ioctl.c:1523:9) #13 btrfs_ioctl_defrag (fs/btrfs/ioctl.c:3117:9) #14 btrfs_ioctl (fs/btrfs/ioctl.c:4872:10) #15 vfs_ioctl (fs/ioctl.c:51:10) #16 __do_sys_ioctl (fs/ioctl.c:874:11) #17 __se_sys_ioctl (fs/ioctl.c:860:1) #18 __x64_sys_ioctl (fs/ioctl.c:860:1) #19 do_syscall_x64 (arch/x86/entry/common.c:50:14) #20 do_syscall_64 (arch/x86/entry/common.c:80:7) #21 entry_SYSCALL_64+0x7c/0x15b (arch/x86/entry/entry_64.S:113) A huge page is represented by a compound page, which consists of a struct page for each PAGE_SIZE page within the huge page. The first struct page is the "head page", and the remaining are "tail pages". Defragmentation attempts to lock each page in the range. However, lock_page() on a tail page actually locks the corresponding head page. So, if defragmentation tries to lock more than one struct page in a compound page, it tries to lock the same head page twice and deadlocks with itself. Ideally, we should be able to defragment transparent huge pages. However, THP for filesystems is currently read-only, so a lot of code is not ready to use huge pages for I/O. For now, let's just return ETXTBUSY. This can be reproduced with the following on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y: $ cat create_thp_file.c #include <fcntl.h> #include <stdbool.h> #include <stdio.h> #include <stdint.h> #include <stdlib.h> #include <unistd.h> #include <sys/mman.h> static const char zeroes[1024 * 1024]; static const size_t FILE_SIZE = 2 * 1024 * 1024; int main(int argc, char *argv) { if (argc != 2) { fprintf(stderr, "usage: %s PATH\n", argv[0]); return EXIT_FAILURE; } int fd = creat(argv[1], 0777); if (fd == -1) { perror("creat"); return EXIT_FAILURE; } size_t written = 0; while (written < FILE_SIZE) { ssize_t ret = write(fd, zeroes, sizeof(zeroes) < FILE_SIZE - written ? sizeof(zeroes) : FILE_SIZE - written); if (ret < 0) { perror("write"); return EXIT_FAILURE; } written += ret; } close(fd); fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } / * Reserve some address space so that we can align the file mapping to * the huge page size. / void placeholder_map = mmap(NULL, FILE_SIZE * 2, PROT_NONE, MAP_PRIVATE \| MAP_ANONYMOUS, -1, 0); if (placeholder_map == MAP_FAILED) { perror("mmap (placeholder)"); return EXIT_FAILURE; } void aligned_address = (void )(((uintptr_t)placeholder_map + FILE_SIZE - 1) & ~(FILE_SIZE - 1)); void map = mmap(aligned_address, FILE_SIZE, PROT_READ \| PROT_EXEC, MAP_SHARED \| MAP_FIXED, fd, 0); if (map == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; } if (madvise(map, FILE_SIZE, MADV_HUGEPAGE) < 0) { perror("madvise"); return EXIT_FAILURE; } char line = NULL; size_t line_capacity = 0; FILE smaps_file = fopen("/proc/self/smaps", "r"); if (!smaps_file) { perror("fopen"); return EXIT_FAILURE; } for (;;) { for (size_t off = 0; off < FILE_SIZE; off += 4096) ((volatile char )map)[off]; ssize_t ret; bool this_mapping = false; while ((ret = getline(&line, &line_capacity, smaps_file)) > 0) { unsigned long start, end, huge; if (sscanf(line, "%lx-%lx", &start, &end) == 2) { this_mapping = (start <= (uintptr_t)map && (uintptr_t)map < end); } else if (this_mapping && sscanf(line, "FilePmdMapped: %ld", &huge) == 1 && huge > 0) { return EXIT_SUCCESS; } } sleep(6); rewind(smaps_file); fflush(smaps_file); } } $ ./create_thp_file huge $ btrfs fi defrag -czstd ./huge Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: sysfs: convert scnprintf and snprintf to sysfs_emit	Anand Jain
	Commit 2efc459d06f1 ("sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs out") merged in 5.10 introduced two new functions sysfs_emit() and sysfs_emit_at() which are aware of the PAGE_SIZE limit of the output buffer. Use the above two new functions instead of scnprintf() and snprintf() in various sysfs show(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE	Qu Wenruo
	It's a common practice to avoid use sizeof(struct btrfs_super_block) (3531), but to use BTRFS_SUPER_INFO_SIZE (4096). The problem is that, sizeof(struct btrfs_super_block) doesn't match BTRFS_SUPER_INFO_SIZE from the very beginning. Furthermore, for all call sites except selftests, we always allocate BTRFS_SUPER_INFO_SIZE space for super block, there isn't any real reason to use the smaller value, and it doesn't really save any space. So let's get rid of such confusing behavior, and unify those two values. This modification also adds a new static_assert() to verify the size, and moves the BTRFS_SUPER_INFO_* macros to the definition of btrfs_super_block for the static_assert(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: update comments for chunk allocation -ENOSPC cases	Filipe Manana
	Update the comments at btrfs_chunk_alloc() and do_chunk_alloc() that describe which cases can lead to a failure to allocate metadata and system space despite having previously reserved space. This adds one more reason that I previously forgot to mention. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: fix deadlock between chunk allocation and chunk btree modifications	Filipe Manana
	When a task is doing some modification to the chunk btree and it is not in the context of a chunk allocation or a chunk removal, it can deadlock with another task that is currently allocating a new data or metadata chunk. These contexts are the following: * When relocating a system chunk, when we need to COW the extent buffers that belong to the chunk btree; * When adding a new device (ioctl), where we need to add a new device item to the chunk btree; * When removing a device (ioctl), where we need to remove a device item from the chunk btree; * When resizing a device (ioctl), where we need to update a device item in the chunk btree and may need to relocate a system chunk that lies beyond the new device size when shrinking a device. The problem happens due to a sequence of steps like the following: 1) Task A starts a data or metadata chunk allocation and it locks the chunk mutex; 2) Task B is relocating a system chunk, and when it needs to COW an extent buffer of the chunk btree, it has locked both that extent buffer as well as its parent extent buffer; 3) Since there is not enough available system space, either because none of the existing system block groups have enough free space or because the only one with enough free space is in RO mode due to the relocation, task B triggers a new system chunk allocation. It blocks when trying to acquire the chunk mutex, currently held by task A; 4) Task A enters btrfs_chunk_alloc_add_chunk_item(), in order to insert the new chunk item into the chunk btree and update the existing device items there. But in order to do that, it has to lock the extent buffer that task B locked at step 2, or its parent extent buffer, but task B is waiting on the chunk mutex, which is currently locked by task A, therefore resulting in a deadlock. One example report when the deadlock happens with system chunk relocation: INFO: task kworker/u9:5:546 blocked for more than 143 seconds. Not tainted 5.15.0-rc3+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u9:5 state:D stack:25936 pid: 546 ppid: 2 flags:0x00004000 Workqueue: events_unbound btrfs_async_reclaim_metadata_space Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 rwsem_down_read_slowpath+0x4ee/0x9d0 kernel/locking/rwsem.c:993 __down_read_common kernel/locking/rwsem.c:1214 [inline] __down_read kernel/locking/rwsem.c:1223 [inline] down_read_nested+0xe6/0x440 kernel/locking/rwsem.c:1590 __btrfs_tree_read_lock+0x31/0x350 fs/btrfs/locking.c:47 btrfs_tree_read_lock fs/btrfs/locking.c:54 [inline] btrfs_read_lock_root_node+0x8a/0x320 fs/btrfs/locking.c:191 btrfs_search_slot_get_root fs/btrfs/ctree.c:1623 [inline] btrfs_search_slot+0x13b4/0x2140 fs/btrfs/ctree.c:1728 btrfs_update_device+0x11f/0x500 fs/btrfs/volumes.c:2794 btrfs_chunk_alloc_add_chunk_item+0x34d/0xea0 fs/btrfs/volumes.c:5504 do_chunk_alloc fs/btrfs/block-group.c:3408 [inline] btrfs_chunk_alloc+0x84d/0xf50 fs/btrfs/block-group.c:3653 flush_space+0x54e/0xd80 fs/btrfs/space-info.c:670 btrfs_async_reclaim_metadata_space+0x396/0xa90 fs/btrfs/space-info.c:953 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297 worker_thread+0x90/0xed0 kernel/workqueue.c:2444 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 INFO: task syz-executor:9107 blocked for more than 143 seconds. Not tainted 5.15.0-rc3+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:23200 pid: 9107 ppid: 7792 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425 __mutex_lock_common kernel/locking/mutex.c:669 [inline] __mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729 btrfs_chunk_alloc+0x31a/0xf50 fs/btrfs/block-group.c:3631 find_free_extent_update_loop fs/btrfs/extent-tree.c:3986 [inline] find_free_extent+0x25cb/0x3a30 fs/btrfs/extent-tree.c:4335 btrfs_reserve_extent+0x1f1/0x500 fs/btrfs/extent-tree.c:4415 btrfs_alloc_tree_block+0x203/0x1120 fs/btrfs/extent-tree.c:4813 __btrfs_cow_block+0x412/0x1620 fs/btrfs/ctree.c:415 btrfs_cow_block+0x2f6/0x8c0 fs/btrfs/ctree.c:570 btrfs_search_slot+0x1094/0x2140 fs/btrfs/ctree.c:1768 relocate_tree_block fs/btrfs/relocation.c:2694 [inline] relocate_tree_blocks+0xf73/0x1770 fs/btrfs/relocation.c:2757 relocate_block_group+0x47e/0xc70 fs/btrfs/relocation.c:3673 btrfs_relocate_block_group+0x48a/0xc60 fs/btrfs/relocation.c:4070 btrfs_relocate_chunk+0x96/0x280 fs/btrfs/volumes.c:3181 __btrfs_balance fs/btrfs/volumes.c:3911 [inline] btrfs_balance+0x1f03/0x3cd0 fs/btrfs/volumes.c:4301 btrfs_ioctl_balance+0x61e/0x800 fs/btrfs/ioctl.c:4137 btrfs_ioctl+0x39ea/0x7b70 fs/btrfs/ioctl.c:4949 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae So fix this by making sure that whenever we try to modify the chunk btree and we are neither in a chunk allocation context nor in a chunk remove context, we reserve system space before modifying the chunk btree. Reported-by: Hao Sun <sunhao.th@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CACkBjsax51i4mu6C0C3vJqQN3NR_iVuucoeG3U1HXjrgzn5FFQ@mail.gmail.com/ Fixes: 79bd37120b1495 ("btrfs: rework chunk allocation to avoid exhaustion of the system chunk array") CC: stable@vger.kernel.org # 5.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: zoned: use greedy gc for auto reclaim	Johannes Thumshirn
	Currently auto reclaim of unusable zones reclaims the block-groups in the order they have been added to the reclaim list. Change this to a greedy algorithm by sorting the list so we have the block-groups with the least amount of valid bytes reclaimed first. Note: we can't splice the block groups from reclaim_bgs to let the sort happen outside of the lock. The block groups can be still in use by other parts eg. via bg_list and we must hold unused_bgs_lock while processing them. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> [ write note and comment why we can't splice the list ] Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: check-integrity: stop storing the block device name in btrfsic_dev_state	Christoph Hellwig
	Just use the %pg format specifier in all the debug printks previously using it. Note that both bdevname and the %pg specifier never print a pathname, so the kbasename call wasn't needed to start with. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> [ adjust messages and indentation ] Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26	btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls	Josef Bacik
	For device removal and replace we call btrfs_find_device_by_devspec, which if we give it a device path and nothing else will call btrfs_get_dev_args_from_path, which opens the block device and reads the super block and then looks up our device based on that. However at this point we're holding the sb write "lock", so reading the block device pulls in the dependency of ->open_mutex, which produces the following lockdep splat ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #405 Not tainted ------------------------------------------------------ losetup/11576 is trying to acquire lock: ffff9bbe8cded938 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0 but task is already holding lock: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_get_by_dev.part.0+0x56/0x3c0 blkdev_get_by_path+0x98/0xa0 btrfs_get_bdev_and_sb+0x1b/0xb0 btrfs_find_device_by_devspec+0x12b/0x1c0 btrfs_rm_device+0x127/0x610 btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae -> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30 -> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30 -> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0); * DEADLOCK * 1 lock held by losetup/11576: #0: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop] stack backtrace: CPU: 0 PID: 11576 Comm: losetup Not tainted 5.14.0-rc2+ #405 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f31b02404cb Instead what we want to do is populate our device lookup args before we grab any locks, and then pass these args into btrfs_rm_device(). From there we can find the device and do the appropriate removal. Suggested-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>