summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-05bpf: check bpf_func_state->callback_depth when pruning statesEduard Zingerman
When comparing current and cached states verifier should consider bpf_func_state->callback_depth. Current state cannot be pruned against cached state, when current states has more iterations left compared to cached state. Current state has more iterations left when it's callback_depth is smaller. Below is an example illustrating this bug, minimized from mailing list discussion [0] (assume that BPF_F_TEST_STATE_FREQ is set). The example is not a safe program: if loop_cb point (1) is followed by loop_cb point (2), then division by zero is possible at point (4). struct ctx { __u64 a; __u64 b; __u64 c; }; static void loop_cb(int i, struct ctx *ctx) { /* assume that generated code is "fallthrough-first": * if ... == 1 goto * if ... == 2 goto * <default> */ switch (bpf_get_prandom_u32()) { case 1: /* 1 */ ctx->a = 42; return 0; break; case 2: /* 2 */ ctx->b = 42; return 0; break; default: /* 3 */ ctx->c = 42; return 0; break; } } SEC("tc") __failure __flag(BPF_F_TEST_STATE_FREQ) int test(struct __sk_buff *skb) { struct ctx ctx = { 7, 7, 7 }; bpf_loop(2, loop_cb, &ctx, 0); /* 0 */ /* assume generated checks are in-order: .a first */ if (ctx.a == 42 && ctx.b == 42 && ctx.c == 7) asm volatile("r0 /= 0;":::"r0"); /* 4 */ return 0; } Prior to this commit verifier built the following checkpoint tree for this example: .------------------------------------- Checkpoint / State name | .-------------------------------- Code point number | | .---------------------------- Stack state {ctx.a,ctx.b,ctx.c} | | | .------------------- Callback depth in frame #0 v v v v - (0) {7P,7P,7},depth=0 - (3) {7P,7P,7},depth=1 - (0) {7P,7P,42},depth=1 - (3) {7P,7,42},depth=2 - (0) {7P,7,42},depth=2 loop terminates because of depth limit - (4) {7P,7,42},depth=0 predicted false, ctx.a marked precise - (6) exit (a) - (2) {7P,7,42},depth=2 - (0) {7P,42,42},depth=2 loop terminates because of depth limit - (4) {7P,42,42},depth=0 predicted false, ctx.a marked precise - (6) exit (b) - (1) {7P,7P,42},depth=2 - (0) {42P,7P,42},depth=2 loop terminates because of depth limit - (4) {42P,7P,42},depth=0 predicted false, ctx.{a,b} marked precise - (6) exit - (2) {7P,7,7},depth=1 considered safe, pruned using checkpoint (a) (c) - (1) {7P,7P,7},depth=1 considered safe, pruned using checkpoint (b) Here checkpoint (b) has callback_depth of 2, meaning that it would never reach state {42,42,7}. While checkpoint (c) has callback_depth of 1, and thus could yet explore the state {42,42,7} if not pruned prematurely. This commit makes forbids such premature pruning, allowing verifier to explore states sub-tree starting at (c): (c) - (1) {7,7,7P},depth=1 - (0) {42P,7,7P},depth=1 ... - (2) {42,7,7},depth=2 - (0) {42,42,7},depth=2 loop terminates because of depth limit - (4) {42,42,7},depth=0 predicted true, ctx.{a,b,c} marked precise - (5) division by zero [0] https://lore.kernel.org/bpf/9b251840-7cb8-4d17-bd23-1fc8071d8eef@linux.dev/ Fixes: bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations") Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240222154121.6991-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-06riscv: dts: Move BUILTIN_DTB_SOURCE to common KconfigYangyu Chen
The BUILTIN_DTB_SOURCE was only configured for K210 before. Since SOC_BUILTIN_DTB_DECLARE was removed at commit d5805af9fe9f ("riscv: Fix builtin DTB handling") from patch [1], the kernel cannot choose one of the dtbs from then on and always take the first one dtb to use. Then, another commit 0ddd7eaffa64 ("riscv: Fix BUILTIN_DTB for sifive and microchip soc") from patch [2] supports BUILTIN_DTB_SOURCE for other SoCs. However, this feature will only work if the Kconfig we use links the dtb we expected in the first place as mentioned in the thread [3]. Thus, a config BUILTIN_DTB_SOURCE is needed for all SoCs to choose one dtb to use. For some considerations, this patch also removes default y if XIP_KERNEL for BUILTIN_DTB, as this requires setting a proper dtb to use on the BUILTIN_DTB_SOURCE, else the kernel with XIP but does not set BUILTIN_DTB_SOURCE or unselect BUILTIN_DTB will not boot. Also, this patch removes the default dtb string for k210 from Kconfig to nommu_k210_defconfig and nommu_k210_sdcard_defconfig to avoid complex Kconfig settings for other SoCs in the future. [1] https://lore.kernel.org/linux-riscv/20201208073355.40828-5-damien.lemoal@wdc.com/ [2] https://lore.kernel.org/linux-riscv/20210604120639.1447869-1-alex@ghiti.fr/ [3] https://lore.kernel.org/linux-riscv/CAK7LNATt_56mO2Le4v4EnPnAfd3gC8S_Sm5-GCsfa=qXy=8Lrg@mail.gmail.com/ Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-03-05selftests/exec: Perform script checks with /bin/bashKees Cook
It seems some shells linked to /bin/sh don't have consistent behavior with error codes on execution failures. Explicitly use /bin/bash so that "not found" errors are correctly generated. Repeating the comment from the test: /* * Execute as a long pathname relative to "/". If this is a script, * the interpreter will launch but fail to open the script because its * name ("/dev/fd/5/xxx....") is bigger than PATH_MAX. * * The failure code is usually 127 (POSIX: "If a command is not found, * the exit status shall be 127."), but some systems give 126 (POSIX: * "If the command name is found, but it is not an executable utility, * the exit status shall be 126."), so allow either. */ Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Closes: https://lore.kernel.org/lkml/02c8bf8e-1934-44ab-a886-e065b37366a7@collabora.com/ Signed-off-by: Kees Cook <keescook@chromium.org> --- Cc: Eric Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-mm@kvack.org Cc: linux-kselftest@vger.kernel.org
2024-03-06power: supply: core: fix charge_behaviour formattingThomas Weißschuh
This property is documented to have a special format which exposes all available behaviours and the currently active one at the same time. For this special format some helpers are provided. When the charge_behaviour property was added in 1b0b6cc8030d ("power: supply: add charge_behaviour attributes"), it did not update the default logic in in power_supply_sysfs.c to use the format helpers. Thus by default only the currently active behaviour is printed. This fixes the default logic to follow the documented format. There is currently only one in-tree drivers exposing charge behaviours - thinkpad_acpi, which is not affected by the change, as it directly uses the helpers and does not use the power_supply_sysfs.c logic. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-3-8ebb0a7c2409@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-06power: supply: core: ease special formatting implementationsThomas Weißschuh
By moving the conditional into the default-branch of the switch new additions to the switch won't have to bypass the conditional. This makes it easier to implement those special cases like the upcoming change to the formatting of "charge_behaviour". Suggested-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/lkml/53082075-852f-4698-b354-ed30e7fd2683@redhat.com/ Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-2-8ebb0a7c2409@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-05rxrpc: Extract useful fields from a received ACK to skb priv dataDavid Howells
Extract useful fields from a received ACK packet into the skb private data early on in the process of parsing incoming packets. This makes the ACK fields available even before we've matched the ACK up to a call and will allow us to deal with path MTU discovery probe responses even after the relevant call has been completed. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Clean up the resend algorithmDavid Howells
Clean up the DATA packet resending algorithm to retransmit packets as we come across them whilst walking the transmission buffer rather than queuing them for retransmission at the end. This can be done as ACK parsing - and thus the discarding of successful packets - is now done in the same thread rather than separately in softirq context and a locked section is no longer required. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Record probes after transmission and reduce number of time-getsDavid Howells
Move the recording of a successfully transmitted DATA or ACK packet that will provide RTT probing to after the transmission. With the I/O thread model, this can be done because parsing of the responding ACK can no longer race with the post-transmission code. Move the various timeout-settings done after successfully transmitting a DATA packet into rxrpc_tstamp_data_packets() and eliminate a number of calls to get the current time. As a consequence we no longer need to cancel a proposed RTT probe on transmission failure. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Use ktimes for call timeout tracking and set the timer lazilyDavid Howells
Track the call timeouts as ktimes rather than jiffies as the latter's granularity is too high and only set the timer at the end of the event handling function. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Differentiate PING ACK transmission traces.David Howells
There are three points that transmit PING ACKs and all of them use the same trace string. Change two of them to use different strings. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Don't permit resending after all Tx packets ackedDavid Howells
Once all the packets transmitted as part of a call have been acked, don't permit any resending. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Parse received packets before dealing with timeoutsDavid Howells
Parse the received packets before going and processing timeouts as the timeouts may be reset by the reception of a packet. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-06power: supply: mm8013: fix "not charging" detectionThomas Weißschuh
The charge_behaviours property is meant as a control-knob that can be changed by the user. Page 23 of [0] which documents the flag CHG_INH as follows: CHG_INH : Charge Inhibit When the current is more than or equal to charge threshold current, charge inhibit temperature (upper/lower limit) :1 charge permission temperature or the current is less than charge threshold current :0 So this is pure read-only information which is better represented as POWER_SUPPLY_STATUS_NOT_CHARGING. [0] https://product.minebeamitsumi.com/en/product/category/ics/battery/fuel_gauge/parts/download/__icsFiles/afieldfile/2023/07/12/1_download_01_12.pdf Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-1-8ebb0a7c2409@weissschuh.net Fixes: e39257cde7e8 ("power: supply: mm8013: Add more properties") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-05rxrpc: Do zerocopy using MSG_SPLICE_PAGES and page fragsDavid Howells
Switch from keeping the transmission buffers in the rxrpc_txbuf struct and allocated from the slab, to allocating them using page fragment allocators (which uses raw pages), thereby allowing them to be passed to MSG_SPLICE_PAGES and avoid copying into the UDP buffers. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-06power: supply: move power_supply_attr_groups definition back to sysfsRicardo B. Marliere
As reported by the kernel test robot, 'power_supply_attr_group' is defined but not used when CONFIG_SYSFS is not set. Sebastian suggested that the correct fix implemented by this patch, instead of my attempt in commit ea4367c40c79 ("power: supply: core: move power_supply_attr_group into #ifdef block"), is to define power_supply_attr_groups in power_supply_sysfs.c and expose it in the power_supply.h header. For the case where CONFIG_SYSFS=n, define it as NULL. Suggested-by: Sebastian Reichel <sebastian.reichel@collabora.com> Fixes: ea4367c40c79 ("power: supply: core: move power_supply_attr_group into #ifdef block") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202403021518.SUQzk3oA-lkp@intel.com/ Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Link: https://lore.kernel.org/r/20240303-class_cleanup-power-v2-1-e248b7128519@marliere.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-06power: supply: core: simplify power_supply_class_initSebastian Reichel
Technically the sysfs attributes should be initialized before the class is registered, since that will use them. As a nice side effect this nicely simplifies the code, since it allows dropping the helper variable. Reviewed-by: Ricardo B. Marliere <ricardo@marliere.net> Link: https://lore.kernel.org/r/20240301-psy-class-cleanup-v1-2-aebe8c4b6b08@collabora.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-06power: supply: core: add power_supply_for_each_device()Sebastian Reichel
Introduce power_supply_for_each_device(), which is a wrapper for class_for_each_device() using the power_supply_class and going through all devices. This allows making the power_supply_class itself a local variable, so that drivers cannot mess with it and simplifies the code slightly. Reviewed-by: Ricardo B. Marliere <ricardo@marliere.net> Link: https://lore.kernel.org/r/20240301-psy-class-cleanup-v1-1-aebe8c4b6b08@collabora.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-03-05pinctrl: core: comment that pinctrl_add_gpio_range() is deprecatedDan Carpenter
The pinctrl_add_gpio_range() function is deprecated add a comment so people don't accidentally use it in new code. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/533a7a10-c6eb-4ebe-adf1-f8dc95ae8d33@moroto.mountain Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-03-05pinctrl: pinmux: Suppress error message for -EPROBE_DEFERAndre Przywara
EPROBE_DEFER error returns are not really critical, since they cancel the probe process, but the kernel will return later and retry. However, depending on the probe order, this might issue quite some verbatim and scary, though pointless messages: [ 2.388731] 300b000.pinctrl: pin-224 (5000000.serial) status -517 [ 2.397321] 300b000.pinctrl: could not request pin 224 (PH0) from group PH0 on device 300b000.pinctrl Replace dev_err() with dev_err_probe(), which not only drops the priority of the message from error to debug, but also puts some text into debugfs' devices_deferred file, for later reference. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Link: https://lore.kernel.org/r/20240305143859.2449147-1-andre.przywara@arm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-03-05pinctrl: Add driver for Awinic AW9523/B I2C GPIO ExpanderAngeloGioacchino Del Regno
The Awinic AW9523(B) is a multi-function I2C gpio expander in a TQFN-24L package, featuring PWM (max 37mA per pin, or total max power 3.2Watts) for LED driving capability. It has two ports with 8 pins per port (for a total of 16 pins), configurable as either PWM with 1/256 stepping or GPIO input/output, 1.8V logic input; each GPIO can be configured as input or output independently from each other. This IC also has an internal interrupt controller, which is capable of generating an interrupt for each GPIO, depending on the configuration, and will raise an interrupt on the INTN pin to advertise this to an external interrupt controller. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Signed-off-by: David Bauer <mail@david-bauer.net> Link: https://lore.kernel.org/r/20210624214458.68716-2-mail@david-bauer.net Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240301-awinic-aw9523-v8-1-7ec572f5dfb4@linaro.org
2024-03-05dt-bindings: pinctrl: Add bindings for Awinic AW9523/AW9523BAngeloGioacchino Del Regno
Add bindings for the Awinic AW9523/AW9523B I2C GPIO Expander driver. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Signed-off-by: David Bauer <mail@david-bauer.net> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210624214458.68716-1-mail@david-bauer.net [Fixed up minor bugs found by new checking tools] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-03-05PCI/sysfs: Demacrofy pci_dev_resource_resize_attr(n) functionsIlpo Järvinen
pci_dev_resource_resize_attr(n) macro is invoked for six resources, creating a large footprint function for each resource. Rework the macro to only create a function that calls a helper function so the compiler can decide if it warrants to inline the function or not. With x86_64 defconfig, this saves roughly 2.5kB: $ scripts/bloat-o-meter drivers/pci/pci-sysfs.o{.old,.new} add/remove: 1/0 grow/shrink: 0/6 up/down: 512/-2934 (-2422) Function old new delta __resource_resize_store - 512 +512 resource5_resize_store 503 14 -489 resource4_resize_store 503 14 -489 resource3_resize_store 503 14 -489 resource2_resize_store 503 14 -489 resource1_resize_store 503 14 -489 resource0_resize_store 500 11 -489 Total: Before=13399, After=10977, chg -18.08% (The compiler seemingly chose to still inline __resource_resize_show() which is fine, those functions are not very complex/large.) Link: https://lore.kernel.org/r/20240222114607.1837-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-03-05PCI: Remove obsolete pci_cleanup_rom() declarationLukas Wunner
Commit d9c8bea179a6 ("PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY") removed pci_cleanup_rom(), but retained its declaration in pci.h. Remove it. Link: https://lore.kernel.org/r/fc30de5276e21d5a3ebcb7e58a8b43e399f7e6e6.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
2024-03-05PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=yLukas Wunner
It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for space-constrained devices such as routers, such a configuration may actually make sense. However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled, unnecessarily increasing the kernel's size. To rectify that: * Move pci_mmap_fits() to mmap.c. It is not only needed by pci-sysfs.c, but also proc.c. * Move pci_dev_type to probe.c and make it private. It references pci_dev_attr_groups in pci-sysfs.c. Make that public instead for consistency with pci_dev_groups, pcibus_groups and pci_bus_groups, which are likewise public and referenced by struct definitions in pci-driver.c and probe.c. * Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and pci_bus_groups to NULL if CONFIG_SYSFS is disabled. Provide empty static inlines for pci_{create,remove}_legacy_files() and pci_{create,remove}_sysfs_dev_files(). Result: vmlinux size is reduced by 122996 bytes in my arm 32-bit test build. Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
2024-03-05Merge tag 'cgroup-for-6.8-rc7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two cpuset fixes. Both are for bugs in error handling paths and low risk" * tag 'cgroup-for-6.8-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: Fix retval in update_cpumask() cgroup/cpuset: Fix a memory leak in update_exclusive_cpumask()
2024-03-05Merge tag 'integrity-v6.8-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fix from Mimi Zohar: "A single fix to eliminate an unnecessary message" * tag 'integrity-v6.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: eliminate unnecessary "Problem loading X.509 certificate" msg
2024-03-05Merge branch 'dmraid-fix-6.9' into md-6.9Song Liu
This is the second half of fixes for dmraid. The first half is available at [1]. This set contains fixes: - reshape can start unexpected, cause data corruption, patch 1,5,6; - deadlocks that reshape concurrent with IO, patch 8; - a lockdep warning, patch 9; For all the dmraid related tests in lvm2 suite, there is no new regressions compared against 6.6 kernels (which is good baseline before recent regressions). [1] https://lore.kernel.org/all/CAPhsuW7u1UKHCDOBDhD7DzOVtkGemDz_QnJ4DUq_kSN-Q3G66Q@mail.gmail.com/ * dmraid-fix-6.9: dm-raid: fix lockdep waring in "pers->hot_add_disk" dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape dm-raid: add a new helper prepare_suspend() in md_personality md/dm-raid: don't call md_reap_sync_thread() directly dm-raid: really frozen sync_thread during suspend md: add a new helper reshape_interrupted() md: export helper md_is_rdwr() md: export helpers to stop sync_thread md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume
2024-03-05dm-raid: fix lockdep waring in "pers->hot_add_disk"Yu Kuai
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'. "pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock. Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
2024-03-05dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent ↵Yu Kuai
with reshape For raid456, if reshape is still in progress, then IO across reshape position will wait for reshape to make progress. However, for dm-raid, in following cases reshape will never make progress hence IO will hang: 1) the array is read-only; 2) MD_RECOVERY_WAIT is set; 3) MD_RECOVERY_FROZEN is set; After commit c467e97f079f ("md/raid6: use valid sector values to determine if an I/O should wait on the reshape") fix the problem that IO across reshape position doesn't wait for reshape, the dm-raid test shell/lvconvert-raid-reshape.sh start to hang: [root@fedora ~]# cat /proc/979/stack [<0>] wait_woken+0x7d/0x90 [<0>] raid5_make_request+0x929/0x1d70 [raid456] [<0>] md_handle_request+0xc2/0x3b0 [md_mod] [<0>] raid_map+0x2c/0x50 [dm_raid] [<0>] __map_bio+0x251/0x380 [dm_mod] [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod] [<0>] __submit_bio+0xc2/0x1c0 [<0>] submit_bio_noacct_nocheck+0x17f/0x450 [<0>] submit_bio_noacct+0x2bc/0x780 [<0>] submit_bio+0x70/0xc0 [<0>] mpage_readahead+0x169/0x1f0 [<0>] blkdev_readahead+0x18/0x30 [<0>] read_pages+0x7c/0x3b0 [<0>] page_cache_ra_unbounded+0x1ab/0x280 [<0>] force_page_cache_ra+0x9e/0x130 [<0>] page_cache_sync_ra+0x3b/0x110 [<0>] filemap_get_pages+0x143/0xa30 [<0>] filemap_read+0xdc/0x4b0 [<0>] blkdev_read_iter+0x75/0x200 [<0>] vfs_read+0x272/0x460 [<0>] ksys_read+0x7a/0x170 [<0>] __x64_sys_read+0x1c/0x30 [<0>] do_syscall_64+0xc6/0x230 [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74 This is because reshape can't make progress. For md/raid, the problem doesn't exist because register new sync_thread doesn't rely on the IO to be done any more: 1) If array is read-only, it can switch to read-write by ioctl/sysfs; 2) md/raid never set MD_RECOVERY_WAIT; 3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold 'reconfig_mutex', hence it can be cleared and reshape can continue by sysfs api 'sync_action'. However, I'm not sure yet how to avoid the problem in dm-raid yet. This patch on the one hand make sure raid_message() can't change sync_thread() through raid_message() after presuspend(), on the other hand detect the above 3 cases before wait for IO do be done in dm_suspend(), and let dm-raid requeue those IO. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
2024-03-05dm-raid: add a new helper prepare_suspend() in md_personalityYu Kuai
There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com
2024-03-05md/dm-raid: don't call md_reap_sync_thread() directlyYu Kuai
Currently md_reap_sync_thread() is called from raid_message() directly without holding 'reconfig_mutex', this is definitely unsafe because md_reap_sync_thread() can change many fields that is protected by 'reconfig_mutex'. However, hold 'reconfig_mutex' here is still problematic because this will cause deadlock, for example, commit 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock"). Fix this problem by using stop_sync_thread() to unregister sync_thread, like md/raid did. Fixes: be83651f0050 ("DM RAID: Add message/status support for changing sync action") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-7-yukuai1@huaweicloud.com
2024-03-05dm-raid: really frozen sync_thread during suspendYu Kuai
1) commit f52f5c71f3d4 ("md: fix stopping sync thread") remove MD_RECOVERY_FROZEN from __md_stop_writes() and doesn't realize that dm-raid relies on __md_stop_writes() to frozen sync_thread indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in md_stop_writes(), and since stop_sync_thread() is only used for dm-raid in this case, also move stop_sync_thread() to md_stop_writes(). 2) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen, it only prevent new sync_thread to start, and it can't stop the running sync thread; In order to frozen sync_thread, after seting the flag, stop_sync_thread() should be used. 3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use it as condition for md_stop_writes() in raid_postsuspend() doesn't look correct. Consider that reentrant stop_sync_thread() do nothing, always call md_stop_writes() in raid_postsuspend(). 4) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime, and if MD_RECOVERY_FROZEN is cleared while the array is suspended, new sync_thread can start unexpected. Fix this by disallow raid_message() to change sync_thread status during suspend. Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(), and with previous fixes, the test won't hang there anymore, however, the test will still fail and complain that ext4 is corrupted. And with this patch, the test won't hang due to stop_sync_thread() or fail due to ext4 is corrupted anymore. However, there is still a deadlock related to dm-raid456 that will be fixed in following patches. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/ Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_writes()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-6-yukuai1@huaweicloud.com
2024-03-05md: add a new helper reshape_interrupted()Yu Kuai
The helper will be used for dm-raid456 later to detect the case that reshape can't make progress. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-5-yukuai1@huaweicloud.com
2024-03-05md: export helper md_is_rdwr()Yu Kuai
There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-4-yukuai1@huaweicloud.com
2024-03-05md: export helpers to stop sync_threadYu Kuai
Add new helpers: void md_idle_sync_thread(struct mddev *mddev); void md_frozen_sync_thread(struct mddev *mddev); void md_unfrozen_sync_thread(struct mddev *mddev); The helpers will be used in dm-raid in later patches to fix regressions and prevent calling md_reap_sync_thread() directly. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-3-yukuai1@huaweicloud.com
2024-03-05md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resumeYu Kuai
After commit 9dbd1aa3a81c ("dm raid: add reshaping support to the target") raid_ctr() will set MD_RECOVERY_FROZEN before md_run() and expect to keep array frozen until resume. However, md_run() will clear the flag by setting mddev->recovery to 0. Before commit 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()"), dm-raid actually relied on suspending to prevent starting new sync_thread. Fix this problem by keeping 'MD_RECOVERY_FROZEN' for dm-raid in md_run(). Fixes: 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-2-yukuai1@huaweicloud.com
2024-03-05Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""Song Liu
This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d. The original set [1][2] was expected to undo a suboptimal fix in [2], and replace it with a better fix [1]. However, as reported by Dan Moulding [2] causes an issue with raid5 with journal device. Revert [2] for now to close the issue. We will follow up on another issue reported by Juxiao Bi, as [2] is expected to fix it. We believe this is a good trade-off, because the latter issue happens less freqently. In the meanwhile, we will NOT revert [1], as it contains the right logic. [1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update") [2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") Reported-by: Dan Moulding <dan@danm.net> Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/ Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") Cc: stable@vger.kernel.org # v5.19+ Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240125082131.788600-1-song@kernel.org
2024-03-05Merge tag 'platform-drivers-x86-v6.8-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Fix P2SB regression causing ACPI errors and high CPU load - Fix error return path in amd_pmf_init_smart_pc() * tag 'platform-drivers-x86-v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmf: Fix missing error code in amd_pmf_init_smart_pc() platform/x86: p2sb: On Goldmont only cache P2SB and SPI devfn BAR
2024-03-05spi: s3c64xx: switch exynos850 to new port config dataTudor Ambarus
Exynos850 has the same version of USI SPI (v2.1) as GS101. Drop the fifo_lvl_mask and rx_lvl_offset and switch to the new port config data. Backward compatibility with DT is not broken because when alises are set: - the SPI core will set the bus number according to the alias ID - the FIFO depth is always the same size for exynos850 (64 bytes) no matter the alias ID number. Advantages of the change: - drop dependency on the OF alias ID. - FIFO depth is inferred from the compatible. Exynos850 integrates 3 SPI IPs, all with 64 bytes FIFO depths. - use full mask for SPI_STATUS.{RX, TX}_FIFO_LVL fields. Using partial masks is misleading and can hide problems of the driver logic. Just compiled tested. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-13-tudor.ambarus@linaro.org Tested-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: switch gs101 to new port config dataTudor Ambarus
Drop the fifo_lvl_mask and rx_lvl_offset and switch to the new port config data. Advantages of the change: - drop dependency on the OF alias ID. - FIFO depth is inferred from the compatible. GS101 integrates 16 SPI IPs, all with 64 bytes FIFO depths. - use full mask for SPI_STATUS.{RX, TX}_FIFO_LVL fields. Using partial masks is misleading and can hide problems of the driver logic. S3C64XX_SPI_ST_TX_FIFO_RDY_V2 was defined based on the USI's SPI_VERSION.USI_IP_VERSION register field, which has value 2 at reset. MAX_SPI_PORTS is updated to reflect the maximum number of ports for the rest of the compatibles. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-12-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: deprecate fifo_lvl_mask, rx_lvl_offset and port_idTudor Ambarus
Deprecate fifo_lvl_mask, rx_lvl_offset and port_id. One shall use {rx, tx}_fifomask instead. Add messages to each port configuration. Suggested-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-11-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: get rid of the OF alias ID dependencyTudor Ambarus
Compatibles that set ``port_conf->{rx, tx}_fifomask`` are now safe to get rid of the OF alias ID dependency. Let the driver probe even without the alias for these. With this we also protect the FIFO_LVL_MASK calls from s3c64xx_spi_set_fifomask(). Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-10-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: introduce s3c64xx_spi_set_port_id()Tudor Ambarus
Prepare driver to get rid of the of alias ID dependency. Split the port_id logic into a dedicated method. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-9-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: let the SPI core determine the bus numberTudor Ambarus
Let the core determine the bus number, either by getting the alias ID (as the driver forces now), or by allocating a dynamic bus number when the alias is absent. Prepare the driver to allow dt aliases to be absent. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-8-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: allow FIFO depth to be determined from the compatibleTudor Ambarus
There are SoCs that use the same FIFO depth for all the instances of the SPI IP. See the fifo_lvl_mask defined for gs101 for example: .fifo_lvl_mask = { 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f}, Instead of specifying the FIFO depth with the same value for all 16 nodes in this case, allow such SoCs to infer the FIFO depth from the compatible. There are other SoCs than can benefit of this, see: {gs101, fsd, exynos850, s3c641, s3c2443}_spi_port_config. The FIFO depth inferred from the compatible has a higher precedence than the one that might be specified via device tree, the driver shall know better. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-7-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: retrieve the FIFO depth from the device treeTudor Ambarus
There are SoCs that configure different FIFO depths for their instances of the SPI IP. See the fifo_lvl_mask defined for exynos4_spi_port_config for example: .fifo_lvl_mask = { 0x1ff, 0x7F, 0x7F }, The first instance of the IP is configured with 256 bytes FIFOs, whereas the last two are configured with 64 bytes FIFOs. Instead of mangling with the .fifo_lvl_mask and its dependency of the DT alias ID, allow such SoCs to determine the FIFO depth via the ``fifo-depth`` DT property. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-6-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: determine the fifo depth only onceTudor Ambarus
Determine the FIFO depth only once, at probe time. ``sdd->fifo_depth`` can be set later on with the FIFO depth specified in the device tree. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-5-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: allow full FIFO masksTudor Ambarus
The driver is wrong because is using partial register field masks for the SPI_STATUS.{RX, TX}_FIFO_LVL register fields. We see s3c64xx_spi_port_config.fifo_lvl_mask with different values for different instances of the same IP. Take s5pv210_spi_port_config for example, it defines: .fifo_lvl_mask = { 0x1ff, 0x7F }, fifo_lvl_mask is used to determine the FIFO depth of the instance of the IP. In this case, the integrator uses a 256 bytes FIFO for the first SPI instance of the IP, and a 64 bytes FIFO for the second instance. While the first mask reflects the SPI_STATUS.{RX, TX}_FIFO_LVL register fields, the second one is two bits short. Using partial field masks is misleading and can hide problems of the driver's logic. Allow platforms to specify the full FIFO mask, regardless of the FIFO depth. Introduce {rx, tx}_fifomask to represent the SPI_STATUS.{RX, TX}_FIFO_LVL register fields. It's a shifted mask defining the field's length and position. We'll be able to deprecate the use of @rx_lvl_offset, as the shift value can be determined from the mask. The existing compatibles shall start using {rx, tx}_fifomask so that they use the full field mask and to avoid shifting the mask to position, and then shifting it back to zero in the {TX, RX}_FIFO_LVL macros. @rx_lvl_offset will be deprecated in a further patch, after we have the infrastructure to deprecate @fifo_lvl_mask as well. No functional change intended. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-4-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: s3c64xx: define a magic valueTudor Ambarus
Define a magic value, it will be used in the next patch as well. Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://msgid.link/r/20240216070555.2483977-3-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-05spi: dt-bindings: introduce FIFO depth propertiesTudor Ambarus
There are SPI IPs that can be configured by the integrator with a specific FIFO depth depending on the system's capabilities. For example, the samsung USI SPI IP can be configured by the integrator with a TX/RX FIFO from 8 byte to 256 bytes. Introduce the ``fifo-depth`` property for such instances of IPs where the same FIFO depth is used for both RX and TX. Introduce ``rx-fifo-depth`` and ``tx-fifo-depth`` properties for cases where the RX FIFO depth is different from the TX FIFO depth. Make the dedicated RX/TX properties dependent on each other and mutual exclusive with the other. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://msgid.link/r/20240216070555.2483977-2-tudor.ambarus@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>