summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-06Merge tag 'riscv-firmware-for-v6.9' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes RISC-V firmware drivers for v6.9 A single minor fix for an oversized allocation due to sizeof() misuse by yours truly that came in since I sent my last fixes PR. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'riscv-firmware-for-v6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: firmware: microchip: Fix over-requested allocation size Link: https://lore.kernel.org/r/20240305-vicinity-dumpling-8943ef26f004@spud Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-06Merge tag 'qcom-arm64-fixes-for-6.8-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes A few more Qualcomm Arm64 DeviceTree fixes for v6.8 This reduces the link speed of the PCIe bus with WiFi-card connected on the Lenovo ThinkPad X13s and the Qualcomm Compute Reference Device, avoid link errors and initialization issues reported by users. It also reverts the enablement of MPM on MSM8996, which is reported to prevent boards on this platform from booting for some users. * tag 'qcom-arm64-fixes-for-6.8-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: Revert "arm64: dts: qcom: msm8996: Hook up MPM" arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed Link: https://lore.kernel.org/r/20240306031208.4218-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-03-05selftests: avoid using SKIP(exit()) in harness fixure setupJakub Kicinski
selftest harness uses various exit codes to signal test results. Avoid calling exit() directly, otherwise tests may get broken by harness refactoring (like the commit under Fixes). SKIP() will instruct the harness that the test shouldn't run, it used to not be the case, but that has been fixed. So just return, no need to exit. Note that for hmm-tests this actually changes the result from pass to skip. Which seems fair, the test is skipped, after all. Reported-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/all/05f7bf89-04a5-4b65-bf59-c19456aeb1f0@sirena.org.uk Fixes: a724707976b0 ("selftests: kselftest_harness: use KSFT_* exit codes") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20240304233621.646054-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05Merge branch 'net-ethernet-rework-eee'Jakub Kicinski
Oleksij Rempel says: ==================== net: ethernet: Rework EEE with Andrew's permission I'll continue mainlining this patches: ============================================================== Most MAC drivers get EEE wrong. The API to the PHY is not very obvious, which is probably why. Rework the API, pushing most of the EEE handling into phylib core, leaving the MAC drivers to just enable/disable support for EEE in there change_link call back. MAC drivers are now expect to indicate to phylib if they support EEE. This will allow future patches to configure the PHY to advertise no EEE link modes when EEE is not supported. The information could also be used to enable SmartEEE if the PHY supports it. With these changes, the uAPI configuration eee_enable becomes a global on/off. tx-lpi must also be enabled before EEE is enabled. This fits the discussion here: https://lore.kernel.org/netdev/af880ce8-a7b8-138e-1ab9-8c89e662eecf@gmail.com/T/ This patchset puts in place all the infrastructure, and converts one MAC driver to the new API. Following patchsets will convert other MAC drivers, extend support into phylink, and when all MAC drivers are converted to the new scheme, clean up some unneeded code. ==================== Link: https://lore.kernel.org/r/20240302195306.3207716-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: fec: Fixup EEEAndrew Lunn
The enabling/disabling of EEE in the MAC should happen as a result of auto negotiation. So move the enable/disable into fec_enet_adjust_link() which gets called by phylib when there is a change in link status. fec_enet_set_eee() now just stores away the LPI timer value. Everything else is passed to phylib, so it can correctly setup the PHY. fec_enet_get_eee() relies on phylib doing most of the work, the MAC driver just adds the LPI timer value. Call phy_support_eee() if the quirk is present to indicate the MAC actually supports EEE. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> (On iMX8MP debix) Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://lore.kernel.org/r/20240302195306.3207716-8-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: fec: Move fec_enet_eee_mode_set() and helper earlierAndrew Lunn
FEC is about to get its EEE code re-written. To allow this, move fec_enet_eee_mode_set() before fec_enet_adjust_link() which will need to call it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://lore.kernel.org/r/20240302195306.3207716-7-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Add phy_support_eee() indicating MAC support EEEAndrew Lunn
In order for EEE to operate, both the MAC and the PHY need to support it, similar to how pause works. With some exception - a number of PHYs have SmartEEE or AutoGrEEEn support in order to provide some EEE-like power savings with non-EEE capable MACs. Copy the pause concept and add the call phy_support_eee() which the MAC makes after connecting the PHY to indicate it supports EEE. phylib will then advertise EEE when auto-neg is performed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-6-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Immediately call adjust_link if only tx_lpi_enabled changesAndrew Lunn
The MAC driver changes its EEE hardware configuration in its adjust_link callback. This is called when auto-neg completes. Disabling EEE via eee_enabled false will trigger an autoneg, and as a result the adjust_link callback will be called with phydev->enable_tx_lpi set to false. Similarly, eee_enabled set to true and with a change of advertised link modes will result in a new autoneg, and a call the adjust_link call. If set_eee is called with only a change to tx_lpi_enabled which does not trigger an auto-neg, it is necessary to call the adjust_link callback so that the MAC is reconfigured to take this change into account. When setting phydev->enable_tx_lpi, take both eee_enabled and tx_lpi_enabled into account, so the MAC drivers just needs to act on phydev->enable_tx_lpi and not the whole EEE configuration. The same check should be done for tx_lpi_timer too. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240302195306.3207716-5-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Keep track of EEE configurationAndrew Lunn
Have phylib keep track of the EEE configuration. This simplifies the MAC drivers, in that they don't need to store it. Future patches to phylib will also make use of this information to further simplify the MAC drivers. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: phy: Add phydev->enable_tx_lpi to simplify adjust link callbacksAndrew Lunn
MAC drivers which support EEE need to know the results of the EEE auto-neg in order to program the hardware to perform EEE or not. The oddly named phy_init_eee() can be used to determine this, it returns 0 if EEE should be used, or a negative error code, e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate resulted in it not being used. However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi which indicates the result of the autoneg for EEE, including if EEE is administratively disabled with ethtool. The MAC driver can then access this in the same way as link speed and duplex in the adjust link callback. If enable_tx_lpi is true, the MAC should send low power indications and does not need to consider anything else with respect to EEE. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: add helpers for EEE configurationRussell King
Add helpers that phylib and phylink can use to manage EEE configuration and determine whether the MAC should be permitted to use LPI based on that configuration. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20240302195306.3207716-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: dsa: microchip: fix register write order in ksz8_ind_write8()Tobias Jakobi (Compleo)
This bug was noticed while re-implementing parts of the kernel driver in userspace using spidev. The goal was to enable some of the errata workarounds that Microchip describes in their errata sheet [1]. Both the errata sheet and the regular datasheet of e.g. the KSZ8795 imply that you need to do this for indirect register accesses: - write a 16-bit value to a control register pair (this value consists of the indirect register table, and the offset inside the table) - either read or write an 8-bit value from the data storage register (indicated by REG_IND_BYTE in the kernel) The current implementation has the order swapped. It can be proven, by reading back some indirect register with known content (the EEE register modified in ksz8_handle_global_errata() is one of these), that this implementation does not work. Private discussion with Oleksij Rempel of Pengutronix has revealed that the workaround was apparantly never tested on actual hardware. [1] https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/Errata/KSZ87xx-Errata-DS80000687C.pdf Signed-off-by: Tobias Jakobi (Compleo) <tobias.jakobi.compleo@gmail.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Fixes: 7b6e6235b664 ("net: dsa: microchip: ksz8795: handle eee specif erratum") Link: https://lore.kernel.org/r/20240304154135.161332-1-tobias.jakobi.compleo@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05ethtool: ignore unused/unreliable fields in set_eee opHeiner Kallweit
This function is used with the set_eee() ethtool operation. Certain fields of struct ethtool_keee() are relevant only for the get_eee() operation. In addition, in case of the ioctl interface, we have no guarantee that userspace sends sane values in struct ethtool_eee. Therefore explicitly ignore all fields not needed for set_eee(). This protects from drivers trying to use unchecked and unreliable data, relying on specific userspace behavior. Note: Such unsafe driver behavior has been found and fixed in the tg3 driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/ad7ee11e-eb7a-4975-9122-547e13a161d8@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05dpll: move all dpll<>netdev helpers to dpll codeJakub Kicinski
Older versions of GCC really want to know the full definition of the type involved in rcu_assign_pointer(). struct dpll_pin is defined in a local header, net/core can't reach it. Move all the netdev <> dpll code into dpll, where the type is known. Otherwise we'd need multiple function calls to jump between the compilation units. This is the same problem the commit under fixes was trying to address, but with rcu_assign_pointer() not rcu_dereference(). Some of the exports are not needed, networking core can't be a module, we only need exports for the helpers used by drivers. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/ Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type") Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05sock: Use unsafe_memcpy() for sock_copy()Kees Cook
While testing for places where zero-sized destinations were still showing up in the kernel, sock_copy() and inet_reqsk_clone() were found, which are using very specific memcpy() offsets for both avoiding a portion of struct sock, and copying beyond the end of it (since struct sock is really just a common header before the protocol-specific allocation). Instead of trying to unravel this historical lack of container_of(), just switch to unsafe_memcpy(), since that's effectively what was happening already (memcpy() wasn't checking 0-sized destinations while the code base was being converted away from fake flexible arrays). Avoid the following false positive warning with future changes to CONFIG_FORTIFY_SOURCE: memcpy: detected field-spanning write (size 3068) of destination "&nsk->__sk_common.skc_dontcopy_end" at net/core/sock.c:2057 (size 0) Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240304212928.make.772-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: tap: Remove generic .ndo_get_stats64Breno Leitao
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64. Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240304183810.1474883-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05net: tuntap: Leverage core stats allocatorBreno Leitao
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver. With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now. Remove the allocation in the tun/tap driver and leverage the network core allocation instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240304183810.1474883-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05cpumap: Zero-initialise xdp_rxq_info struct before running XDP programToke Høiland-Jørgensen
When running an XDP program that is attached to a cpumap entry, we don't initialise the xdp_rxq_info data structure being used in the xdp_buff that backs the XDP program invocation. Tobias noticed that this leads to random values being returned as the xdp_md->rx_queue_index value for XDP programs running in a cpumap. This means we're basically returning the contents of the uninitialised memory, which is bad. Fix this by zero-initialising the rxq data structure before running the XDP program. Fixes: 9216477449f3 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Reported-by: Tobias Böhm <tobias@aibor.de> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20240305213132.11955-1-toke@redhat.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-03-05selftests/bpf: Fix up xdp bonding test wrt feature flagsDaniel Borkmann
Adjust the XDP feature flags for the bond device when no bond slave devices are attached. After 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY"), the empty bond device must report 0 as flags instead of NETDEV_XDP_ACT_MASK. # ./vmtest.sh -- ./test_progs -t xdp_bond [...] [ 3.983311] bond1 (unregistering): (slave veth1_1): Releasing backup interface [ 3.995434] bond1 (unregistering): Released all slaves [ 4.022311] bond2: (slave veth2_1): Releasing backup interface #507/1 xdp_bonding/xdp_bonding_attach:OK #507/2 xdp_bonding/xdp_bonding_nested:OK #507/3 xdp_bonding/xdp_bonding_features:OK #507/4 xdp_bonding/xdp_bonding_roundrobin:OK #507/5 xdp_bonding/xdp_bonding_activebackup:OK #507/6 xdp_bonding/xdp_bonding_xor_layer2:OK #507/7 xdp_bonding/xdp_bonding_xor_layer23:OK #507/8 xdp_bonding/xdp_bonding_xor_layer34:OK #507/9 xdp_bonding/xdp_bonding_redirect_multi:OK #507 xdp_bonding:OK Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED [ 4.185255] bond2 (unregistering): Released all slaves [...] Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Message-ID: <20240305090829.17131-2-daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-05xdp, bonding: Fix feature flags when there are no slave devs anymoreDaniel Borkmann
Commit 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY") changed the driver from reporting everything as supported before a device was bonded into having the driver report that no XDP feature is supported until a real device is bonded as it seems to be more truthful given eventually real underlying devices decide what XDP features are supported. The change however did not take into account when all slave devices get removed from the bond device. In this case after 9b0ed890ac2a, the driver keeps reporting a feature mask of 0x77, that is, NETDEV_XDP_ACT_MASK & ~NETDEV_XDP_ACT_XSK_ZEROCOPY whereas it should have reported a feature mask of 0. Fix it by resetting XDP feature flags in the same way as if no XDP program is attached to the bond device. This was uncovered by the XDP bond selftest which let BPF CI fail. After adjusting the starting masks on the latter to 0 instead of NETDEV_XDP_ACT_MASK the test passes again together with this fix. Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Cc: Prashant Batra <prbatra.mail@gmail.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Message-ID: <20240305090829.17131-1-daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-05Merge branch 'check-bpf_func_state-callback_depth-when-pruning-states'Alexei Starovoitov
Eduard Zingerman says: ==================== check bpf_func_state->callback_depth when pruning states This patch-set fixes bug in states pruning logic hit in mailing list discussion [0]. The details of the fix are in patch #1. The main idea for the fix belongs to Yonghong Song, mine contribution is merely in review and test cases. There are some changes in verification performance: File Program Insns (DIFF) States (DIFF) ------------------------- ------------- --------------- -------------- pyperf600_bpf_loop.bpf.o on_event +15 (+0.42%) +0 (+0.00%) strobemeta_bpf_loop.bpf.o on_event +857 (+37.95%) +60 (+38.96%) xdp_synproxy_kern.bpf.o syncookie_tc +2892 (+30.39%) +109 (+36.33%) xdp_synproxy_kern.bpf.o syncookie_xdp +2892 (+30.01%) +109 (+36.09%) (when tested on a subset of selftests identified by selftests/bpf/veristat.cfg and Cilium bpf object files from [4]) Changelog: v2 [2] -> v3: - fixes for verifier.c commit message as suggested by Yonghong; - patch-set re-rerouted to 'bpf' tree as suggested in [2]; - patch for test_tcp_custom_syncookie is sent separately to 'bpf-next' [3]. - veristat results updated using 'bpf' tree as baseline and clang 16. v1 [1] -> v2: - patch #2 commit message updated to better reflect verifier behavior with regards to checkpoints tree (suggested by Yonghong); - veristat results added (suggested by Andrii). [0] https://lore.kernel.org/bpf/9b251840-7cb8-4d17-bd23-1fc8071d8eef@linux.dev/ [1] https://lore.kernel.org/bpf/20240212143832.28838-1-eddyz87@gmail.com/ [2] https://lore.kernel.org/bpf/20240216150334.31937-1-eddyz87@gmail.com/ [3] https://lore.kernel.org/bpf/20240222150300.14909-1-eddyz87@gmail.com/ [4] https://github.com/anakryiko/cilium ==================== Link: https://lore.kernel.org/r/20240222154121.6991-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-05selftests/bpf: test case for callback_depth states pruning logicEduard Zingerman
The test case was minimized from mailing list discussion [0]. It is equivalent to the following C program: struct iter_limit_bug_ctx { __u64 a; __u64 b; __u64 c; }; static __naked void iter_limit_bug_cb(void) { switch (bpf_get_prandom_u32()) { case 1: ctx->a = 42; break; case 2: ctx->b = 42; break; default: ctx->c = 42; break; } } int iter_limit_bug(struct __sk_buff *skb) { struct iter_limit_bug_ctx ctx = { 7, 7, 7 }; bpf_loop(2, iter_limit_bug_cb, &ctx, 0); if (ctx.a == 42 && ctx.b == 42 && ctx.c == 7) asm volatile("r1 /= 0;":::"r1"); return 0; } The main idea is that each loop iteration changes one of the state variables in a non-deterministic manner. Hence it is premature to prune the states that have two iterations left comparing them to states with one iteration left. E.g. {{7,7,7}, callback_depth=0} can reach state {42,42,7}, while {{7,7,7}, callback_depth=1} can't. [0] https://lore.kernel.org/bpf/9b251840-7cb8-4d17-bd23-1fc8071d8eef@linux.dev/ Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240222154121.6991-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-05bpf: check bpf_func_state->callback_depth when pruning statesEduard Zingerman
When comparing current and cached states verifier should consider bpf_func_state->callback_depth. Current state cannot be pruned against cached state, when current states has more iterations left compared to cached state. Current state has more iterations left when it's callback_depth is smaller. Below is an example illustrating this bug, minimized from mailing list discussion [0] (assume that BPF_F_TEST_STATE_FREQ is set). The example is not a safe program: if loop_cb point (1) is followed by loop_cb point (2), then division by zero is possible at point (4). struct ctx { __u64 a; __u64 b; __u64 c; }; static void loop_cb(int i, struct ctx *ctx) { /* assume that generated code is "fallthrough-first": * if ... == 1 goto * if ... == 2 goto * <default> */ switch (bpf_get_prandom_u32()) { case 1: /* 1 */ ctx->a = 42; return 0; break; case 2: /* 2 */ ctx->b = 42; return 0; break; default: /* 3 */ ctx->c = 42; return 0; break; } } SEC("tc") __failure __flag(BPF_F_TEST_STATE_FREQ) int test(struct __sk_buff *skb) { struct ctx ctx = { 7, 7, 7 }; bpf_loop(2, loop_cb, &ctx, 0); /* 0 */ /* assume generated checks are in-order: .a first */ if (ctx.a == 42 && ctx.b == 42 && ctx.c == 7) asm volatile("r0 /= 0;":::"r0"); /* 4 */ return 0; } Prior to this commit verifier built the following checkpoint tree for this example: .------------------------------------- Checkpoint / State name | .-------------------------------- Code point number | | .---------------------------- Stack state {ctx.a,ctx.b,ctx.c} | | | .------------------- Callback depth in frame #0 v v v v - (0) {7P,7P,7},depth=0 - (3) {7P,7P,7},depth=1 - (0) {7P,7P,42},depth=1 - (3) {7P,7,42},depth=2 - (0) {7P,7,42},depth=2 loop terminates because of depth limit - (4) {7P,7,42},depth=0 predicted false, ctx.a marked precise - (6) exit (a) - (2) {7P,7,42},depth=2 - (0) {7P,42,42},depth=2 loop terminates because of depth limit - (4) {7P,42,42},depth=0 predicted false, ctx.a marked precise - (6) exit (b) - (1) {7P,7P,42},depth=2 - (0) {42P,7P,42},depth=2 loop terminates because of depth limit - (4) {42P,7P,42},depth=0 predicted false, ctx.{a,b} marked precise - (6) exit - (2) {7P,7,7},depth=1 considered safe, pruned using checkpoint (a) (c) - (1) {7P,7P,7},depth=1 considered safe, pruned using checkpoint (b) Here checkpoint (b) has callback_depth of 2, meaning that it would never reach state {42,42,7}. While checkpoint (c) has callback_depth of 1, and thus could yet explore the state {42,42,7} if not pruned prematurely. This commit makes forbids such premature pruning, allowing verifier to explore states sub-tree starting at (c): (c) - (1) {7,7,7P},depth=1 - (0) {42P,7,7P},depth=1 ... - (2) {42,7,7},depth=2 - (0) {42,42,7},depth=2 loop terminates because of depth limit - (4) {42,42,7},depth=0 predicted true, ctx.{a,b,c} marked precise - (5) division by zero [0] https://lore.kernel.org/bpf/9b251840-7cb8-4d17-bd23-1fc8071d8eef@linux.dev/ Fixes: bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations") Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240222154121.6991-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-06riscv: dts: Move BUILTIN_DTB_SOURCE to common KconfigYangyu Chen
The BUILTIN_DTB_SOURCE was only configured for K210 before. Since SOC_BUILTIN_DTB_DECLARE was removed at commit d5805af9fe9f ("riscv: Fix builtin DTB handling") from patch [1], the kernel cannot choose one of the dtbs from then on and always take the first one dtb to use. Then, another commit 0ddd7eaffa64 ("riscv: Fix BUILTIN_DTB for sifive and microchip soc") from patch [2] supports BUILTIN_DTB_SOURCE for other SoCs. However, this feature will only work if the Kconfig we use links the dtb we expected in the first place as mentioned in the thread [3]. Thus, a config BUILTIN_DTB_SOURCE is needed for all SoCs to choose one dtb to use. For some considerations, this patch also removes default y if XIP_KERNEL for BUILTIN_DTB, as this requires setting a proper dtb to use on the BUILTIN_DTB_SOURCE, else the kernel with XIP but does not set BUILTIN_DTB_SOURCE or unselect BUILTIN_DTB will not boot. Also, this patch removes the default dtb string for k210 from Kconfig to nommu_k210_defconfig and nommu_k210_sdcard_defconfig to avoid complex Kconfig settings for other SoCs in the future. [1] https://lore.kernel.org/linux-riscv/20201208073355.40828-5-damien.lemoal@wdc.com/ [2] https://lore.kernel.org/linux-riscv/20210604120639.1447869-1-alex@ghiti.fr/ [3] https://lore.kernel.org/linux-riscv/CAK7LNATt_56mO2Le4v4EnPnAfd3gC8S_Sm5-GCsfa=qXy=8Lrg@mail.gmail.com/ Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-03-05selftests/exec: Perform script checks with /bin/bashKees Cook
It seems some shells linked to /bin/sh don't have consistent behavior with error codes on execution failures. Explicitly use /bin/bash so that "not found" errors are correctly generated. Repeating the comment from the test: /* * Execute as a long pathname relative to "/". If this is a script, * the interpreter will launch but fail to open the script because its * name ("/dev/fd/5/xxx....") is bigger than PATH_MAX. * * The failure code is usually 127 (POSIX: "If a command is not found, * the exit status shall be 127."), but some systems give 126 (POSIX: * "If the command name is found, but it is not an executable utility, * the exit status shall be 126."), so allow either. */ Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Closes: https://lore.kernel.org/lkml/02c8bf8e-1934-44ab-a886-e065b37366a7@collabora.com/ Signed-off-by: Kees Cook <keescook@chromium.org> --- Cc: Eric Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: linux-mm@kvack.org Cc: linux-kselftest@vger.kernel.org
2024-03-05rxrpc: Extract useful fields from a received ACK to skb priv dataDavid Howells
Extract useful fields from a received ACK packet into the skb private data early on in the process of parsing incoming packets. This makes the ACK fields available even before we've matched the ACK up to a call and will allow us to deal with path MTU discovery probe responses even after the relevant call has been completed. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Clean up the resend algorithmDavid Howells
Clean up the DATA packet resending algorithm to retransmit packets as we come across them whilst walking the transmission buffer rather than queuing them for retransmission at the end. This can be done as ACK parsing - and thus the discarding of successful packets - is now done in the same thread rather than separately in softirq context and a locked section is no longer required. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Record probes after transmission and reduce number of time-getsDavid Howells
Move the recording of a successfully transmitted DATA or ACK packet that will provide RTT probing to after the transmission. With the I/O thread model, this can be done because parsing of the responding ACK can no longer race with the post-transmission code. Move the various timeout-settings done after successfully transmitting a DATA packet into rxrpc_tstamp_data_packets() and eliminate a number of calls to get the current time. As a consequence we no longer need to cancel a proposed RTT probe on transmission failure. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Use ktimes for call timeout tracking and set the timer lazilyDavid Howells
Track the call timeouts as ktimes rather than jiffies as the latter's granularity is too high and only set the timer at the end of the event handling function. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Differentiate PING ACK transmission traces.David Howells
There are three points that transmit PING ACKs and all of them use the same trace string. Change two of them to use different strings. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Don't permit resending after all Tx packets ackedDavid Howells
Once all the packets transmitted as part of a call have been acked, don't permit any resending. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Parse received packets before dealing with timeoutsDavid Howells
Parse the received packets before going and processing timeouts as the timeouts may be reset by the reception of a packet. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05rxrpc: Do zerocopy using MSG_SPLICE_PAGES and page fragsDavid Howells
Switch from keeping the transmission buffers in the rxrpc_txbuf struct and allocated from the slab, to allocating them using page fragment allocators (which uses raw pages), thereby allowing them to be passed to MSG_SPLICE_PAGES and avoid copying into the UDP buffers. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2024-03-05Merge tag 'cgroup-for-6.8-rc7-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two cpuset fixes. Both are for bugs in error handling paths and low risk" * tag 'cgroup-for-6.8-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: Fix retval in update_cpumask() cgroup/cpuset: Fix a memory leak in update_exclusive_cpumask()
2024-03-05Merge tag 'integrity-v6.8-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fix from Mimi Zohar: "A single fix to eliminate an unnecessary message" * tag 'integrity-v6.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: eliminate unnecessary "Problem loading X.509 certificate" msg
2024-03-05Merge branch 'dmraid-fix-6.9' into md-6.9Song Liu
This is the second half of fixes for dmraid. The first half is available at [1]. This set contains fixes: - reshape can start unexpected, cause data corruption, patch 1,5,6; - deadlocks that reshape concurrent with IO, patch 8; - a lockdep warning, patch 9; For all the dmraid related tests in lvm2 suite, there is no new regressions compared against 6.6 kernels (which is good baseline before recent regressions). [1] https://lore.kernel.org/all/CAPhsuW7u1UKHCDOBDhD7DzOVtkGemDz_QnJ4DUq_kSN-Q3G66Q@mail.gmail.com/ * dmraid-fix-6.9: dm-raid: fix lockdep waring in "pers->hot_add_disk" dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape dm-raid: add a new helper prepare_suspend() in md_personality md/dm-raid: don't call md_reap_sync_thread() directly dm-raid: really frozen sync_thread during suspend md: add a new helper reshape_interrupted() md: export helper md_is_rdwr() md: export helpers to stop sync_thread md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume
2024-03-05dm-raid: fix lockdep waring in "pers->hot_add_disk"Yu Kuai
The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'. "pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock. Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
2024-03-05dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent ↵Yu Kuai
with reshape For raid456, if reshape is still in progress, then IO across reshape position will wait for reshape to make progress. However, for dm-raid, in following cases reshape will never make progress hence IO will hang: 1) the array is read-only; 2) MD_RECOVERY_WAIT is set; 3) MD_RECOVERY_FROZEN is set; After commit c467e97f079f ("md/raid6: use valid sector values to determine if an I/O should wait on the reshape") fix the problem that IO across reshape position doesn't wait for reshape, the dm-raid test shell/lvconvert-raid-reshape.sh start to hang: [root@fedora ~]# cat /proc/979/stack [<0>] wait_woken+0x7d/0x90 [<0>] raid5_make_request+0x929/0x1d70 [raid456] [<0>] md_handle_request+0xc2/0x3b0 [md_mod] [<0>] raid_map+0x2c/0x50 [dm_raid] [<0>] __map_bio+0x251/0x380 [dm_mod] [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod] [<0>] __submit_bio+0xc2/0x1c0 [<0>] submit_bio_noacct_nocheck+0x17f/0x450 [<0>] submit_bio_noacct+0x2bc/0x780 [<0>] submit_bio+0x70/0xc0 [<0>] mpage_readahead+0x169/0x1f0 [<0>] blkdev_readahead+0x18/0x30 [<0>] read_pages+0x7c/0x3b0 [<0>] page_cache_ra_unbounded+0x1ab/0x280 [<0>] force_page_cache_ra+0x9e/0x130 [<0>] page_cache_sync_ra+0x3b/0x110 [<0>] filemap_get_pages+0x143/0xa30 [<0>] filemap_read+0xdc/0x4b0 [<0>] blkdev_read_iter+0x75/0x200 [<0>] vfs_read+0x272/0x460 [<0>] ksys_read+0x7a/0x170 [<0>] __x64_sys_read+0x1c/0x30 [<0>] do_syscall_64+0xc6/0x230 [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74 This is because reshape can't make progress. For md/raid, the problem doesn't exist because register new sync_thread doesn't rely on the IO to be done any more: 1) If array is read-only, it can switch to read-write by ioctl/sysfs; 2) md/raid never set MD_RECOVERY_WAIT; 3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold 'reconfig_mutex', hence it can be cleared and reshape can continue by sysfs api 'sync_action'. However, I'm not sure yet how to avoid the problem in dm-raid yet. This patch on the one hand make sure raid_message() can't change sync_thread() through raid_message() after presuspend(), on the other hand detect the above 3 cases before wait for IO do be done in dm_suspend(), and let dm-raid requeue those IO. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
2024-03-05dm-raid: add a new helper prepare_suspend() in md_personalityYu Kuai
There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com
2024-03-05md/dm-raid: don't call md_reap_sync_thread() directlyYu Kuai
Currently md_reap_sync_thread() is called from raid_message() directly without holding 'reconfig_mutex', this is definitely unsafe because md_reap_sync_thread() can change many fields that is protected by 'reconfig_mutex'. However, hold 'reconfig_mutex' here is still problematic because this will cause deadlock, for example, commit 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock"). Fix this problem by using stop_sync_thread() to unregister sync_thread, like md/raid did. Fixes: be83651f0050 ("DM RAID: Add message/status support for changing sync action") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-7-yukuai1@huaweicloud.com
2024-03-05dm-raid: really frozen sync_thread during suspendYu Kuai
1) commit f52f5c71f3d4 ("md: fix stopping sync thread") remove MD_RECOVERY_FROZEN from __md_stop_writes() and doesn't realize that dm-raid relies on __md_stop_writes() to frozen sync_thread indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in md_stop_writes(), and since stop_sync_thread() is only used for dm-raid in this case, also move stop_sync_thread() to md_stop_writes(). 2) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen, it only prevent new sync_thread to start, and it can't stop the running sync thread; In order to frozen sync_thread, after seting the flag, stop_sync_thread() should be used. 3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use it as condition for md_stop_writes() in raid_postsuspend() doesn't look correct. Consider that reentrant stop_sync_thread() do nothing, always call md_stop_writes() in raid_postsuspend(). 4) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime, and if MD_RECOVERY_FROZEN is cleared while the array is suspended, new sync_thread can start unexpected. Fix this by disallow raid_message() to change sync_thread status during suspend. Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(), and with previous fixes, the test won't hang there anymore, however, the test will still fail and complain that ext4 is corrupted. And with this patch, the test won't hang due to stop_sync_thread() or fail due to ext4 is corrupted anymore. However, there is still a deadlock related to dm-raid456 that will be fixed in following patches. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/ Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_writes()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-6-yukuai1@huaweicloud.com
2024-03-05md: add a new helper reshape_interrupted()Yu Kuai
The helper will be used for dm-raid456 later to detect the case that reshape can't make progress. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-5-yukuai1@huaweicloud.com
2024-03-05md: export helper md_is_rdwr()Yu Kuai
There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-4-yukuai1@huaweicloud.com
2024-03-05md: export helpers to stop sync_threadYu Kuai
Add new helpers: void md_idle_sync_thread(struct mddev *mddev); void md_frozen_sync_thread(struct mddev *mddev); void md_unfrozen_sync_thread(struct mddev *mddev); The helpers will be used in dm-raid in later patches to fix regressions and prevent calling md_reap_sync_thread() directly. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-3-yukuai1@huaweicloud.com
2024-03-05md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resumeYu Kuai
After commit 9dbd1aa3a81c ("dm raid: add reshaping support to the target") raid_ctr() will set MD_RECOVERY_FROZEN before md_run() and expect to keep array frozen until resume. However, md_run() will clear the flag by setting mddev->recovery to 0. Before commit 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()"), dm-raid actually relied on suspending to prevent starting new sync_thread. Fix this problem by keeping 'MD_RECOVERY_FROZEN' for dm-raid in md_run(). Fixes: 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240305072306.2562024-2-yukuai1@huaweicloud.com
2024-03-05Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""Song Liu
This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d. The original set [1][2] was expected to undo a suboptimal fix in [2], and replace it with a better fix [1]. However, as reported by Dan Moulding [2] causes an issue with raid5 with journal device. Revert [2] for now to close the issue. We will follow up on another issue reported by Juxiao Bi, as [2] is expected to fix it. We believe this is a good trade-off, because the latter issue happens less freqently. In the meanwhile, we will NOT revert [1], as it contains the right logic. [1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update") [2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") Reported-by: Dan Moulding <dan@danm.net> Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/ Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") Cc: stable@vger.kernel.org # v5.19+ Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240125082131.788600-1-song@kernel.org
2024-03-05Merge tag 'platform-drivers-x86-v6.8-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Fix P2SB regression causing ACPI errors and high CPU load - Fix error return path in amd_pmf_init_smart_pc() * tag 'platform-drivers-x86-v6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmf: Fix missing error code in amd_pmf_init_smart_pc() platform/x86: p2sb: On Goldmont only cache P2SB and SPI devfn BAR
2024-03-05Merge tag 'hyperv-fixes-signed-20240303' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Multiple fixes, cleanups and documentations for Hyper-V core code and drivers * tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: make hv_bus const x86/hyperv: Allow 15-bit APIC IDs for VTL platforms x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad() x86/mm: Regularize set_memory_p() parameters and make non-static x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback Documentation: hyperv: Add overview of PCI pass-thru device support Drivers: hv: vmbus: Update indentation in create_gpadl_header() Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header() fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem() Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
2024-03-05ptp: fc3: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240304091325.717546-2-u.kleine-koenig@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05riscv: dts: starfive: jh7100: fix root clock namesKrzysztof Kozlowski
JH7100 clock controller driver depends on certain root clock names. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdWw0dteXO2jw4cwGvzKcL6vmnb96C=qgPgUqNDMtF6X0Q@mail.gmail.com/ Fixes: f03606470886 ("riscv: dts: starfive: replace underscores in node names") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>