summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-02-07ceph: prevent use-after-free in encode_cap_msg()Rishabh Dave
In fs/ceph/caps.c, in encode_cap_msg(), "use after free" error was caught by KASAN at this line - 'ceph_buffer_get(arg->xattr_buf);'. This implies before the refcount could be increment here, it was freed. In same file, in "handle_cap_grant()" refcount is decremented by this line - 'ceph_buffer_put(ci->i_xattrs.blob);'. It appears that a race occurred and resource was freed by the latter line before the former line could increment it. encode_cap_msg() is called by __send_cap() and __send_cap() is called by ceph_check_caps() after calling __prep_cap(). __prep_cap() is where arg->xattr_buf is assigned to ci->i_xattrs.blob. This is the spot where the refcount must be increased to prevent "use after free" error. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/59259 Signed-off-by: Rishabh Dave <ridave@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07Merge tag 'icc-6.8-rc4' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus Georgi writes: interconnect fixes for v6.8-rc These are tiny fixes for reported issues in driver code for a few platforms. One of them sorts out a hang issue, the other improves the power consumption and the rest are fixing some bitmasks to make sure the hardware does thing right. - interconnect: qcom: sc8180x: Mark CO0 BCM keepalive - interconnect: qcom: sm8550: Enable sync_state - interconnect: qcom: sm8650: Use correct ACV enable_mask - interconnect: qcom: x1e80100: Add missing ACV enable_mask Signed-off-by: Georgi Djakov <djakov@kernel.org> * tag 'icc-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: qcom: x1e80100: Add missing ACV enable_mask interconnect: qcom: sm8650: Use correct ACV enable_mask interconnect: qcom: sm8550: Enable sync_state interconnect: qcom: sc8180x: Mark CO0 BCM keepalive
2024-02-07ceph: always set initial i_blkbits to CEPH_FSCRYPT_BLOCK_SHIFTXiubo Li
The fscrypt code will use i_blkbits to setup ci_data_unit_bits when allocating the new inode, but ceph will initiate i_blkbits ater when filling the inode, which is too late. Since ci_data_unit_bits will only be used by the fscrypt framework so initiating i_blkbits with CEPH_FSCRYPT_BLOCK_SHIFT is safe. Link: https://tracker.ceph.com/issues/64035 Fixes: 5b1188847180 ("fscrypt: support crypto data unit size less than filesystem block size") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07libceph: just wait for more data to be available on the socketXiubo Li
A short read may occur while reading the message footer from the socket. Later, when the socket is ready for another read, the messenger invokes all read_partial_*() handlers, including read_partial_sparse_msg_data(). The expectation is that read_partial_sparse_msg_data() would bail, allowing the messenger to invoke read_partial() for the footer and pick up where it left off. However read_partial_sparse_msg_data() violates that and ends up calling into the state machine in the OSD client. The sparse-read state machine assumes that it's a new op and interprets some piece of the footer as the sparse-read header and returns bogus extents/data length, etc. To determine whether read_partial_sparse_msg_data() should bail, let's reuse cursor->total_resid. Because once it reaches to zero that means all the extents and data have been successfully received in last read, else it could break out when partially reading any of the extents and data. And then osd_sparse_read() could continue where it left off. [ idryomov: changelog ] Link: https://tracker.ceph.com/issues/63586 Fixes: d396f89db39a ("libceph: add sparse read support to msgr1") Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07libceph: rename read_sparse_msg_*() to read_partial_sparse_msg_*()Xiubo Li
These functions are supposed to behave like other read_partial_*() handlers: the contract with messenger v1 is that the handler bails if the area of the message it's responsible for is already processed. This comes up when handling short reads from the socket. [ idryomov: changelog ] Signed-off-by: Xiubo Li <xiubli@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07libceph: fail sparse-read if the data length doesn't matchXiubo Li
Once this happens that means there have bugs. Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-02-07selftests: cmsg_ipv6: repeat the exact packetJakub Kicinski
cmsg_ipv6 test requests tcpdump to capture 4 packets, and sends until tcpdump quits. Only the first packet is "real", however, and the rest are basic UDP packets. So if tcpdump doesn't start in time it will miss the real packet and only capture the UDP ones. This makes the test fail on slow machine (no KVM or with debug enabled) 100% of the time, while it passes in fast environments. Repeat the "real" / expected packet. Fixes: 9657ad09e1fa ("selftests: net: test IPV6_TCLASS") Fixes: 05ae83d5a4a2 ("selftests: net: test IPV6_HOPLIMIT") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-07drm: xlnx: zynqmp_dpsub: Filter interrupts against maskAnatoliy Klymenko
Filter out status register against the interrupts' mask. Some events are being reported via DP status register, even if corresponding interrupts have been disabled. One instance of such event leads to generation of VBLANK when the driver is in DRM bridge mode, which in turn results in NULL pointer dereferencing. We should avoid processing such events in an interrupt handler context. This problem is less noticeable when the driver operates in DMA mode, as in this case we have DRM CRTC object instantiated and DRM framework simply discards unwanted VBLANKs in drm_handle_vblank(). Signed-off-by: Anatoliy Klymenko <anatoliy.klymenko@amd.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124025402.373620-5-anatoliy.klymenko@amd.com
2024-02-07drm: xlnx: zynqmp_dpsub: Clear status register ASAPAnatoliy Klymenko
Clear status register as soon as we read it. Addressing comments from https://lore.kernel.org/dri-devel/beb551c7-bb7e-4cd0-b166-e9aad90c4620@ideasonboard.com/ Signed-off-by: Anatoliy Klymenko <anatoliy.klymenko@amd.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124025402.373620-4-anatoliy.klymenko@amd.com
2024-02-07drm: xlnx: zynqmp_dpsub: Fix timing for live modeAnatoliy Klymenko
Expect external video timing in live video input mode, program DPSUB acordingly. Signed-off-by: Anatoliy Klymenko <anatoliy.klymenko@amd.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124025402.373620-3-anatoliy.klymenko@amd.com
2024-02-07drm: xlnx: zynqmp_dpsub: Make drm bridge discoverableAnatoliy Klymenko
ZynqMP DPSUB supports 2 input modes: DMA based and live video. In the first mode, the driver implements CRTC and DP encoder DRM bridge to model the complete display pipeline. In this case, DRM bridge is being directly instantiated within the driver, not using any bridge discovery mechanisms. In the live video input mode video signal is generated by FPGA fabric and passed into DPSUB over the connected bus. In this mode driver exposes the DP encoder as a DRM bridge, expecting external CRTC to discover it via drm_of_find_panel_or_bridge() or a similar call. This discovery relies on drm_bridge.of_node being properly set. Assign device OF node to the bridge prior to registering it. This will make said bridge discoverable by an external CRTC driver. Signed-off-by: Anatoliy Klymenko <anatoliy.klymenko@amd.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124025402.373620-2-anatoliy.klymenko@amd.com
2024-02-07Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to update drm-misc-next to the state of v6.8-rc3. Also fixes a build problem with xe. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-02-07regmap: kunit: fix raw noinc write test wrappingBen Wolsieffer
The raw noinc write test places a known value in the register following the noinc register to verify that it is not disturbed by the noinc write. This test ensures this value is distinct by adding 100 to the second element of the noinc write data. The regmap registers are 16-bit, while the test value is stored in an unsigned int. Therefore, adding 100 may cause the register to wrap while the test value does not, causing the test to fail. This patch fixes this by changing val_test and val_last from unsigned int to u16. Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/linux-kernel/745d3a11-15bc-48b6-84c8-c8761c943bed@roeck-us.net/T/ Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20240206151004.1636761-2-ben.wolsieffer@hefring.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-02-07drm: mipi-dsi: make mipi_dsi_bus_type constRicardo B. Marliere
Now that the driver core can properly handle constant struct bus_type, move the mipi_dsi_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-gpu-v1-2-1b6ecdb5f941@marliere.net Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240203-bus_cleanup-gpu-v1-2-1b6ecdb5f941@marliere.net
2024-02-07drm: display: make dp_aux_bus_type constRicardo B. Marliere
Now that the driver core can properly handle constant struct bus_type, move the dp_aux_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-gpu-v1-1-1b6ecdb5f941@marliere.net Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240203-bus_cleanup-gpu-v1-1-1b6ecdb5f941@marliere.net
2024-02-07ASoC: cs42l43: Add system suspend ops to disable IRQCharles Keepax
The IRQ should be disabled whilst entering and exiting system suspend to avoid the IRQ handler being called whilst the PM runtime is disabled. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20240206113850.719888-2-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-02-07ASoC: cs42l43: Handle error from devm_pm_runtime_enableCharles Keepax
As devm_pm_runtime_enable can fail due to memory allocations, it is best to handle the error. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20240206113850.719888-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-02-07drm/bridge: imx8mp-hdmi-pvi: Fix build warningsAdam Ford
Two separate build warnings were reported. One from an uninitialized variable, and the other from returning 0 instead of NULL from a pointer. Fixes: 059c53e877ca ("drm/bridge: imx: add driver for HDMI TX Parallel Video Interface") Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402062134.a6CqAt3s-lkp@intel.com/ Signed-off-by: Adam Ford <aford173@gmail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240207002305.618499-1-aford173@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240207002305.618499-1-aford173@gmail.com
2024-02-07drm/panel: re-alphabetize the menu listRandy Dunlap
A few of the DRM_PANEL entries have become out of alphabetical order, so move them around a bit to restore alpha order. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Jessica Zhang <quic_jesszhan@quicinc.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <dri-devel@lists.freedesktop.org> Cc: Aradhya Bhatia <a-bhatia1@ti.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240205062711.3513-1-rdunlap@infradead.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240205062711.3513-1-rdunlap@infradead.org
2024-02-07drm/panel: simple: push blanking limit on RK32FN48HRaphael Gallais-Pou
Push horizontal front porch and vertical back porch blanking limit. This allows to get a 60 fps sharp. Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240205-ltdc_mp13-v1-5-116d43ebba75@foss.st.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240205-ltdc_mp13-v1-5-116d43ebba75@foss.st.com
2024-02-07drm/panel: simple: fix flags on RK043FN48HRaphael Gallais-Pou
DISPLAY_FLAGS_SYNC_POSEDGE is missing in the flags on the default timings. When overriding the default mode with one described in the device tree, the mode does not get acked because of this missing flag. Moreover since the panel is driven by the positive edge it makes sense to add it here. Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240205-ltdc_mp13-v1-4-072d24bf1b36@foss.st.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240205-ltdc_mp13-v1-4-072d24bf1b36@foss.st.com
2024-02-07net: stmmac: protect updates of 64-bit statistics countersPetr Tesarik
As explained by a comment in <linux/u64_stats_sync.h>, write side of struct u64_stats_sync must ensure mutual exclusion, or one seqcount update could be lost on 32-bit platforms, thus blocking readers forever. Such lockups have been observed in real world after stmmac_xmit() on one CPU raced with stmmac_napi_poll_tx() on another CPU. To fix the issue without introducing a new lock, split the statics into three parts: 1. fields updated only under the tx queue lock, 2. fields updated only during NAPI poll, 3. fields updated only from interrupt context, Updates to fields in the first two groups are already serialized through other locks. It is sufficient to split the existing struct u64_stats_sync so that each group has its own. Note that tx_set_ic_bit is updated from both contexts. Split this counter so that each context gets its own, and calculate their sum to get the total value in stmmac_get_ethtool_stats(). For the third group, multiple interrupts may be processed by different CPUs at the same time, but interrupts on the same CPU will not nest. Move fields from this group to a newly created per-cpu struct stmmac_pcpu_stats. Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary") Link: https://lore.kernel.org/netdev/Za173PhviYg-1qIn@torres.zugschlus.de/t/ Cc: stable@vger.kernel.org Signed-off-by: Petr Tesarik <petr@tesarici.cz> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-07Merge tag 'for-6.8-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - two fixes preventing deletion and manual creation of subvolume qgroup - unify error code returned for unknown send flags - fix assertion during subvolume creation when anonymous device could be allocated by other thread (e.g. due to backref walk) * tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not ASSERT() if the newly created subvolume already got read btrfs: forbid deleting live subvol qgroup btrfs: forbid creating subvol qgroups btrfs: send: return EOPNOTSUPP on unknown flags
2024-02-07drm/i915/alpm: Alpm aux wake configuration for lnlJouni Högander
Lunarlake has some configurations in ALPM_CTL register for legacy ALPM as well. Write these. Bspec: 71477 v2: move version check to lnl_alpm_configure Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130111130.3298779-5-jouni.hogander@intel.com
2024-02-07drm/i915/alpm: Calculate ALPM Entry checkJouni Högander
ALPM Entry Check represents the number of lines needed to put the main link to sleep and keep it in the sleep state before it can be taken out of the SLEEP state (eDP requires the main link to be in the SLEEP state for a minimum of 5us). Bspec: 71477 v2: move display version check into _lnl_compute_alpm_param Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130111130.3298779-4-jouni.hogander@intel.com
2024-02-07drm/i915/psr: Add alpm_parameters structJouni Högander
Add new alpm_parameters struct into intel_psr for all calculated alpm parameters. v2: Move alpm_parameters struct definition to intel_psr struct Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130111130.3298779-3-jouni.hogander@intel.com
2024-02-07drm/i915/alpm: Add ALPM register definitionsJouni Högander
Add ALPM register definitions for Lunar Lake. v3: - Fix ALPM_CTL2_A address - Remove duplicate defines v2: - Use REG_BIT instead of BIT - Add commit message Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130111130.3298779-2-jouni.hogander@intel.com
2024-02-06ppp_async: limit MRU to 64KEric Dumazet
syzbot triggered a warning [1] in __alloc_pages(): WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER, gfp) Willem fixed a similar issue in commit c0a2a1b0d631 ("ppp: limit MRU to 64K") Adopt the same sanity check for ppp_async_ioctl(PPPIOCSMRU) [1]: WARNING: CPU: 1 PID: 11 at mm/page_alloc.c:4543 __alloc_pages+0x308/0x698 mm/page_alloc.c:4543 Modules linked in: CPU: 1 PID: 11 Comm: kworker/u4:0 Not tainted 6.8.0-rc2-syzkaller-g41bccc98fb79 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Workqueue: events_unbound flush_to_ldisc pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __alloc_pages+0x308/0x698 mm/page_alloc.c:4543 lr : __alloc_pages+0xc8/0x698 mm/page_alloc.c:4537 sp : ffff800093967580 x29: ffff800093967660 x28: ffff8000939675a0 x27: dfff800000000000 x26: ffff70001272ceb4 x25: 0000000000000000 x24: ffff8000939675c0 x23: 0000000000000000 x22: 0000000000060820 x21: 1ffff0001272ceb8 x20: ffff8000939675e0 x19: 0000000000000010 x18: ffff800093967120 x17: ffff800083bded5c x16: ffff80008ac97500 x15: 0000000000000005 x14: 1ffff0001272cebc x13: 0000000000000000 x12: 0000000000000000 x11: ffff70001272cec1 x10: 1ffff0001272cec0 x9 : 0000000000000001 x8 : ffff800091c91000 x7 : 0000000000000000 x6 : 000000000000003f x5 : 00000000ffffffff x4 : 0000000000000000 x3 : 0000000000000020 x2 : 0000000000000008 x1 : 0000000000000000 x0 : ffff8000939675e0 Call trace: __alloc_pages+0x308/0x698 mm/page_alloc.c:4543 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __kmalloc_large_node+0xbc/0x1fc mm/slub.c:3926 __do_kmalloc_node mm/slub.c:3969 [inline] __kmalloc_node_track_caller+0x418/0x620 mm/slub.c:4001 kmalloc_reserve+0x17c/0x23c net/core/skbuff.c:590 __alloc_skb+0x1c8/0x3d8 net/core/skbuff.c:651 __netdev_alloc_skb+0xb8/0x3e8 net/core/skbuff.c:715 netdev_alloc_skb include/linux/skbuff.h:3235 [inline] dev_alloc_skb include/linux/skbuff.h:3248 [inline] ppp_async_input drivers/net/ppp/ppp_async.c:863 [inline] ppp_asynctty_receive+0x588/0x186c drivers/net/ppp/ppp_async.c:341 tty_ldisc_receive_buf+0x12c/0x15c drivers/tty/tty_buffer.c:390 tty_port_default_receive_buf+0x74/0xac drivers/tty/tty_port.c:37 receive_buf drivers/tty/tty_buffer.c:444 [inline] flush_to_ldisc+0x284/0x6e4 drivers/tty/tty_buffer.c:494 process_one_work+0x694/0x1204 kernel/workqueue.c:2633 process_scheduled_works kernel/workqueue.c:2706 [inline] worker_thread+0x938/0xef4 kernel/workqueue.c:2787 kthread+0x288/0x310 kernel/kthread.c:388 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+c5da1f087c9e4ec6c933@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240205171004.1059724-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-06devlink: avoid potential loop in devlink_rel_nested_in_notify_work()Jiri Pirko
In case devlink_rel_nested_in_notify_work() can not take the devlink lock mutex. Convert the work to delayed work and in case of reschedule do it jiffie later and avoid potential looping. Suggested-by: Paolo Abeni <pabeni@redhat.com> Fixes: c137743bce02 ("devlink: introduce object and nested devlink relationship infra") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240205171114.338679-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-06af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.Kuniyuki Iwashima
syzbot reported a warning [0] in __unix_gc() with a repro, which creates a socketpair and sends one socket's fd to itself using the peer. socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0 sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\360", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[3]}], msg_controllen=24, msg_flags=0}, MSG_OOB|MSG_PROBE|MSG_DONTWAIT|MSG_ZEROCOPY) = 1 This forms a self-cyclic reference that GC should finally untangle but does not due to lack of MSG_OOB handling, resulting in memory leak. Recently, commit 11498715f266 ("af_unix: Remove io_uring code for GC.") removed io_uring's dead code in GC and revealed the problem. The code was executed at the final stage of GC and unconditionally moved all GC candidates from gc_candidates to gc_inflight_list. That papered over the reported problem by always making the following WARN_ON_ONCE(!list_empty(&gc_candidates)) false. The problem has been there since commit 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support") added full scm support for MSG_OOB while fixing another bug. To fix this problem, we must call kfree_skb() for unix_sk(sk)->oob_skb if the socket still exists in gc_candidates after purging collected skb. Then, we need to set NULL to oob_skb before calling kfree_skb() because it calls last fput() and triggers unix_release_sock(), where we call duplicate kfree_skb(u->oob_skb) if not NULL. Note that the leaked socket remained being linked to a global list, so kmemleak also could not detect it. We need to check /proc/net/protocol to notice the unfreed socket. [0]: WARNING: CPU: 0 PID: 2863 at net/unix/garbage.c:345 __unix_gc+0xc74/0xe80 net/unix/garbage.c:345 Modules linked in: CPU: 0 PID: 2863 Comm: kworker/u4:11 Not tainted 6.8.0-rc1-syzkaller-00583-g1701940b1a02 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Workqueue: events_unbound __unix_gc RIP: 0010:__unix_gc+0xc74/0xe80 net/unix/garbage.c:345 Code: 8b 5c 24 50 e9 86 f8 ff ff e8 f8 e4 22 f8 31 d2 48 c7 c6 30 6a 69 89 4c 89 ef e8 97 ef ff ff e9 80 f9 ff ff e8 dd e4 22 f8 90 <0f> 0b 90 e9 7b fd ff ff 48 89 df e8 5c e7 7c f8 e9 d3 f8 ff ff e8 RSP: 0018:ffffc9000b03fba0 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffc9000b03fc10 RCX: ffffffff816c493e RDX: ffff88802c02d940 RSI: ffffffff896982f3 RDI: ffffc9000b03fb30 RBP: ffffc9000b03fce0 R08: 0000000000000001 R09: fffff52001607f66 R10: 0000000000000003 R11: 0000000000000002 R12: dffffc0000000000 R13: ffffc9000b03fc10 R14: ffffc9000b03fc10 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005559c8677a60 CR3: 000000000d57a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> process_one_work+0x889/0x15e0 kernel/workqueue.c:2633 process_scheduled_works kernel/workqueue.c:2706 [inline] worker_thread+0x8b9/0x12a0 kernel/workqueue.c:2787 kthread+0x2c6/0x3b0 kernel/kthread.c:388 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:242 </TASK> Reported-by: syzbot+fa3ef895554bdbfd1183@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fa3ef895554bdbfd1183 Fixes: 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240203183149.63573-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-06riscv: Flush the tlb when a page directory is freedAlexandre Ghiti
The riscv privileged specification mandates to flush the TLB whenever a page directory is modified, so add that to tlb_flush(). Fixes: c5e9b2c2ae82 ("riscv: Improve tlb_flush()") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240128120405.25876-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-06kunit: device: Unregister the kunit_bus on shutdownDavid Gow
If KUnit is built as a module, and it's unloaded, the kunit_bus is not unregistered. This causes an error if it's then re-loaded later, as we try to re-register the bus. Unregister the bus and root_device on shutdown, if it looks valid. In addition, be more specific about the value of kunit_bus_device. It is: - a valid struct device* if the kunit_bus initialised correctly. - an ERR_PTR if it failed to initialise. - NULL before initialisation and after shutdown. Fixes: d03c720e03bd ("kunit: Add APIs for managing devices") Signed-off-by: David Gow <davidgow@google.com> Reviewed-by: Rae Moar <rmoar@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-02-07drm/i915: Annotate more of the BIOS fb takeover failure pathsVille Syrjälä
Annotate a few more of the failure paths on the initial BIOS fb takeover to avoid having to guess why things aren't working the way we expect. Reviewed-by: Uma Shankar <uma.shankar@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-17-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Try to relocate the BIOS fb to the start of ggttVille Syrjälä
On MTL the GOP (for whatever reason) likes to bind its framebuffer high up in the ggtt address space. This can conflict with whatever ggtt_reserve_guc_top() is trying to do, and the result is that ggtt_reserve_guc_top() fails and then we proceed to explode when trying to tear down the driver. Thus far I haven't analyzed what causes the actual fireworks, but it's not super important as even if it didn't explode we'd still fail the driver load and the user would be left with an unusable GPU. To remedy this (without having to figure out exactly what ggtt_reserve_guc_top() is trying to achieve) we can attempt to relocate the BIOS framebuffer to a lower ggtt address. We can do this at this early point in driver init because nothing else is supposed to be clobbering the ggtt yet. So we simply change where in the ggtt we pin the vma, the original PTEs will be left as is, and the new PTEs will get written with the same dma addresses. The plane will keep on scanning out from the original PTEs until we are done with the whole process, and at that point we rewrite the plane's surface address register to point at the new ggtt address. Since we don't need a specific ggtt address for the plane (apart from needing it to land in the mappable region for normal stolen objects) we'll just try to pin it without a fixed offset first. It should end up at the lowest available address (which really should be 0 at this point in the driver init). If that fails we'll fall back to just pinning it exactly to the origianal address. To make sure we don't accidentlally pin it partially over the original ggtt range (as that would corrupt the original PTEs) we reserve the original range temporarily during this process. v2: Try to pin explicitly to ggtt offset 0 as otherwise DG2 puts it even higher (atm we have no PIN_LOW flag to force it low) v3: "fix" xe Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-16-ville.syrjala@linux.intel.com Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-07drm/i915: Tweak BIOS fb reuse checkVille Syrjälä
Currently we assume that we bind the BIOS fb exactly into the same ggtt address where the BIOS left it. That is about to change, and in order to keep intel_reuse_initial_plane_obj() working as intended we need to compare the original ggtt offset (called 'base' here) as opposed to the actual vma ggtt offset we selected. Otherwise the first plane could change the ggtt offset, and then subsequent planes would no longer notice that they are in fact using the same ggtt offset that the first plane was already using. Thus the reuse check will fail and we proceed to turn off these subsequent planes. TODO: would probably make more sense to do the pure readout first for all the planes, then check for fb reuse, and only then proceed to pin the object into the final location in the ggtt... v2: "fix" xe Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-15-ville.syrjala@linux.intel.com Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-07drm/i915/fbdev: Fix smem_start for LMEMBAR stolen objectsVille Syrjälä
The "io" address of an object is its dma address minus the region.start. Subtract the latter to make smem_start correct. The current code happens to work for genuine LMEM objects as LMEM region.start==0, but for LMEMBAR stolen objects region.start!=0. TODO: perhaps just set smem_start=0 always as our .fb_mmap() implementation no longer depends on it? Need to double check it's not needed for anything else... Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-14-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Simplify intel_initial_plane_config() calling conventionVille Syrjälä
There's no reason the caller of intel_initial_plane_config() should have to loop over the CRTCs. Pull the loop into the function to make life simpler for the caller. v2: "fix" xe Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-13-ville.syrjala@linux.intel.com Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-07drm/i915: Split the smem and lmem plane readout apartVille Syrjälä
Declutter initial_plane_vma() a bit by pulling the lmem and smem readout paths into their own functions. TODO: the smem path should still be fixed to get and validate the dma address from the pte as well Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-12-ville.syrjala@linux.intel.com
2024-02-07drm/i915: s/phys_base/dma_addr/Ville Syrjälä
The address we read from the PTE is a dma address, not a physical address. Rename the variable to say so. Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-11-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Fix MTL initial plane readoutVille Syrjälä
MTL stolen memory looks more like local memory, so use the (now fixed) lmem path when doing the initial plane readout. Reviewed-by: Uma Shankar <uma.shankar@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-10-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Fix region start during initial plane readoutVille Syrjälä
On MTL the stolen region starts at offset 8MiB from the start of LMEMBAR. The dma addresses are thus also offset by 8MiB. However the mm_node/etc. is zero based, and i915_pages_create_for_stolen() will add the appropriate region.start into the sg dma address. So when we do the readout we need to convert the dma address read from the PTE to be zero based as well. Note that currently we don't take this path on MTL, but we should and thus this needs to be fixed. For lmem this works correctly already as the lmem region.start==0. While at it let's also make sure the address points to somewhere within the memory region. We don't need to check the size as i915_gem_object_create_region_at() should later fail if the object size exceeds the region size. Reviewed-by: Uma Shankar <uma.shankar@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-9-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Fix PTE decode during initial plane readoutVille Syrjälä
When multiple pipes are enabled by the BIOS we try to read out each in turn. But we do the readout for the second only after the inherited vma for the first has been rebound into its original place (and thus the PTEs have been rewritten). Unlike the BIOS we set some high caching bits in the PTE on MTL which confuses the readout for the second plane. Filter out the non-address bits from the PTE value appropriately to fix this. I suppose it might also be possible that the BIOS would already set some caching bits as well, in which case we'd run into this same issue already for the first plane. TODO: - should abstract the PTE decoding to avoid details leaking all over - should probably do the readout for all the planes before we touch anything (including the PTEs) so that we truly read out the BIOS state Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-8-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Rename the DSM/GSM registersVille Syrjälä
0x108100 and 0x1080c0 have been around since snb. Rename the defines appropriately. v2: Rebase Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-7-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Disable the "binder"Ville Syrjälä
Now that the GGTT PTE updates go straight to GSMBASE (bypassing GTTMMADR) there should be no more risk of system hangs? So the "binder" (ie. update the PTEs via MI_UPDATE_GTT) is no longer necessary, disable it. My main worry with the MI_UPDATE_GTT are: - only used on this one platform so very limited testing coverage - async so more opprtunities to screw things up - what happens if the engine hangs while we're waiting for MI_UPDATE_GTT to finish? - requires working command submission, so even getting a working display now depends on a lot more extra components working correctly TODO: MI_UPDATE_GTT might be interesting as an optimization though, so perhaps someone should look into always using it (assuming the GPU is alive and well)? v2: Keep using MI_UPDATE_GTT on VM guests v3: use i915_direct_stolen_access() Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-6-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Bypass LMEMBAR/GTTMMADR for MTL stolen memory accessVille Syrjälä
On MTL accessing stolen memory via the BARs is somehow borked, and it can hang the machine. As a workaround let's bypass the BARs and just go straight to DSMBASE/GSMBASE instead. Note that on every other platform this itself would hang the machine, but on MTL the system firmware is expected to relax the access permission guarding stolen memory to enable this workaround, and thus direct CPU accesses should be fine. The raw stolen memory areas won't be passed to VMs so we'll need to risk using the BAR there for the initial setup. Once command submission is up we should switch to MI_UPDATE_GTT which at least shouldn't hang the whole machine. v2: Don't use direct GSM/DSM access on guests Add w/a number v3: Check register 0x138914 to see if pcode did its job Add some debug prints Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-5-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Remove ad-hoc lmem/stolen debugsVille Syrjälä
Now that intel_memory_regions_hw_probe() prints out each and every memory region there's no reason to have ad-hoc debugs to do similar things elsewhere. Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-4-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Print memory region info during probeVille Syrjälä
Dump the details about every memory region into dmesg at probe time. Avoids having to dig those out from random places when debugging stuff. Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-3-ville.syrjala@linux.intel.com
2024-02-07drm/i915: Use struct resource for memory region IO as wellVille Syrjälä
mem->region is a struct resource, but mem->io_start and mem->io_size are not for whatever reason. Let's unify this and convert the io stuff into a struct resource as well. Should make life a little less annoying when you don't have juggle between two different approaches all the time. Mostly done using cocci (with manual tweaks at all the places where we mutate io_size by hand): @@ struct intel_memory_region *M; expression START, SIZE; @@ - M->io_start = START; - M->io_size = SIZE; + M->io = DEFINE_RES_MEM(START, SIZE); @@ struct intel_memory_region *M; @@ - M->io_start + M->io.start @@ struct intel_memory_region M; @@ - M.io_start + M.io.start @@ expression M; @@ - M->io_size + resource_size(&M->io) @@ expression M; @@ - M.io_size + resource_size(&M.io) Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Tested-by: Paz Zcharya <pazz@chromium.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202224340.30647-2-ville.syrjala@linux.intel.com
2024-02-07firewire: core: send bus reset promptly on gap count errorTakashi Sakamoto
If we are bus manager and the bus has inconsistent gap counts, send a bus reset immediately instead of trying to read the root node's config ROM first. Otherwise, we could spend a lot of time trying to read the config ROM but never succeeding. This eliminates a 50+ second delay before the FireWire bus is usable after a newly connected device is powered on in certain circumstances. The delay occurs if a gap count inconsistency occurs, we are not the root node, and we become bus manager. One scenario that causes this is with a TI XIO2213B OHCI, the first time a Sony DSR-25 is powered on after being connected to the FireWire cable. In this configuration, the Linux box will not receive the initial PHY configuration packet sent by the DSR-25 as IRM, resulting in the DSR-25 having a gap count of 44 while the Linux box has a gap count of 63. FireWire devices have a gap count parameter, which is set to 63 on power-up and can be changed with a PHY configuration packet. This determines the duration of the subaction and arbitration gaps. For reliable communication, all nodes on a FireWire bus must have the same gap count. A node may have zero or more of the following roles: root node, bus manager (BM), isochronous resource manager (IRM), and cycle master. Unless a root node was forced with a PHY configuration packet, any node might become root node after a bus reset. Only the root node can become cycle master. If the root node is not cycle master capable, the BM or IRM should force a change of root node. After a bus reset, each node sends a self-ID packet, which contains its current gap count. A single bus reset does not change the gap count, but two bus resets in a row will set the gap count to 63. Because a consistent gap count is required for reliable communication, IEEE 1394a-2000 requires that the bus manager generate a bus reset if it detects that the gap count is inconsistent. When the gap count is inconsistent, build_tree() will notice this after the self identification process. It will set card->gap_count to the invalid value 0. If we become bus master, this will force bm_work() to send a bus reset when it performs gap count optimization. After a bus reset, there is no bus manager. We will almost always try to become bus manager. Once we become bus manager, we will first determine whether the root node is cycle master capable. Then, we will determine if the gap count should be changed. If either the root node or the gap count should be changed, we will generate a bus reset. To determine if the root node is cycle master capable, we read its configuration ROM. bm_work() will wait until we have finished trying to read the configuration ROM. However, an inconsistent gap count can make this take a long time. read_config_rom() will read the first few quadlets from the config ROM. Due to the gap count inconsistency, eventually one of the reads will time out. When read_config_rom() fails, fw_device_init() calls it again until MAX_RETRIES is reached. This takes 50+ seconds. Once we give up trying to read the configuration ROM, bm_work() will wake up, assume that the root node is not cycle master capable, and do a bus reset. Hopefully, this will resolve the gap count inconsistency. This change makes bm_work() check for an inconsistent gap count before waiting for the root node's configuration ROM. If the gap count is inconsistent, bm_work() will immediately do a bus reset. This eliminates the 50+ second delay and rapidly brings the bus to a working state. I considered that if the gap count is inconsistent, a PHY configuration packet might not be successful, so it could be desirable to skip the PHY configuration packet before the bus reset in this case. However, IEEE 1394a-2000 and IEEE 1394-2008 say that the bus manager may transmit a PHY configuration packet before a bus reset when correcting a gap count error. Since the standard endorses this, I decided it's safe to retain the PHY configuration packet transmission. Normally, after a topology change, we will reset the bus a maximum of 5 times to change the root node and perform gap count optimization. However, if there is a gap count inconsistency, we must always generate a bus reset. Otherwise the gap count inconsistency will persist and communication will be unreliable. For that reason, if there is a gap count inconstency, we generate a bus reset even if we already reached the 5 reset limit. Signed-off-by: Adam Goldman <adamg@pobox.com> Reference: https://sourceforge.net/p/linux1394/mailman/message/58727806/ Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-02-07Revert "parisc: Only list existing CPUs in cpu_possible_mask"Helge Deller
This reverts commit 0921244f6f4f0d05698b953fe632a99b38907226. It broke CPU hotplugging because it modifies the __cpu_possible_mask after bootup, so that it will be different than nr_cpu_ids, which then effictively breaks the workqueue setup code and triggers crashes when shutting down CPUs at runtime. Guenter was the first who noticed the wrong values in __cpu_possible_mask, since the cpumask Kunit tests were failig. Reverting this commit fixes both issues, but sadly brings back this uncritical runtime warning: register_cpu_capacity_sysctl: too early to get CPU4 device! Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lkml.org/lkml/2024/2/4/146 Link: https://lore.kernel.org/lkml/Zb0mbHlIud_bqftx@slm.duckdns.org/t/ Cc: stable@vger.kernel.org # 6.0+