summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-13Merge branch 'x86/fpu' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/cpu' into x86/core, to resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/cpu/bugs.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/boot' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/bugs' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/asm' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/alternatives' into x86/core, to merge dependent commitsIngo Molnar
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'address-eee-regressions-on-ksz-switches-since-v6-9-v6-14'Paolo Abeni
Oleksij Rempel says: ==================== address EEE regressions on KSZ switches since v6.9 (v6.14+) This patch series addresses a regression in Energy Efficient Ethernet (EEE) handling for KSZ switches with integrated PHYs, introduced in kernel v6.9 by commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration"). The first patch updates the DSA driver to allow phylink to properly manage PHY EEE configuration. Since integrated PHYs handle LPI internally and ports without integrated PHYs do not document MAC-level LPI support, dummy MAC LPI callbacks are provided. The second patch removes outdated EEE workarounds from the micrel PHY driver, as they are no longer needed with correct phylink handling. This series addresses the regression for mainline and kernels starting from v6.14. It is not easily possible to fully fix older kernels due to missing infrastructure changes. Tested on KSZ9893 hardware. ==================== Link: https://patch.msgid.link/20250504081434.424489-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13net: phy: micrel: remove KSZ9477 EEE quirks now handled by phylinkOleksij Rempel
The KSZ9477 PHY driver contained workarounds for broken EEE capability advertisements by manually masking supported EEE modes and forcibly disabling EEE if MICREL_NO_EEE was set. With proper MAC-side EEE handling implemented via phylink, these quirks are no longer necessary. Remove MICREL_NO_EEE handling and the use of ksz9477_get_features(). This simplifies the PHY driver and avoids duplicated EEE management logic. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: stable@vger.kernel.org # v6.14+ Link: https://patch.msgid.link/20250504081434.424489-3-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13net: dsa: microchip: let phylink manage PHY EEE configuration on KSZ switchesOleksij Rempel
Phylink expects MAC drivers to provide LPI callbacks to properly manage Energy Efficient Ethernet (EEE) configuration. On KSZ switches with integrated PHYs, LPI is internally handled by hardware, while ports without integrated PHYs have no documented MAC-level LPI support. Provide dummy mac_disable_tx_lpi() and mac_enable_tx_lpi() callbacks to satisfy phylink requirements. Also, set default EEE capabilities during phylink initialization where applicable. Since phylink can now gracefully handle optional EEE configuration, remove the need for the MICREL_NO_EEE PHY flag. This change addresses issues caused by incomplete EEE refactoring introduced in commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration"). It is not easily possible to fix all older kernels, but this patch ensures proper behavior on latest kernels and can be considered for backporting to stable kernels starting from v6.14. Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: stable@vger.kernel.org # v6.14+ Link: https://patch.msgid.link/20250504081434.424489-2-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13genirq: Consistently use '%u' format specifier for unsigned int variablesAndy Shevchenko
There are three cases in the genirq code when the irq, as an unsigned integer variable, is converted to text representation by sprintf(). In two cases it uses '%d' specifier which is for signed values. While it's not a problem right now, potentially it might be in the future in case too big (> INT_MAX) number will appear there. Consistently use '%u' format specifier for @irq which is declared as unsigned int in all these cases. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250509154643.1499171-1-andriy.shevchenko@linux.intel.com
2025-05-13genirq: Ensure flags in lock guard is consistently initializedNathan Chancellor
After the conversion to locking guards within the interrupt core code, several builds with clang show the "Interrupts were enabled early" WARN() in start_kernel() on boot. In class_irqdesc_lock_constructor(), _t.flags is initialized via __irq_get_desc_lock() within the _t initializer list. However, the C11 standard 6.7.9.23 states that the evaluation of the initialization list expressions are indeterminately sequenced relative to one another, meaning _t.flags could be initialized by __irq_get_desc_lock() then be initialized to zero due to flags being absent from the initializer list. To ensure _t.flags is consistently initialized, move the call to __irq_get_desc_lock() and the assignment of its result to _t.lock out of the designated initializer. Fixes: 0f70a49f3fa3 ("genirq: Provide conditional lock guards") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/all/20250513-irq-guards-fix-flags-init-v1-1-1dca3f5992d6@kernel.org
2025-05-13Documentation: fix typo in root= kernel parameter descriptionPetr Vaněk
Fixes a typo in the root= parameter description, changing "this a a" to "this is a". Fixes: c0c1a7dcb6f5 ("init: move the nfs/cifs/ram special cases out of name_to_dev_t") Signed-off-by: Petr Vaněk <arkamar@atlas.cz> Link: https://lore.kernel.org/20250512110827.32530-1-arkamar@atlas.cz Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-13nvmet: pci-epf: remove NVMET_PCI_EPF_Q_IS_SQDamien Le Moal
The flag NVMET_PCI_EPF_Q_IS_SQ is set but never used. Remove it. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvmet: pci-epf: improve debug messageDamien Le Moal
Improve the debug message of nvmet_pci_epf_create_cq() to indicate if a completion queue IRQ is disabled. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvmet: pci-epf: cleanup nvmet_pci_epf_raise_irq()Damien Le Moal
There is no point in taking the controller irq_lock and calling nvmet_pci_epf_should_raise_irq() for a completion queue which does not have IRQ enabled (NVMET_PCI_EPF_Q_IRQ_ENABLED flag is not set). Move the test for the NVMET_PCI_EPF_Q_IRQ_ENABLED flag out of nvmet_pci_epf_should_raise_irq() to the top of nvmet_pci_epf_raise_irq() to return early when no IRQ should be raised. Also, use dev_err_ratelimited() to avoid a message storm under load when raising IRQs is failing. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvmet: pci-epf: do not fall back to using INTX if not supportedDamien Le Moal
Some endpoint PCIe controllers do not support raising legacy INTX interrupts. This support is indicated by the intx_capable field of struct pci_epc_features. Modify nvmet_pci_epf_raise_irq() to not automatically fallback to trying raising an INTX interrupt after an MSI or MSI-X error if the controller does not support INTX. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvmet: pci-epf: clear completion queue IRQ flag on deleteDamien Le Moal
The function nvmet_pci_epf_delete_cq() unconditionally calls nvmet_pci_epf_remove_irq_vector() even for completion queues that do not have interrupts enabled. Furthermore, for completion queues that do have IRQ enabled, deleting and re-creating the completion queue leaves the flag NVMET_PCI_EPF_Q_IRQ_ENABLED set, even if the completion queue is being re-created with IRQ disabled. Fix these issues by calling nvmet_pci_epf_remove_irq_vector() only if NVMET_PCI_EPF_Q_IRQ_ENABLED is set and make sure to always clear that flag. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisableKeith Busch
We need to lock this queue for that condition because the timeout work executes per-namespace and can poll the poll CQ. Reported-by: Hannes Reinecke <hare@kernel.org> Closes: https://lore.kernel.org/all/20240902130728.1999-1-hare@kernel.org/ Fixes: a0fa9647a54e ("NVMe: add blk polling support") Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Daniel Wagner <wagi@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13nvme-pci: make nvme_pci_npages_prp() __always_inlineKees Cook
The only reason nvme_pci_npages_prp() could be used as a compile-time known result in BUILD_BUG_ON() is because the compiler was always choosing to inline the function. Under special circumstances (sanitizer coverage functions disabled for __init functions on ARCH=um), the compiler decided to stop inlining it: drivers/nvme/host/pci.c: In function 'nvme_init': include/linux/compiler_types.h:557:45: error: call to '__compiletime_assert_678' declared with attribute error: BUILD_BUG_ON failed: nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS 557 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:538:25: note: in definition of macro '__compiletime_assert' 538 | prefix ## suffix(); \ | ^~~~~~ include/linux/compiler_types.h:557:9: note: in expansion of macro '_compiletime_assert' 557 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG' 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) | ^~~~~~~~~~~~~~~~ drivers/nvme/host/pci.c:3804:9: note: in expansion of macro 'BUILD_BUG_ON' 3804 | BUILD_BUG_ON(nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS); | ^~~~~~~~~~~~ Force it to be __always_inline to make sure it is always available for use with BUILD_BUG_ON(). Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505061846.12FMyRjj-lkp@intel.com/ Fixes: c372cdd1efdf ("nvme-pci: iod npages fits in s8") Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-05-13regulator: gpio: Use dev_err_probeNishanth Menon
During probe the gpio driver may not yet be available. Use dev_err_probe to provide just the pertinent log. Since dev_err_probe takes care of reporting the error value, drop the redundant ret variable while at it. Signed-off-by: Nishanth Menon <nm@ti.com> Link: https://patch.msgid.link/20250512185727.2907411-1-nm@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-12scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES bufferSteve Siwinski
The REPORT ZONES buffer size is currently limited by the HBA's maximum segment count to ensure the buffer can be mapped. However, the block layer further limits the number of iovec entries to 1024 when allocating a bio. To avoid allocation of buffers too large to be mapped, further restrict the maximum buffer size to BIO_MAX_INLINE_VECS. Replace the UIO_MAXIOV symbolic name with the more contextually appropriate BIO_MAX_INLINE_VECS. Fixes: b091ac616846 ("sd_zbc: Fix report zones buffer allocation") Cc: stable@vger.kernel.org Signed-off-by: Steve Siwinski <ssiwinski@atto.com> Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-05-12MAINTAINERS: add crc_kunit.c back to CRC LIBRARYEric Biggers
Restore lib/tests/crc_kunit.c to CRC LIBRARY following the rename in commit db6fe4d61ece ("lib: Move KUnit tests into tests/ subdirectory") which made it no longer match the file pattern lib/crc*. Reviewed-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20250511052151.420228-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-05-12net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENINGVladimir Oltean
It has been reported that when under a bridge with stp_state=1, the logs get spammed with this message: [ 251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port Further debugging shows the following info associated with packets: source_port=-1, switch_id=-1, vid=-1, vbid=1 In other words, they are data plane packets which are supposed to be decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly) refuses to do so, because no switch port is currently in BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively unexpected. The error goes away after the port progresses to BR_STATE_LEARNING in 15 seconds (the default forward_time of the bridge), because then, dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane packets with a plausible bridge port in a plausible STP state. Re-reading IEEE 802.1D-1990, I see the following: "4.4.2 Learning: (...) The Forwarding Process shall discard received frames." IEEE 802.1D-2004 further clarifies: "DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the DISCARDING port state. While those dot1dStpPortStates serve to distinguish reasons for discarding frames, the operation of the Forwarding and Learning processes is the same for all of them. (...) LISTENING represents a port that the spanning tree algorithm has selected to be part of the active topology (computing a Root Port or Designated Port role) but is temporarily discarding frames to guard against loops or incorrect learning." Well, this is not what the driver does - instead it sets mac[port].ingress = true. To get rid of the log spam, prevent unexpected data plane packets to be received by software by discarding them on ingress in the LISTENING state. In terms of blame attribution: the prints only date back to commit d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID"). However, the settings would permit a LISTENING port to forward to a FORWARDING port, and the standard suggests that's not OK. Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12net: Lock lower level devices when updating featuresCosmin Ratiu
__netdev_update_features() expects the netdevice to be ops-locked, but it gets called recursively on the lower level netdevices to sync their features, and nothing locks those. This commit fixes that, with the assumption that it shouldn't be possible for both higher-level and lover-level netdevices to require the instance lock, because that would lead to lock dependency warnings. Without this, playing with higher level (e.g. vxlan) netdevices on top of netdevices with instance locking enabled can run into issues: WARNING: CPU: 59 PID: 206496 at ./include/net/netdev_lock.h:17 netif_napi_add_weight_locked+0x753/0xa60 [...] Call Trace: <TASK> mlx5e_open_channel+0xc09/0x3740 [mlx5_core] mlx5e_open_channels+0x1f0/0x770 [mlx5_core] mlx5e_safe_switch_params+0x1b5/0x2e0 [mlx5_core] set_feature_lro+0x1c2/0x330 [mlx5_core] mlx5e_handle_feature+0xc8/0x140 [mlx5_core] mlx5e_set_features+0x233/0x2e0 [mlx5_core] __netdev_update_features+0x5be/0x1670 __netdev_update_features+0x71f/0x1670 dev_ethtool+0x21c5/0x4aa0 dev_ioctl+0x438/0xae0 sock_ioctl+0x2ba/0x690 __x64_sys_ioctl+0xa78/0x1700 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250509072850.2002821-1-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12net: cadence: macb: Fix a possible deadlock in macb_halt_tx.Mathieu Othacehe
There is a situation where after THALT is set high, TGO stays high as well. Because jiffies are never updated, as we are in a context with interrupts disabled, we never exit that loop and have a deadlock. That deadlock was noticed on a sama5d4 device that stayed locked for days. Use retries instead of jiffies so that the timeout really works and we do not have a deadlock anymore. Fixes: e86cd53afc590 ("net/macb: better manage tx errors") Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-12Merge tag 'sched_ext-for-6.15-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: "A little bit invasive for rc6 but they're important fixes, pass tests fine and won't break anything outside sched_ext: - scx_bpf_cpuperf_set() calls internal functions that require the rq to be locked. It assumed that the BPF caller has rq locked but that's not always true. Fix it by tracking whether rq is currently held by the CPU and grabbing it if necessary - bpf_iter_scx_dsq_new() was leaving the DSQ iterator in an uninitialized state after an error. However, next() and destroy() can be called on an iterator which failed initialization and thus they always need to be initialized even after an init error. Fix by always initializing the iterator - Remove duplicate BTF_ID_FLAGS() entries" * tag 'sched_ext-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator sched_ext: Fix rq lock state in hotplug ops sched_ext: Remove duplicate BTF_ID_FLAGS definitions sched_ext: Fix missing rq lock in scx_bpf_cpuperf_set() sched_ext: Track currently locked rq
2025-05-12Merge tag 'cgroup-for-6.15-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "One low-risk patch to fix a cpuset bug where it over-eagerly tries to modify CPU affinity of kernel threads" * tag 'cgroup-for-6.15-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks
2025-05-12btrfs: add back warning for mount option commit values exceeding 300Kyoji Ogasawara
The Btrfs documentation states that if the commit value is greater than 300 a warning should be issued. The warning was accidentally lost in the new mount API update. Fixes: 6941823cc878 ("btrfs: remove old mount API code") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12btrfs: fix folio leak in submit_one_async_extent()Boris Burkov
If btrfs_reserve_extent() fails while submitting an async_extent for a compressed write, then we fail to call free_async_extent_pages() on the async_extent and leak its folios. A likely cause for such a failure would be btrfs_reserve_extent() failing to find a large enough contiguous free extent for the compressed extent. I was able to reproduce this by: 1. mount with compress-force=zstd:3 2. fallocating most of a filesystem to a big file 3. fragmenting the remaining free space 4. trying to copy in a file which zstd would generate large compressed extents for (vmlinux worked well for this) Step 4. hits the memory leak and can be repeated ad nauseam to eventually exhaust the system memory. Fix this by detecting the case where we fallback to uncompressed submission for a compressed async_extent and ensuring that we call free_async_extent_pages(). Fixes: 131a821a243f ("btrfs: fallback if compressed IO fails for ENOSPC") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Co-developed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12btrfs: fix discard worker infinite loop after disabling discardFilipe Manana
If the discard worker is running and there's currently only one block group, that block group is a data block group, it's in the unused block groups discard list and is being used (it got an extent allocated from it after becoming unused), the worker can end up in an infinite loop if a transaction abort happens or the async discard is disabled (during remount or unmount for example). This happens like this: 1) Task A, the discard worker, is at peek_discard_list() and find_next_block_group() returns block group X; 2) Block group X is in the unused block groups discard list (its discard index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past it become an unused block group and was added to that list, but then later it got an extent allocated from it, so its ->used counter is not zero anymore; 3) The current transaction is aborted by task B and we end up at __btrfs_handle_fs_error() in the transaction abort path, where we call btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode (setting SB_RDONLY in the super block's s_flags field); 4) Task A calls __add_to_discard_list() with the goal of moving the block group from the unused block groups discard list into another discard list, but at __add_to_discard_list() we end up doing nothing because btrfs_run_discard_work() returns false, since the super block has SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set anymore in fs_info->flags. So block group X remains in the unused block groups discard list; 5) Task A then does a goto into the 'again' label, calls find_next_block_group() again we gets block group X again. Then it repeats the previous steps over and over since there are not other block groups in the discard lists and block group X is never moved out of the unused block groups discard list since btrfs_run_discard_work() keeps returning false and therefore __add_to_discard_list() doesn't move block group X out of that discard list. When this happens we can get a soft lockup report like this: [71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97] [71.957] Modules linked in: xfs af_packet rfkill (...) [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.957] Tainted: [W]=WARN [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [71.957] Code: c1 01 48 83 (...) [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [71.957] FS: 0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000 [71.957] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0 [71.957] PKRU: 55555554 [71.957] Call Trace: [71.957] <TASK> [71.957] process_one_work+0x17e/0x330 [71.957] worker_thread+0x2ce/0x3f0 [71.957] ? __pfx_worker_thread+0x10/0x10 [71.957] kthread+0xef/0x220 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork+0x34/0x50 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork_asm+0x1a/0x30 [71.957] </TASK> [71.957] Kernel panic - not syncing: softlockup: hung tasks [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W L 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.992] Call Trace: [71.993] <IRQ> [71.994] dump_stack_lvl+0x5a/0x80 [71.994] panic+0x10b/0x2da [71.995] watchdog_timer_fn.cold+0x9a/0xa1 [71.996] ? __pfx_watchdog_timer_fn+0x10/0x10 [71.997] __hrtimer_run_queues+0x132/0x2a0 [71.997] hrtimer_interrupt+0xff/0x230 [71.998] __sysvec_apic_timer_interrupt+0x55/0x100 [71.999] sysvec_apic_timer_interrupt+0x6c/0x90 [72.000] </IRQ> [72.000] <TASK> [72.001] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [72.002] Code: c1 01 48 83 (...) [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [72.010] ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f] [72.011] process_one_work+0x17e/0x330 [72.012] worker_thread+0x2ce/0x3f0 [72.013] ? __pfx_worker_thread+0x10/0x10 [72.014] kthread+0xef/0x220 [72.014] ? __pfx_kthread+0x10/0x10 [72.015] ret_from_fork+0x34/0x50 [72.015] ? __pfx_kthread+0x10/0x10 [72.016] ret_from_fork_asm+0x1a/0x30 [72.017] </TASK> [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [72.019] Rebooting in 90 seconds.. So fix this by making sure we move a block group out of the unused block groups discard list when calling __add_to_discard_list(). Fixes: 2bee7eb8bb81 ("btrfs: discard one region at a time in async discard") Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012 CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Daniel Vacek <neelx@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-12Merge tag 'platform-drivers-x86-v6.15-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: - amd/pmc: Use spurious 8042 quirk with MECHREVO Wujie 14XA - amd/pmf: - Ensure Smart PC policies are valid - Fix memory leak when the engine fails to start - amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers - asus-wmi: Fix wlan_ctrl_by_user detection - thinkpad_acpi: Add support for NEC Lavie X1475JAS * tag 'platform-drivers-x86-v6.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL) platform/x86: thinkpad_acpi: Support also NEC Lavie X1475JAS platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies
2025-05-12Merge tag 'udf_for_v6.15-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF fix from Jan Kara: "Fix a bug in UDF inode eviction leading to spewing pointless error messages" * tag 'udf_for_v6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Make sure i_lenExtents is uptodate on inode eviction
2025-05-12tracing: samples: Initialize trace_array_printk() with the correct functionSteven Rostedt
When using trace_array_printk() on a created instance, the correct function to use to initialize it is: trace_array_init_printk() Not trace_printk_init_buffer() The former is a proper function to use, the latter is for initializing trace_printk() and causes the NOTICE banner to be displayed. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Divya Indi <divya.indi@oracle.com> Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.") Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-05-12Merge tag 'vfs-6.15-rc7.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Ensure that simple_xattr_list() always includes security.* xattrs - Fix eventpoll busy loop optimization when combined with timeouts - Disable swapon() for devices with block sizes greater than page sizes - Don't call errseq_set() twice during mark_buffer_write_io_error(). Just use mapping_set_error() which takes care to not deference unconditionally * tag 'vfs-6.15-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: Remove redundant errseq_set call in mark_buffer_write_io_error. swapfile: disable swapon for bs > ps devices fs/eventpoll: fix endless busy loop after timeout has expired fs/xattr.c: fix simple_xattr_list to always include security.* xattrs
2025-05-12MAINTAINERS: add me as maintainer for the gpio sloppy logic analyzerWolfram Sang
This was forgotten when the analyzer went upstream. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20250424212234.5313-2-wsa+renesas@sang-engineering.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-05-12io_uring/memmap: don't use page_address() on a highmem pageJens Axboe
For older/32-bit systems with highmem, don't assume that the pages in a mapped region are always going to be mapped. If io_region_init_ptr() finds that the pages are coalescable, also check if the first page is a HighMem page or not. If it is, fall through to the usual vmap() mapping rather than attempt to get the unmapped page address. Cc: stable@vger.kernel.org Fixes: c4d0ac1c1567 ("io_uring/memmap: optimise single folio regions") Link: https://lore.kernel.org/all/681fe2fb.050a0220.f2294.001a.GAE@google.com/ Reported-by: syzbot+5b8c4abafcb1d791ccfc@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/681fed0a.050a0220.f2294.001c.GAE@google.com/ Reported-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com Tested-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-12io_uring: drain based on allocates reqsPavel Begunkov
Don't rely on CQ sequence numbers for draining, as it has become messy and needs cq_extra adjustments. Instead, base it on the number of allocated requests and only allow flushing when all requests are in the drain list. As a result, cq_extra is gone, no overhead for its accounting in aux cqe posting, less bloating as it was inlined before, and it's in general simpler than trying to track where we should bump it and where it should be put back like in cases of overflow. Also, it'll likely help with cleaning and unifying some of the CQ posting helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/46ece1e34320b046c06fee2498d6b4cd12a700f2.1746788718.git.asml.silence@gmail.com Link: https://lore.kernel.org/r/24497b04b004bceada496033d3c9d09ff8e81ae9.1746944903.git.asml.silence@gmail.com [axboe: fold in fix from link2] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-12ACPI: tables: Improve logging around acpi_initialize_tables()Bartosz Szczepanek
Emit a warning that includes return code in a readable format. Example: ACPI: Failed to initialize tables, status=0x5 (AE_NOT_FOUND) No other functional changes intended. Signed-off-by: Bartosz Szczepanek <bsz@amazon.de> Link: https://patch.msgid.link/20250423085637.38658-1-bsz@amazon.de [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12ACPI: VIOT: Remove (explicitly) unused headerAndy Shevchenko
The fwnode.h is not supposed to be used by the drivers as it has the definitions for the core parts for different device property provider implementations. Drop it. Note, that fwnode API for drivers is provided in property.h which is included here. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://patch.msgid.link/20250331072311.3987967-1-andriy.shevchenko@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12ACPI: Add documentation for exposing MRRM dataTony Luck
Initial implementation provides enumeration of the address ranges NUMA node numbers, and BIOS assigned region IDs for each range. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250505173819.419271-4-tony.luck@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12ACPI: MRRM: Add /sys files to describe memory rangesTony Luck
Perf and resctrl users need an enumeration of which memory addresses are bound to which "region" tag. Parse the ACPI MRRM table and add /sys entries for each memory range describing base address, length, NUMA node, and which region tags apply for same-socket and cross-socket access. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250505173819.419271-3-tony.luck@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12ACPI: MRRM: Minimal parse of ACPI MRRM tableTony Luck
The resctrl file system code needs to know how many region tags are supported. Parse the ACPI MRRM table and save the max_mem_region value. Provide a function for resctrl to collect that value. Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250505173819.419271-2-tony.luck@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12Merge ACPICA material for 6.16 to satisfy dependenciesRafael J. Wysocki
* acpica: (30 commits) ACPICA: Update copyright year ACPICA: Logfile: Changes for version 20250404 ACPICA: Replace strncpy() with memcpy() ACPICA: Apply ACPI_NONSTRING in more places ACPICA: Avoid sequence overread in call to strncmp() ACPICA: Adjust the position of code lines ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table ACPICA: Apply ACPI_NONSTRING ACPICA: Introduce ACPI_NONSTRING ACPICA: actbl2.h: ERDT: Add typedef and other definitions ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name ACPICA: Utilities: Fix spelling mistake "Incremement" -> "Increment" ACPICA: MRRM: Some cleanups ACPICA: actbl2: Add definitions for RIMT ACPICA: actbl2.h: MRRM: Add typedef and other definitions ACPICA: infrastructure: Add new header and ACPI_DMT_BUF26 types ACPICA: Interpret SIDP structures in DMAR ACPICA: utilities: Fix overflow check in vsnprintf() ACPICA: Apply pack(1) to union aml_resource ACPICA: Drop stale comment about the header file content ...
2025-05-12ACPICA: Update copyright yearSaket Dumbre
ACPICA commit 45253be18b3f37d46cd0072aa3f8a0a21a70e0a4 Changes needed by acpisrc to update copyright year when building for release. Link: https://github.com/acpica/acpica/commit/45253be1 Signed-off-by: Saket Dumbre <saket.dumbre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-05-12ACPICA: Logfile: Changes for version 20250404Saket Dumbre
ACPICA commit 52de9040740c562a35244de19d21c0313964c874 Link: https://github.com/acpica/acpica/commit/52de9040 Signed-off-by: Saket Dumbre <saket.dumbre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2251077.Icojqenx9y@rjwysocki.net
2025-05-12ACPICA: Replace strncpy() with memcpy()Ahmed Salem
ACPICA commit 83019b471e1902151e67c588014ba2d09fa099a3 strncpy() is deprecated for NUL-terminated destination buffers[1]. Use memcpy() for length-bounded destinations. Link: https://github.com/KSPP/linux/issues/90 [1] Link: https://github.com/acpica/acpica/commit/83019b47 Signed-off-by: Ahmed Salem <x0rw3ll@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/1910878.atdPhlSkOF@rjwysocki.net
2025-05-12ACPICA: Apply ACPI_NONSTRING in more placesAhmed Salem
ACPICA commit 1035a3d453f7dd49a235a59ee84ebda9d2d2f41b Add ACPI_NONSTRING for destination char arrays without a terminating NUL character. This is a follow-up to commit 35ad99236f3a ("ACPICA: Apply ACPI_NONSTRING") where not all instances received the same treatment, in preparation for replacing strncpy() calls with memcpy() Link: https://github.com/acpica/acpica/commit/1035a3d4 Signed-off-by: Ahmed Salem <x0rw3ll@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3833065.MHq7AAxBmi@rjwysocki.net
2025-05-12ACPICA: Avoid sequence overread in call to strncmp()Ahmed Salem
ACPICA commit 8b83a8d88dfec59ea147fad35fc6deea8859c58c ap_get_table_length() checks if tables are valid by calling ap_is_valid_header(). The latter then calls ACPI_VALIDATE_RSDP_SIG(Table->Signature). ap_is_valid_header() accepts struct acpi_table_header as an argument, so the signature size is always fixed to 4 bytes. The problem is when the string comparison is between ACPI-defined table signature and ACPI_SIG_RSDP. Common ACPI table header specifies the Signature field to be 4 bytes long[1], with the exception of the RSDP structure whose signature is 8 bytes long "RSD PTR " (including the trailing blank character)[2]. Calling strncmp(sig, rsdp_sig, 8) would then result in a sequence overread[3] as sig would be smaller (4 bytes) than the specified bound (8 bytes). As a workaround, pass the bound conditionally based on the size of the signature being passed. Link: https://uefi.org/specs/ACPI/6.5_A/05_ACPI_Software_Programming_Model.html#system-description-table-header [1] Link: https://uefi.org/specs/ACPI/6.5_A/05_ACPI_Software_Programming_Model.html#root-system-description-pointer-rsdp-structure [2] Link: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wstringop-overread [3] Link: https://github.com/acpica/acpica/commit/8b83a8d8 Signed-off-by: Ahmed Salem <x0rw3ll@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2248233.Mh6RI2rZIc@rjwysocki.net
2025-05-12ACPICA: Adjust the position of code linesZhe Qiao
ACPICA commit 5da6daf5691169d2bf2e5c9e55baf093757312ca In the acpica/utcache.c file, adjust the position of the "ACPI_MEM_TRACKING(cache->total_allocated++);" code line to ensure that the increment operation on total_allocated is included within the ACPI_DBG_TRACK_ALLOCATIONS configuration. Link: https://github.com/acpica/acpica/commit/5da6daf5 Signed-off-by: Zhe Qiao <qiaozhe@iscas.ac.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2670567.Lt9SDvczpP@rjwysocki.net
2025-05-12ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the ↵Shiju Jose
RAS2 table ACPICA commit 2c8a38f747de9d977491a494faf0dfaf799b777b Rename the structure and field names of the RAS2 table to shorten them and avoid long lines in the ACPI RAS2 drivers. 1. struct acpi_ras2_shared_memory to struct acpi_ras2_shmem 2. In struct acpi_ras2_shared_memory: fields, - set_capabilities[16] to set_caps[16] - num_parameter_blocks to num_param_blks - set_capabilities_status to set_caps_status 3. struct acpi_ras2_patrol_scrub_parameter to struct acpi_ras2_patrol_scrub_param 4. In struct acpi_ras2_patrol_scrub_parameter: fields, - patrol_scrub_command to command - requested_address_range to req_addr_range - actual_address_range to actl_addr_range Link: https://github.com/acpica/acpica/commit/2c8a38f7 Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/1942053.CQOukoFCf9@rjwysocki.net