summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-06-17net: mana: explain irq_setup() algorithmYury Norov
Commit 91bfe210e196 ("net: mana: add a function to spread IRQs per CPUs") added the irq_setup() function that distributes IRQs on CPUs according to a tricky heuristic. The corresponding commit message explains the heuristic. Duplicate it in the source code to make available for readers without digging git in history. Also, add more detailed explanation about how the heuristics is implemented. Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
2025-06-17PCI: hv: Allow dynamic MSI-X vector allocationShradha Gupta
Allow dynamic MSI-X vector allocation for pci_hyperv PCI controller by adding support for the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN and using pci_msix_prepare_desc() to prepare the MSI-X descriptors. Feature support added for both x86 and ARM64 Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-06-17PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocationsShradha Gupta
For supporting dynamic MSI-X vector allocation by PCI controllers, enabling the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN is not enough, msix_prepare_msi_desc() to prepare the MSI descriptor is also needed. Export pci_msix_prepare_desc() to allow PCI controllers to support dynamic MSI-X vector allocation. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-06-17RISC-V: KVM: Don't treat SBI HFENCE calls as NOPsAnup Patel
The SBI specification clearly states that SBI HFENCE calls should return SBI_ERR_NOT_SUPPORTED when one of the target hart doesn’t support hypervisor extension (aka nested virtualization in-case of KVM RISC-V). Fixes: c7fa3c48de86 ("RISC-V: KVM: Treat SBI HFENCE calls as NOPs") Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20250605061458.196003-3-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-06-17RISC-V: KVM: Fix the size parameter check in SBI SFENCE callsAnup Patel
As-per the SBI specification, an SBI remote fence operation applies to the entire address space if either: 1) start_addr and size are both 0 2) size is equal to 2^XLEN-1 >From the above, only #1 is checked by SBI SFENCE calls so fix the size parameter check in SBI SFENCE calls to cover #2 as well. Fixes: 13acfec2dbcc ("RISC-V: KVM: Add remote HFENCE functions based on VCPU requests") Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20250605061458.196003-2-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-06-16eth: gianfar: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). Uniquely, this driver supports only the SET operation. It does not support GET at all. The SET callback also always returns 0, even tho it checks a bunch of conditions, and if my quick reading is right, expects the user to insert filtering rules for given flow type first? Long story short it seems too convoluted to easily add the GET as part of the conversion. Link: https://patch.msgid.link/20250613172751.3754732-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16Merge branch 'net-phy-remove-phy_driver_is_genphy-and-phy_driver_is_genphy_10g'Jakub Kicinski
Heiner Kallweit says: ==================== net: phy: remove phy_driver_is_genphy and phy_driver_is_genphy_10g Replace phy_driver_is_genphy() and phy_driver_is_genphy_10g() with a new flag in struct phy_device. ==================== Link: https://patch.msgid.link/5778e86e-dd54-4388-b824-6132729ad481@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16net: phy: remove phy_driver_is_genphy_10gHeiner Kallweit
Remove now unused function phy_driver_is_genphy_10g(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/49b0589a-9604-4ee9-add5-28fbbbe2c2f3@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16net: phy: improve phy_driver_is_genphyHeiner Kallweit
Use new flag phydev->is_genphy_driven to simplify this function. Note that this includes a minor functional change: Now this function returns true if ANY of the genphy drivers is bound to the PHY device. We have only one user in DSA driver mt7530, and there the functional change doesn't matter. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/c9ac3a7d-262a-425d-9153-97fe3ca6280a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16net: phy: add flag is_genphy_driven to struct phy_deviceHeiner Kallweit
In order to get rid of phy_driver_is_genphy() and phy_driver_is_genphy_10g(), as first step add and use a flag phydev->is_genphy_driven. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/3f3ad6dc-402e-4915-8d5a-2306b6d5562b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16Merge branch 'eth-intel-migrate-to-new-rxfh-callbacks'Jakub Kicinski
Jakub Kicinski says: ==================== eth: intel: migrate to new RXFH callbacks Migrate Intel drivers to the recently added dedicated .get_rxfh_fields and .set_rxfh_fields ethtool callbacks. Note that I'm deleting all the boilerplate kdoc from the affected functions in the more recent drivers. If the maintainers feel strongly I can respin and add it back, but it really feels useless and undue burden for refactoring. No other vendor does this. v1: https://lore.kernel.org/20250613010111.3548291-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250614180907.4167714-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: iavf: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). I'm deleting all the boilerplate kdoc from the affected functions. It is somewhere between pointless and incorrect, just a burden for people refactoring the code. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: ice: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). I'm deleting all the boilerplate kdoc from the affected functions. It is somewhere between pointless and incorrect, just a burden for people refactoring the code. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: i40e: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). I'm deleting all the boilerplate kdoc from the affected functions. It is somewhere between pointless and incorrect, just a burden for people refactoring the code. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: fm10k: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). .get callback moves out of the switch and set_rxnfc disappears as ETHTOOL_SRXFH as the only functionality. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: ixgbe: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: igc: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: igb: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250614180907.4167714-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16Merge branch 'eth-migrate-to-new-rxfh-callbacks-get-only-drivers'Jakub Kicinski
Jakub Kicinski says: ==================== eth: migrate to new RXFH callbacks (get-only drivers) Migrate the drivers which only implement ETHTOOL_GRXFH to the recently added dedicated .get_rxfh_fields ethtool callback. v1: https://lore.kernel.org/20250613005409.3544529-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250614180638.4166766-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: enetc: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). This driver's RXFH config is read only / fixed so the conversion is trivial. Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250614180638.4166766-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: e1000e: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). This driver's RXFH config is read only / fixed and it's the only get_rxnfc sub-command the driver supports. So convert the get_rxnfc handler into a get_rxfh_fields handler. Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250614180638.4166766-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: lan743x: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). This driver's RXFH config is read only / fixed so the conversion is purely factoring out the handling into a helper. Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20250614180638.4166766-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: cxgb4: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). This driver's RXFH config is read only / fixed so the conversion is purely factoring out the handling into a helper. Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20250614180638.4166766-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16eth: cisco: migrate to new RXFH callbacksJakub Kicinski
Migrate to new callbacks added by commit 9bb00786fc61 ("net: ethtool: add dedicated callbacks for getting and setting rxfh fields"). This driver's RXFH config is read only / fixed so the conversion is trivial. Reviewed-by: Joe Damato <joe@dama.to> Link: https://patch.msgid.link/20250614180638.4166766-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16Merge branch 'cn20k-silicon-with-mbox-support'Jakub Kicinski
Subbaraya Sundeep says: ==================== CN20K silicon with mbox support CN20K is the next generation silicon in the Octeon series with various improvements and new features. Along with other changes the mailbox communication mechanism between RVU (Resource virtualization Unit) SRIOV PFs/VFs with Admin function (AF) has also gone through some changes. Some of those changes are - Separate IRQs for mbox request and response/ack. - Configurable mbox size, default being 64KB. - Ability for VFs to communicate with RVU AF instead of going through parent SRIOV PF. Due to more memory requirement due to configurable mbox size, mbox memory will now have to be allocated by - AF (PF0) for communicating with other PFs and all VFs in the system. - PF for communicating with it's child VFs. On previous silicons mbox memory was reserved and configured by firmware. This patch series add basic mbox support for AF (PF0) <=> PFs and PF <=> VFs. AF <=> VFs communication and variable mbox size support will come in later. Patch #1 Supported co-existance of bit encoding PFs and VFs in 16-bit hardware pcifunc format between CN20K silicon and older octeon series. Also exported PF,VF masks and shifts present in mailbox module to all other modules. Patch #2 Added basic mbox operation APIs and structures to support both CN20K and previous version of silicons. Patch #3 This patch adds support for basic mbox infrastructure implementation for CN20K silicon in AF perspective. There are few updates w.r.t MBOX ACK interrupt and offsets in CN20k. Patch #4 Added mbox implementation between NIC PF and AF for CN20K. Patch #5 Added mbox communication support between AF and AF's VFs. Patch #6 This patch adds support for MBOX communication between NIC PF and its VFs. ==================== Link: https://patch.msgid.link/1749639716-13868-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2-pf: CN20K mbox implementation between PF-VFSai Krishna
This patch implements the CN20k MBOX communication between PF and it's VFs. CN20K silicon got extra interrupt of MBOX response for trigger interrupt. Also few of the CSR offsets got changed in CN20K against prior series of silicons. Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1749639716-13868-7-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2-af: CN20K mbox implementation for AF's VFSai Krishna
This patch implements the CN20k MBOX communication between AF and AF's VFs. This implementation uses separate trigger interrupts for request, response messages against using trigger message data in CN10K. Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1749639716-13868-6-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2-pf: CN20K mbox REQ/ACK implementation for NIC PFSai Krishna
This implementation uses separate trigger interrupts for request, response messages against using trigger message data in CN10K. This patch adds support for basic mbox implementation for CN20K from NIC PF side. Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1749639716-13868-5-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2-af: CN20k mbox to support AF REQ/ACK functionalitySai Krishna
This implementation uses separate trigger interrupts for request, response MBOX messages against using trigger message data in CN10K. This patch adds support for basic mbox implementation for CN20K from AF side. Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1749639716-13868-4-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2-af: CN20k basic mbox operations and structuresSai Krishna
This patch adds basic mbox operation APIs and structures to add support for mbox module on CN20k silicon. There are few CSR offsets, interrupts changed between CN20k and prior Octeon series of devices. Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/1749639716-13868-3-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16octeontx2: Set appropriate PF, VF masks and shifts based on siliconSubbaraya Sundeep
Number of RVU PFs on CN20K silicon have increased to 96 from maximum of 32 that were supported on earlier silicons. Every RVU PF and VF is identified by HW using a 16bit PF_FUNC value. Due to the change in Max number of PFs in CN20K, the bit encoding of this PF_FUNC has changed. This patch handles the change by using helper functions(using silicon check) to use PF,VF masks and shifts to support both new silicon CN20K, OcteonTx series. These helper functions are used in different modules. Also moved the NIX AF register offset macros to other files which will be posted in coming patches. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Link: https://patch.msgid.link/1749639716-13868-2-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16bcachefs: Fix bch2_read_bio_to_text()Kent Overstreet
We can only pass negative error codes to bch2_err_str(); if it's a positive integer it's not an error and we trip an assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16drm/etnaviv: Protect the scheduler's pending list with its lockMaíra Canal
Commit 704d3d60fec4 ("drm/etnaviv: don't block scheduler when GPU is still active") ensured that active jobs are returned to the pending list when extending the timeout. However, it didn't use the pending list's lock to manipulate the list, which causes a race condition as the scheduler's workqueues are running. Hold the lock while manipulating the scheduler's pending list to prevent a race. Cc: stable@vger.kernel.org Fixes: 704d3d60fec4 ("drm/etnaviv: don't block scheduler when GPU is still active") Reported-by: Philipp Stanner <phasta@kernel.org> Closes: https://lore.kernel.org/dri-devel/964e59ba1539083ef29b06d3c78f5e2e9b138ab8.camel@mailbox.org/ Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250602132240.93314-1-mcanal@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-06-16bcachefs: fsck: Fix check_path_loop() + snapshotsKent Overstreet
A path exists in a particular snapshot: we should do the pathwalk in the snapshot ID of the inode we started from, _not_ change snapshot ID as we walk inodes and dirents. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: check_subdir_count logs pathKent Overstreet
We can easily go from inode number -> path now, which makes for more useful log messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: additional diagnostics for reattach_inode()Kent Overstreet
Log the inode's new path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: check_directory_structure runs in reverse orderKent Overstreet
When we find a directory connectivity problem, we should do the repair in the oldest snapshot that has the issue - so that we don't end up duplicating work or making a real mess of things. Oldest snapshot IDs have the highest integer value, so - just walk inodes in reverse order. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: Fix reattach_inode() for subvol rootsKent Overstreet
bch_subvolume.fs_path_parent needs to be updated as well, it should match inode.bi_parent_subvol. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: Fix remove_backpointer() for subvol rootsKent Overstreet
The dirent will be in a different snapshot if the inode is a subvolume root. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: Print path when we find a subvol loopKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: Fix __bch2_inum_to_path() when crossing subvol boundariesKent Overstreet
The bch2_subvolume_get_snapshot() call needs to happen before the dirent lookup - the dirent is in the parent subvolume. Also, check for loops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: Call bch2_fs_init_rw() early if we'll be going rwKent Overstreet
kthread creation checks for pending signals, which is _very_ annoying if we have to do a long recovery and don't go rw until we've done significant work. Check if we'll be going rw and pre-allocate kthreads/workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: fsck: Improve check_key_has_inode()Kent Overstreet
Print out more info when we find a key (extent, dirent, xattr) for a missing inode - was there a good inode in an older snapshot, full(ish) list of keys for that missing inode, so we can make better decisions on how to repair. If it looks like it should've been deleted, autofix it. If we ever hit the non-autofix cases, we'll want to write more repair code (possibly reconstituting the inode). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: don't return fsck_fix for unfixable node errors in __btree_errBharadwaj Raju
After cd3cdb1ef706 ("Single err message for btree node reads"), all errors caused __btree_err to return -BCH_ERR_fsck_fix no matter what the actual error type was if the recovery pass was scanning for btree nodes. This lead to the code continuing despite things like bad node formats when they earlier would have caused a jump to fsck_err, because btree_err only jumps when the return from __btree_err does not match fsck_fix. Ultimately this lead to undefined behavior by attempting to unpack a key based on an invalid format. Make only errors of type -BCH_ERR_btree_node_read_err_fixable cause __btree_err to return -BCH_ERR_fsck_fix when scanning for btree nodes. Reported-by: syzbot+cfd994b9cdf00446fd54@syzkaller.appspotmail.com Fixes: cd3cdb1ef706 ("bcachefs: Single err message for btree node reads") Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: Fix pool->alloc NULL pointer dereferenceAlan Huang
btree_interior_update_pool has not been initialized before the filesystem becomes read-write, thus mempool_alloc in bch2_btree_update_start will trigger pool->alloc NULL pointer dereference in mempool_alloc_noprof Reported-by: syzbot+2f3859bd28f20fa682e6@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: Move bset size check before csum checkAlan Huang
In syzbot's crash, the bset's u64s is larger than the btree node. Reported-by: syzbot+bfaeaa8e26281970158d@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: mark more errors autofixKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: Kill unused tracepointsKent Overstreet
Dead code cleanup. Link: https://lore.kernel.org/linux-bcachefs/20250612224059.39fddd07@batman.local.home/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16bcachefs: opts.journal_rewindKent Overstreet
Add a mount option for rewinding the journal, bringing the entire filesystem to where it was at a previous point in time. This is for extreme disaster recovery scenarios - it's not intended as an undelete operation. The option takes a journal sequence number; the desired sequence number can be determined with 'bcachefs list_journal' Caveats: - The 'journal_transaction_names' option must have been enabled (it's on by default). The option controls emitting of extra debug info in the journal, so we can see what individual transactions were doing; It also enables journalling of keys being overwritten, which is what we rely on here. - A full fsck run will be automatically triggered since alloc info will be inconsistent. Only leaf node updates to non-alloc btrees are rewound, since rewinding interior btree updates isn't possible or desirable. - We can't do anything about data that was deleted and overwritten. Lots of metadata updates after the point in time we're rewinding to shouldn't cause a problem, since we segragate data and metadata allocations (this is in order to make repair by btree node scan practical on larger filesystems; there's a small 64-bit per device bitmap in the superblock of device ranges with btree nodes, and we try to keep this small). However, having discards enabled will cause problems, since buckets are discarded as soon as they become empty (this is why we don't implement fstrim: we don't need it). Hopefully, this feature will be a one-off thing that's never used again: this was implemented for recovering from the "vfs i_nlink 0 -> subvol deletion" bug, and that bug was unusually disastrous and additional safeguards have since been implemented. But if it does turn out that we need this more in the future, I'll have to implement an option so that empty buckets aren't discarded immediately - lagging by perhaps 1% of device capacity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16selinux: fix selinux_xfrm_alloc_user() to set correct ctx_lenStephen Smalley
We should count the terminating NUL byte as part of the ctx_len. Otherwise, UBSAN logs a warning: UBSAN: array-index-out-of-bounds in security/selinux/xfrm.c:99:14 index 60 is out of range for type 'char [*]' The allocation itself is correct so there is no actual out of bounds indexing, just a warning. Cc: stable@vger.kernel.org Suggested-by: Christian Göttsche <cgzones@googlemail.com> Link: https://lore.kernel.org/selinux/CAEjxPJ6tA5+LxsGfOJokzdPeRomBHjKLBVR6zbrg+_w3ZZbM3A@mail.gmail.com/ Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>