summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-30net: usb: aqc111: debug info before sanitationOliver Neukum
If we sanitize error returns, the debug statements need to come before that so that we don't lose information. Signed-off-by: Oliver Neukum <oneukum@suse.com> Fixes: 405b0d610745 ("net: usb: aqc111: fix error handling of usbnet read calls") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-30Merge tag 'md-6.16-20250530' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into block-6.16 Pull MD fixes from Yu: "- fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10 - fix max_write_behind setting for dm-raid - some minor cleanups" * tag 'md-6.16-20250530' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux: md/md-bitmap: remove parameter slot from bitmap_create() md/md-bitmap: cleanup bitmap_ops->startwrite() md/dm-raid: remove max_write_behind setting limit md/md-bitmap: fix dm-raid max_write_behind setting md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAIT
2025-05-30spi: bcm63xx-hsspi: fix shared resetÁlvaro Fernández Rojas
Some bmips SoCs (bcm6362, bcm63268) share the same SPI reset for both SPI and HSSPI controllers, so reset shouldn't be exclusive. Fixes: 0eeadddbf09a ("spi: bcm63xx-hsspi: add reset support") Reported-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250529130915.2519590-3-noltari@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-30spi: bcm63xx-spi: fix shared resetÁlvaro Fernández Rojas
Some bmips SoCs (bcm6362, bcm63268) share the same SPI reset for both SPI and HSSPI controllers, so reset shouldn't be exclusive. Fixes: 38807adeaf1e ("spi: bcm63xx-spi: add reset support") Reported-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250529130915.2519590-2-noltari@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-30KVM: arm64: vgic-debug: Avoid dereferencing NULL ITE pointerMarc Zyngier
Dan reports that iterating over a device ITEs can legitimately lead to a NULL pointer, and that the NULL check is placed *after* the pointer has already been dereferenced. Hoist the pointer check as early as possible and be done with it. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: 30deb51a677b ("KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables") Link: https://lore.kernel.org/r/aDBylI1YnjPatAbr@stanley.mountain Cc: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20250530091647.1152489-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: vgic-init: Plug vCPU vs. VGIC creation raceOliver Upton
syzkaller has found another ugly race in the VGIC, this time dealing with VGIC creation. Since kvm_vgic_create() doesn't sufficiently protect against in-flight vCPU creations, it is possible to get a vCPU into the kernel w/ an in-kernel VGIC but no allocation of private IRQs: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000d20 Mem abort info: ESR = 0x0000000096000046 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info: ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000 CM = 0, WnR = 1, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000103e4f000 [0000000000000d20] pgd=0800000102e1c403, p4d=0800000102e1c403, pud=0800000101146403, pmd=0000000000000000 Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP CPU: 9 UID: 0 PID: 246 Comm: test Not tainted 6.14.0-rc6-00097-g0c90821f5db8 #16 Hardware name: linux,dummy-virt (DT) pstate: 814020c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : _raw_spin_lock_irqsave+0x34/0x8c lr : kvm_vgic_set_owner+0x54/0xa4 sp : ffff80008086ba20 x29: ffff80008086ba20 x28: ffff0000c19b5640 x27: 0000000000000000 x26: 0000000000000000 x25: ffff0000c4879bd0 x24: 000000000000001e x23: 0000000000000000 x22: 0000000000000000 x21: ffff0000c487af80 x20: ffff0000c487af18 x19: 0000000000000000 x18: 0000001afadd5a8b x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000001 x14: ffff0000c19b56c0 x13: 0030c9adf9d9889e x12: ffffc263710e1908 x11: 0000001afb0d74f2 x10: e0966b840b373664 x9 : ec806bf7d6a57cd5 x8 : ffff80008086b980 x7 : 0000000000000001 x6 : 0000000000000001 x5 : 0000000080800054 x4 : 4ec4ec4ec4ec4ec5 x3 : 0000000000000000 x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000d20 Call trace: _raw_spin_lock_irqsave+0x34/0x8c (P) kvm_vgic_set_owner+0x54/0xa4 kvm_timer_enable+0xf4/0x274 kvm_arch_vcpu_run_pid_change+0xe0/0x380 kvm_vcpu_ioctl+0x93c/0x9e0 __arm64_sys_ioctl+0xb4/0xec invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x30/0xd0 el0t_64_sync_handler+0x10c/0x138 el0t_64_sync+0x198/0x19c Code: b9000841 d503201f 52800001 52800022 (88e17c02) ---[ end trace 0000000000000000 ]--- Plug the race by explicitly checking for an in-progress vCPU creation and failing kvm_vgic_create() when that's the case. Add some comments to document all the things kvm_vgic_create() is trying to guard against too. Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/r/20250523194722.4066715-6-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: Unmap vLPIs affected by changes to GSI routing informationOliver Upton
KVM's interrupt infrastructure is dodgy at best, allowing for some ugly 'off label' usage of the various UAPIs. In one example, userspace can change the routing entry of a particular "GSI" after configuring irqbypass with KVM_IRQFD. KVM/arm64 is oblivious to this, and winds up preserving the stale translation in cases where vLPIs are configured. Honor userspace's intentions and tear down the vLPI mapping if affected by a "GSI" routing change. Make no attempt to reconstruct vLPIs if the new target is an MSI and just fall back to software injection. Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250523194722.4066715-5-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: Resolve vLPI by host IRQ in vgic_v4_unset_forwarding()Oliver Upton
The virtual mapping and "GSI" routing of a particular vLPI is subject to change in response to the guest / userspace. This can be pretty annoying to deal with when KVM needs to track the physical state that's managed for vLPI direct injection. Make vgic_v4_unset_forwarding() resilient by using the host IRQ to resolve the vgic IRQ. Since this uses the LPI xarray directly, finding the ITS by doorbell address + grabbing it's its_lock is no longer necessary. Note that matching the right ITS / ITE is already handled in vgic_v4_set_forwarding(), and unless there's a bug in KVM's VGIC ITS emulation the virtual mapping that should remain stable for the lifetime of the vLPI mapping. Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250523194722.4066715-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: Protect vLPI translation with vgic_irq::irq_lockOliver Upton
Though undocumented, KVM generally protects the translation of a vLPI with the its_lock. While this makes perfectly good sense, as the ITS itself contains the guest translation, an upcoming change will require twiddling the vLPI mapping in an atomic context. Switch to using the vIRQ's irq_lock to protect the translation. Use of the its_lock in vgic_v4_unset_forwarding() is preserved for now as it still needs to walk the ITS. Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250523194722.4066715-3-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: Use lock guard in vgic_v4_set_forwarding()Oliver Upton
The locking dance is about to get more interesting, switch the its_lock over to a lock guard to make it a bit easier to handle. Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250523194722.4066715-2-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30KVM: arm64: Mask out non-VA bits from TLBI VA* on VNCR invalidationMarc Zyngier
When handling a TLBI VA* instruction that potentially targets a VNCR page mapping, we fail to mask out the top bits that contain the ASID and TTL fields, hence potentially failing the VA check in the TLB code. An additional wrinkle is that we fail to sign extend the VA, again leading to failed VA checks. Fix both in one go by sign-extending the VA from bit 48, making it comparable to the way we interpret VNCR_EL2.BADDR. Fixes: 4ffa72ad8f37e ("KVM: arm64: nv: Add S1 TLB invalidation primitive for VNCR_EL2") Link: https://lore.kernel.org/r/20250525175759.780891-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30arm64: sysreg: Drag linux/kconfig.h to work around vdso build issueMarc Zyngier
Broonie reports that fed55f49fad18 ("arm64: errata: Work around AmpereOne's erratum AC04_CPU_23") breaks one of the vdso selftests (vdso_test_chacha) as it indirectly drags asm/sysreg.h. It is rather unfortunate (and worrying) that userspace gets built with non-UAPI headers. In any case, paper over the issue by dragging linux/kconfig.h in asm/sysreg.h. It is the right thing to do, at least from the kernel perspective. Reported-by: Mark Brown <broonie@kernel.org> Fixes: fed55f49fad18 ("arm64: errata: Work around AmpereOne's erratum AC04_CPU_23") Link: https://lore.kernel.org/r/aDCDGZ-G-nCP3hJI@finisterre.sirena.org.uk Cc: D Scott Phillips <scott@os.amperecomputing.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250523170208.530818-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-30md/md-bitmap: remove parameter slot from bitmap_create()Yu Kuai
All callers pass in '-1' for 'slot', hence it can be removed. Link: https://lore.kernel.org/linux-raid/20250524061320.370630-6-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de>
2025-05-30md/md-bitmap: cleanup bitmap_ops->startwrite()Yu Kuai
bitmap_startwrite() always return 0, and the caller doesn't check return value as well, hence change the method to void. Also rename startwrite/endwrite to start_write/end_write, which is more in line with the usual naming convention. Link: https://lore.kernel.org/linux-raid/20250524061320.370630-4-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de>
2025-05-30md/dm-raid: remove max_write_behind setting limitYu Kuai
The comments said 'vaule in kB', while the value actually means the number of write_behind IOs. And since md-bitmap will automatically adjust the value to max COUNTER_MAX / 2, there is no need to fail early. Also move some macros that is only used md-bitmap.c. Link: https://lore.kernel.org/linux-raid/20250524061320.370630-15-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Xiao Ni <xni@redhat.com>
2025-05-30md/md-bitmap: fix dm-raid max_write_behind settingYu Kuai
It's supposed to be COUNTER_MAX / 2, not COUNTER_MAX. Link: https://lore.kernel.org/linux-raid/20250524061320.370630-14-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de>
2025-05-30md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAITYu Kuai
IO with REQ_RAHEAD or REQ_NOWAIT can fail early, even if the storage medium is fine, hence record badblocks or remove the disk from array does not make sense. This problem if found by lvm2 test lvcreate-large-raid, where dm-zero will fail read ahead IO directly. Fixes: e879a0d9cb08 ("md/raid1,raid10: don't ignore IO flags") Reported-and-tested-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/all/34fa755d-62c8-4588-8ee1-33cb1249bdf2@redhat.com/ Link: https://lore.kernel.org/linux-raid/20250527081407.3004055-1-yukuai1@huaweicloud.com Signed-off-by: Yu Kuai <yukuai3@huawei.com>
2025-05-30Merge tag 'renesas-dts-for-v6.16-tag5' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt Renesas DTS updates for v6.16 (take five) - Reduce I2C2 clock frequency on the RZ/G3E SMARC SoM. * tag 'renesas-dts-for-v6.16-tag5' of https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: arm64: dts: renesas: rzg3e-smarc-som: Reduce I2C2 clock frequency Link: https://lore.kernel.org/r/cover.1748355530.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-30MAINTAINERS, mailmap: update Sven Peter's email addressSven Peter
Update my mail address to my new @kernel.org one and also add a mailmap entry to make sure everything gets sent there for easier filtering. Signed-off-by: Sven Peter <sven@kernel.org> Link: https://lore.kernel.org/r/20250528221718.45204-1-sven@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-30exportfs: require ->fh_to_parent() to encode connectable file handlesAmir Goldstein
When user requests a connectable file handle explicitly with the AT_HANDLE_CONNECTABLE flag, fail the request if filesystem (e.g. nfs) does not know how to decode a connected non-dir dentry. Fixes: c374196b2b9f ("fs: name_to_handle_at() support for "explicit connectable" file handles") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/20250525104731.1461704-1-amir73il@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-30bcachefs: bch2_check_fix_ptrs() can now repair btree rootsKent Overstreet
This is straightforward enough: check_fix_ptrs() currently only runs before we go RW, so updating the btree root pointer in c->btree_roots suffices - it'll be written out in the first journal write we do. For that, do_bch2_trans_commit_to_journal_replay() now handles JSET_ENTRY_btree_root entries. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Include b->ob.nr in cached_btree_node_to_text()Kent Overstreet
We have a bug report that looks like we might be leaking open buckets - let's check if they got left attached to the cached btree node. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Move devs_sorted to alloc_requestKent Overstreet
More stack usage work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: reduce stack usage in alloc_sectors_start()Kent Overstreet
with typical config options, variables in different inline functions aren't sharing stack space - and these are slowpaths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: bch2_alloc_v4_to_text()Kent Overstreet
Specialize the .to_text() for alloc_v4, to avoid the temporary on the stack for conversion from old versions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Tweak bch2_data_update_init() for stack usageKent Overstreet
- Separate out a slowpath for bkey_nocow_lock() - Don't call bch2_bkey_ptrs_c() or loop over pointers more than necessary Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: kill replicas_sectors arg to __trigger_extent()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Don't stack allocate bch_writepage_stateKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: factor out break_cycle_fail()Kent Overstreet
More stack usage work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: btree_node_missing_err()Kent Overstreet
Factor out an error path for a small stack usage improvement. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Kill bkey_buf in btree_path_down()Kent Overstreet
Allocate some (smaller) temporary storage in btree_trans for this - btree_path_down() is in our max-stack call stack. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Add missing error logging in delete_dead_inodes()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Fix misaligned bucket check in journal space calculationsKent Overstreet
Fix an assertion pop in the tiering_misaligned test: rounding down to bucket size at the end of the journal space calculations leaves cur_entry_sectors == 0, which is incorrect with !cur_entry_err. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Fix incorrect multiple dev check in journal write pathKent Overstreet
It's uncomon to have multiple devices with journalling only on a subset, but can be specified with the 'data_allowed' option. We need to know if we're doing data/metadata writes to multiple devices, as that requires issuing flushes before the journal writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Catch data_update_done events in trace_io_move_start_failKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: io_move_evacuate_bucket tracepoint, counterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: trace_io_move_predKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Fix infinite loop in journal_entry_btree_keys_to_text()Kent Overstreet
Fix an infinite loop when bkey_i->k.u64s is 0. This only happens in userspace, where 'bcachefs list_journal' can print the entire contents of the journal, and non-dirty entries aren't validated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Journal read error message improvementsKent Overstreet
- Don't print a checksum error when we first read a journal entry: we print a checksum error later if we'll be using the journal entry. - Continuing with the theme of of improving error messages and grouping errors into a single log message per error, print a single 'checksum error' message per journal entry, and use bch2_journal_ptr_to_text() to print out where on the device it was. - Factor out checksum error messages and checking for missing journal entries into helpers, bch2_journal_read() has gotten obnoxiously big. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-29Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (smartpqi, ufs, lpfc, scsi_debug, target, hisi_sas) with the only substantive core change being the removal of the stream_status member from the scsi_stream_status_header (to get rid of flex array members)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (77 commits) scsi: target: core: Constify struct target_opcode_descriptor scsi: target: core: Constify enabled() in struct target_opcode_descriptor scsi: hisi_sas: Fix warning detected by sparse scsi: mpt3sas: Fix _ctl_get_mpt_mctp_passthru_adapter() to return IOC pointer scsi: sg: Remove unnecessary NULL check before unregister_sysctl_table() scsi: ufs: mcq: Delete ufshcd_release_scsi_cmd() in ufshcd_mcq_abort() scsi: ufs: qcom: dt-bindings: Document the SM8750 UFS Controller scsi: mvsas: Fix typos in SAS/SATA VSP register comments scsi: fnic: Replace memset() with eth_zero_addr() scsi: ufs: core: Support updating device command timeout scsi: ufs: core: Change hwq_id type and value scsi: ufs: core: Increase the UIC command timeout further scsi: zfcp: Simplify workqueue allocation scsi: ufs: core: Print error value as hex format in ufshcd_err_handler() scsi: sd: Remove the stream_status member from scsi_stream_status_header scsi: docs: Clean up some style in scsi_mid_low_api scsi: core: Remove unused scsi_dev_info_list_del_keyed() scsi: isci: Remove unused sci_remote_device_reset() scsi: scsi_debug: Reduce DEF_ATOMIC_WR_MAX_LENGTH scsi: smartpqi: Delete a stray tab in pqi_is_parity_write_stream() ...
2025-05-30Merge patch series "rust: file: mark `LocalFile` as `repr(transparent)`"Christian Brauner
Mark files as repr(transparent) to ensure identical layout between C and Rust. * patches from https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com: rust: file: improve safety comments rust: file: mark `LocalFile` as `repr(transparent)` Link: https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-30rust: file: improve safety commentsPekka Ristola
Some of the safety comments in `LocalFile`'s methods incorrectly refer to the `File` type instead of `LocalFile`, so fix them to use the correct type. Also add missing Markdown code spans around lifetimes in the safety comments, i.e. change 'a to `'a`. Link: https://github.com/Rust-for-Linux/linux/issues/1165 Signed-off-by: Pekka Ristola <pekkarr@protonmail.com> Link: https://lore.kernel.org/20250527204636.12573-2-pekkarr@protonmail.com Reviewed-by: Benno Lossin <lossin@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-30rust: file: mark `LocalFile` as `repr(transparent)`Pekka Ristola
Unsafe code in `LocalFile`'s methods assumes that the type has the same layout as the inner `bindings::file`. This is not guaranteed by the default struct representation in Rust, but requires specifying the `transparent` representation. The `File` struct (which also wraps `bindings::file`) is already marked as `repr(transparent)`, so this change makes their layouts equivalent. Fixes: 851849824bb5 ("rust: file: add Rust abstraction for `struct file`") Closes: https://github.com/Rust-for-Linux/linux/issues/1165 Signed-off-by: Pekka Ristola <pekkarr@protonmail.com> Link: https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com Reviewed-by: Benno Lossin <lossin@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-30fs/dax: Fix "don't skip locked entries when scanning entries"Alistair Popple
Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries") introduced a new function, wait_entry_unlocked_exclusive(), which waits for the current entry to become unlocked without advancing the XArray iterator state. Waiting for the entry to become unlocked requires dropping the XArray lock. This requires calling xas_pause() prior to dropping the lock which leaves the xas in a suitable state for the next iteration. However this has the side-effect of advancing the xas state to the next index. Normally this isn't an issue because xas_for_each() contains code to detect this state and thus avoid advancing the index a second time on the next loop iteration. However both callers of and wait_entry_unlocked_exclusive() itself subsequently use the xas state to reload the entry. As xas_pause() updated the state to the next index this will cause the current entry which is being waited on to be skipped. This caused the following warning to fire intermittently when running xftest generic/068 on an XFS filesystem with FS DAX enabled: [ 35.067397] ------------[ cut here ]------------ [ 35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0 [ 35.069717] Modules linked in: nd_pmem dax_pmem nd_btt nd_e820 libnvdimm [ 35.071006] CPU: 21 UID: 0 PID: 1640 Comm: fstest Not tainted 6.15.0-rc7+ #77 PREEMPT(voluntary) [ 35.072613] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/204 [ 35.074845] RIP: 0010:truncate_folio_batch_exceptionals+0xd8/0x1e0 [ 35.075962] Code: a1 00 00 00 f6 47 0d 20 0f 84 97 00 00 00 4c 63 e8 41 39 c4 7f 0b eb 61 49 83 c5 01 45 39 ec 7e 58 42 f68 [ 35.079522] RSP: 0018:ffffb04e426c7850 EFLAGS: 00010202 [ 35.080359] RAX: 0000000000000000 RBX: ffff9d21e3481908 RCX: ffffb04e426c77f4 [ 35.081477] RDX: ffffb04e426c79e8 RSI: ffffb04e426c79e0 RDI: ffff9d21e34816e8 [ 35.082590] RBP: ffffb04e426c79e0 R08: 0000000000000001 R09: 0000000000000003 [ 35.083733] R10: 0000000000000000 R11: 822b53c0f7a49868 R12: 000000000000001f [ 35.084850] R13: 0000000000000000 R14: ffffb04e426c78e8 R15: fffffffffffffffe [ 35.085953] FS: 00007f9134c87740(0000) GS:ffff9d22abba0000(0000) knlGS:0000000000000000 [ 35.087346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.088244] CR2: 00007f9134c86000 CR3: 000000040afff000 CR4: 00000000000006f0 [ 35.089354] Call Trace: [ 35.089749] <TASK> [ 35.090168] truncate_inode_pages_range+0xfc/0x4d0 [ 35.091078] truncate_pagecache+0x47/0x60 [ 35.091735] xfs_setattr_size+0xc7/0x3e0 [ 35.092648] xfs_vn_setattr+0x1ea/0x270 [ 35.093437] notify_change+0x1f4/0x510 [ 35.094219] ? do_truncate+0x97/0xe0 [ 35.094879] do_truncate+0x97/0xe0 [ 35.095640] path_openat+0xabd/0xca0 [ 35.096278] do_filp_open+0xd7/0x190 [ 35.096860] do_sys_openat2+0x8a/0xe0 [ 35.097459] __x64_sys_openat+0x6d/0xa0 [ 35.098076] do_syscall_64+0xbb/0x1d0 [ 35.098647] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 35.099444] RIP: 0033:0x7f9134d81fc1 [ 35.100033] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff5 [ 35.102993] RSP: 002b:00007ffcd41e0d10 EFLAGS: 00000202 ORIG_RAX: 0000000000000101 [ 35.104263] RAX: ffffffffffffffda RBX: 0000000000000242 RCX: 00007f9134d81fc1 [ 35.105452] RDX: 0000000000000242 RSI: 00007ffcd41e1200 RDI: 00000000ffffff9c [ 35.106663] RBP: 00007ffcd41e1200 R08: 0000000000000000 R09: 0000000000000064 [ 35.107923] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000066 [ 35.109112] R13: 0000000000100000 R14: 0000000000100000 R15: 0000000000000400 [ 35.110357] </TASK> [ 35.110769] irq event stamp: 8415587 [ 35.111486] hardirqs last enabled at (8415599): [<ffffffff8d74b562>] __up_console_sem+0x52/0x60 [ 35.113067] hardirqs last disabled at (8415610): [<ffffffff8d74b547>] __up_console_sem+0x37/0x60 [ 35.114575] softirqs last enabled at (8415300): [<ffffffff8d6ac625>] handle_softirqs+0x315/0x3f0 [ 35.115933] softirqs last disabled at (8415291): [<ffffffff8d6ac811>] __irq_exit_rcu+0xa1/0xc0 [ 35.117316] ---[ end trace 0000000000000000 ]--- Fix this by using xas_reset() instead, which is equivalent in implementation to xas_pause() but does not advance the XArray state. Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries") Signed-off-by: Alistair Popple <apopple@nvidia.com> Link: https://lore.kernel.org/20250523043749.1460780-1-apopple@nvidia.com Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: "Matthew Wilcow (Oracle)" <willy@infradead.org> Cc: Balbir Singh <balbirs@nvidia.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-29Merge tag 'vfio-v6.16-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Remove an outdated DMA unmap optimization that relies on a feature only implemented in AMDv1 page tables. (Jason Gunthorpe) - Fix various migration issues in the hisi_acc_vfio_pci variant driver, including use of a wrong DMA address requiring an update to the migration data structure, resending task completion interrupt after migration to re-sync queues, fixing a write-back cache sequencing issue, fixing a driver unload issue, behaving correctly when the guest driver is not loaded, and avoiding to squash errors from sub-functions. (Longfang Liu) - mlx5-vfio-pci variant driver update to make use of the new two-step DMA API for migration, using a page array directly rather than using a page list mapped across a scatter list. (Leon Romanovsky) - Fix an incorrect loop index used when unwinding allocation of dirty page bitmaps on error, resulting in temporary failure in freeing unused bitmaps. (Li RongQing) * tag 'vfio-v6.16-rc1' of https://github.com/awilliam/linux-vfio: vfio/type1: Fix error unwind in migration dirty bitmap allocation vfio/mlx5: Enable the DMA link API vfio/mlx5: Rewrite create mkey flow to allow better code reuse vfio/mlx5: Explicitly use number of pages instead of allocated length hisi_acc_vfio_pci: update function return values. hisi_acc_vfio_pci: bugfix live migration function without VF device driver hisi_acc_vfio_pci: bugfix the problem of uninstalling driver hisi_acc_vfio_pci: bugfix cache write-back issue hisi_acc_vfio_pci: add eq and aeq interruption restore hisi_acc_vfio_pci: fix XQE dma address error vfio/type1: Remove Fine Grained Superpages detection
2025-05-29Merge tag 'for-linus-6.16-1' of https://github.com/cminyard/linux-ipmiLinus Torvalds
Pull IPMI updates from Corey Minyard: "Restructure the IPMI driver. This is a restructure of the IPMI driver, mostly to remove SRCU. The locking had issues, and they were not going to be straightforward to fix. Plus it used tons of memory and was generally a pain. Most of this moves handling of messages out of bh and interrupt context and runs it in thread context. Then getting rid of SRCU is easy. This also has a minor cleanup to remove a warning on newer GCCs and to fix some documentation" * tag 'for-linus-6.16-1' of https://github.com/cminyard/linux-ipmi: (26 commits) docs: ipmi: fix spelling and grammar mistakes ipmi:msghandler: Fix potential memory corruption in ipmi_create_user() ipmi:watchdog: Use the new interface for panic messages ipmi:msghandler: Export and fix panic messaging capability Documentation:ipmi: Remove comments about interrupt level ipmi:ssif: Fix a shutdown race ipmi:msghandler: Don't deliver messages to deleted users ipmi:si: Rework startup of IPMI devices ipmi:msghandler: Add a error return from unhandle LAN cmds ipmi:msghandler: Shut down lower layer first at unregister ipmi:msghandler: Remove proc_fs.h ipmi:msghandler: Don't check for shutdown when returning responses ipmi:msghandler: Don't acquire a user refcount for queued messages ipmi:msghandler: Fix locking around users and interfaces ipmi:msghandler: Remove some user level processing in panic mode ipmi: Add a note about the pretimeout callback ipmi:watchdog: Change lock to mutex ipmi:msghandler: Remove srcu for the ipmi_interfaces list ipmi:msghandler: Remove srcu from the ipmi user structure ipmi:msghandler: Use the system_wq, not system_bh_wq ...
2025-05-29Merge tag 'tsm-for-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm Pull trusted security manager (TSM) updates from Dan Williams: - Add a general sysfs scheme for publishing "Measurement" values provided by the architecture's TEE Security Manager. Use it to publish TDX "Runtime Measurement Registers" ("RTMRs") that either maintain a hash of stored values (similar to a TPM PCR) or provide statically provisioned data. These measurements are validated by a relying party. - Reorganize the drivers/virt/coco/ directory for "host" and "guest" shared infrastructure. - Fix a configfs-tsm-report unregister bug - With CONFIG_TSM_MEASUREMENTS joining CONFIG_TSM_REPORTS and in anticipation of more shared "TSM" infrastructure arriving, rename the maintainer entry to "TRUSTED SECURITY MODULE (TSM) INFRASTRUCTURE". * tag 'tsm-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: tsm-mr: Fix init breakage after bin_attrs constification by scoping non-const pointers to init phase sample/tsm-mr: Fix missing static for sample_report virt: tdx-guest: Transition to scoped_cond_guard for mutex operations virt: tdx-guest: Refactor and streamline TDREPORT generation virt: tdx-guest: Expose TDX MRs as sysfs attributes x86/tdx: tdx_mcall_get_report0: Return -EBUSY on TDCALL_OPERAND_BUSY error x86/tdx: Add tdx_mcall_extend_rtmr() interface tsm-mr: Add tsm-mr sample code tsm-mr: Add TVM Measurement Register support configfs-tsm-report: Fix NULL dereference of tsm_ops coco/guest: Move shared guest CC infrastructure to drivers/virt/coco/guest/ configfs-tsm: Namespace TSM report symbols
2025-05-29Merge tag 'x86_sgx_for_6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel software guard extension (SGX) updates from Dave Hansen: "A couple of x86/sgx changes. The first one is a no-brainer to use the (simple) SHA-256 library. For the second one, some folks doing testing noticed that SGX systems under memory pressure were inducing fatal machine checks at pretty unnerving rates, despite the SGX code having _some_ awareness of memory poison. It turns out that the SGX reclaim path was not checking for poison _and_ it always accesses memory to copy it around. Make sure that poisoned pages are not reclaimed" * tag 'x86_sgx_for_6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sgx: Prevent attempts to reclaim poisoned pages x86/sgx: Use SHA-256 library API instead of crypto_shash API
2025-05-29Merge tag 'trace-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Have module addresses get updated in the persistent ring buffer The addresses of the modules from the previous boot are saved in the persistent ring buffer. If the same modules are loaded and an address is in the old buffer points to an address that was both saved in the persistent ring buffer and is loaded in memory, shift the address to point to the address that is loaded in memory in the trace event. - Print function names for irqs off and preempt off callsites When ignoring the print fmt of a trace event and just printing the fields directly, have the fields for preempt off and irqs off events still show the function name (via kallsyms) instead of just showing the raw address. - Clean ups of the histogram code The histogram functions saved over 800 bytes on the stack to process events as they come in. Instead, create per-cpu buffers that can hold this information and have a separate location for each context level (thread, softirq, IRQ and NMI). Also add some more comments to the code. - Add "common_comm" field for histograms Add "common_comm" that uses the current->comm as a field in an event histogram and acts like any of the other fields of the event. - Show "subops" in the enabled_functions file When the function graph infrastructure is used, a subsystem has a "subops" that it attaches its callback function to. Instead of the enabled_functions just showing a function calling the function that calls the subops functions, also show the subops functions that will get called for that function too. - Add "copy_trace_marker" option to instances There are cases where an instance is created for tooling to write into, but the old tooling has the top level instance hardcoded into the application. New tools want to consume the data from an instance and not the top level buffer. By adding a copy_trace_marker option, whenever the top instance trace_marker is written into, a copy of it is also written into the instance with this option set. This allows new tools to read what old tools are writing into the top buffer. If this option is cleared by the top instance, then what is written into the trace_marker is not written into the top instance. This is a way to redirect the trace_marker writes into another instance. - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp() If a tracepoint is created by DECLARE_TRACE() instead of TRACE_EVENT(), then it will not be exposed via tracefs. Currently there's no way to differentiate in the kernel the tracepoint functions between those that are exposed via tracefs or not. A calling convention has been made manually to append a "_tp" prefix for events created by DECLARE_TRACE(). Instead of doing this manually, force it so that all DECLARE_TRACE() events have this notation. - Use __string() for task->comm in some sched events Instead of hardcoding the comm to be TASK_COMM_LEN in some of the scheduler events use __string() which makes it dynamic. Note, if these events are parsed by user space it they may break, and the event may have to be converted back to the hardcoded size. - Have function graph "depth" be unsigned to the user Internally to the kernel, the "depth" field of the function graph event is signed due to -1 being used for end of boundary. What actually gets recorded in the event itself is zero or positive. Reflect this to user space by showing "depth" as unsigned int and be consistent across all events. - Allow an arbitrary long CPU string to osnoise_cpus_write() The filtering of which CPUs to write to can exceed 256 bytes. If a machine has 256 CPUs, and the filter is to filter every other CPU, the write would take a string larger than 256 bytes. Instead of using a fixed size buffer on the stack that is 256 bytes, allocate it to handle what is passed in. - Stop having ftrace check the per-cpu data "disabled" flag The "disabled" flag in the data structure passed to most ftrace functions is checked to know if tracing has been disabled or not. This flag was added back in 2008 before the ring buffer had its own way to disable tracing. The "disable" flag is now not always set when needed, and the ring buffer flag should be used in all locations where the disabled is needed. Since the "disable" flag is redundant and incorrect, stop using it. Fix up some locations that use the "disable" flag to use the ring buffer info. - Use a new tracer_tracing_disable/enable() instead of data->disable flag There's a few cases that set the data->disable flag to stop tracing, but this flag is not consistently used. It is also an on/off switch where if a function set it and calls another function that sets it, the called function may incorrectly enable it. Use a new trace_tracing_disable() and tracer_tracing_enable() that uses a counter and can be nested. These use the ring buffer flags which are always checked making the disabling more consistent. - Save the trace clock in the persistent ring buffer Save what clock was used for tracing in the persistent ring buffer and set it back to that clock after a reboot. - Remove unused reference to a per CPU data pointer in mmiotrace functions - Remove unused buffer_page field from trace_array_cpu structure - Remove more strncpy() instances - Other minor clean ups and fixes * tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits) tracing: Fix compilation warning on arm32 tracing: Record trace_clock and recover when reboot tracing/sched: Use __string() instead of fixed lengths for task->comm tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix tracing: Cleanup upper_empty() in pid_list tracing: Allow the top level trace_marker to write into another instances tracing: Add a helper function to handle the dereference arg in verifier tracing: Remove unnecessary "goto out" that simply returns ret is trigger code tracing: Fix error handling in event_trigger_parse() tracing: Rename event_trigger_alloc() to trigger_data_alloc() tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf tracing: Remove unused buffer_page field from trace_array_cpu structure tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer tracing: Convert the per CPU "disabled" counter to local from atomic tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field ring-buffer: Add ring_buffer_record_is_on_cpu() tracing: Do not use per CPU array_buffer.data->disabled for cpumask ftrace: Do not disabled function graph based on "disabled" field tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one() ...
2025-05-29Merge tag 'trace-tools-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tools updates from Steven Rostedt: - Set distinctive value for failed tests When running "make check" that performs tests on rtla the failure is checked by examining the output. Instead have the tool return an error status if it exceeds the threadhold. - Define __NR_sched_setattr for LoongArch Define __NR_sched_setattr to allow this to build for LoongArch. - Define _GNU_SOURCE for timerlat_bpf.c Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE is not defined, this can cause errors for timerlat_bpf_init() and breakage in BPF sample collection mode. * tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla: Define _GNU_SOURCE in timerlat_bpf.c rtla: Define __NR_sched_setattr for LoongArch rtla: Set distinctive exit value for failed tests