Age | Commit message (Collapse) | Author |
|
If we sanitize error returns, the debug statements need
to come before that so that we don't lose information.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Fixes: 405b0d610745 ("net: usb: aqc111: fix error handling of usbnet read calls")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into block-6.16
Pull MD fixes from Yu:
"- fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10
- fix max_write_behind setting for dm-raid
- some minor cleanups"
* tag 'md-6.16-20250530' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
md/md-bitmap: remove parameter slot from bitmap_create()
md/md-bitmap: cleanup bitmap_ops->startwrite()
md/dm-raid: remove max_write_behind setting limit
md/md-bitmap: fix dm-raid max_write_behind setting
md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAIT
|
|
Some bmips SoCs (bcm6362, bcm63268) share the same SPI reset for both SPI
and HSSPI controllers, so reset shouldn't be exclusive.
Fixes: 0eeadddbf09a ("spi: bcm63xx-hsspi: add reset support")
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250529130915.2519590-3-noltari@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Some bmips SoCs (bcm6362, bcm63268) share the same SPI reset for both SPI
and HSSPI controllers, so reset shouldn't be exclusive.
Fixes: 38807adeaf1e ("spi: bcm63xx-spi: add reset support")
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250529130915.2519590-2-noltari@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Dan reports that iterating over a device ITEs can legitimately lead
to a NULL pointer, and that the NULL check is placed *after* the
pointer has already been dereferenced.
Hoist the pointer check as early as possible and be done with it.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 30deb51a677b ("KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables")
Link: https://lore.kernel.org/r/aDBylI1YnjPatAbr@stanley.mountain
Cc: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20250530091647.1152489-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
syzkaller has found another ugly race in the VGIC, this time dealing
with VGIC creation. Since kvm_vgic_create() doesn't sufficiently protect
against in-flight vCPU creations, it is possible to get a vCPU into the
kernel w/ an in-kernel VGIC but no allocation of private IRQs:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000d20
Mem abort info:
ESR = 0x0000000096000046
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000103e4f000
[0000000000000d20] pgd=0800000102e1c403, p4d=0800000102e1c403, pud=0800000101146403, pmd=0000000000000000
Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP
CPU: 9 UID: 0 PID: 246 Comm: test Not tainted 6.14.0-rc6-00097-g0c90821f5db8 #16
Hardware name: linux,dummy-virt (DT)
pstate: 814020c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : _raw_spin_lock_irqsave+0x34/0x8c
lr : kvm_vgic_set_owner+0x54/0xa4
sp : ffff80008086ba20
x29: ffff80008086ba20 x28: ffff0000c19b5640 x27: 0000000000000000
x26: 0000000000000000 x25: ffff0000c4879bd0 x24: 000000000000001e
x23: 0000000000000000 x22: 0000000000000000 x21: ffff0000c487af80
x20: ffff0000c487af18 x19: 0000000000000000 x18: 0000001afadd5a8b
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000001
x14: ffff0000c19b56c0 x13: 0030c9adf9d9889e x12: ffffc263710e1908
x11: 0000001afb0d74f2 x10: e0966b840b373664 x9 : ec806bf7d6a57cd5
x8 : ffff80008086b980 x7 : 0000000000000001 x6 : 0000000000000001
x5 : 0000000080800054 x4 : 4ec4ec4ec4ec4ec5 x3 : 0000000000000000
x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000d20
Call trace:
_raw_spin_lock_irqsave+0x34/0x8c (P)
kvm_vgic_set_owner+0x54/0xa4
kvm_timer_enable+0xf4/0x274
kvm_arch_vcpu_run_pid_change+0xe0/0x380
kvm_vcpu_ioctl+0x93c/0x9e0
__arm64_sys_ioctl+0xb4/0xec
invoke_syscall+0x48/0x110
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x198/0x19c
Code: b9000841 d503201f 52800001 52800022 (88e17c02)
---[ end trace 0000000000000000 ]---
Plug the race by explicitly checking for an in-progress vCPU creation
and failing kvm_vgic_create() when that's the case. Add some comments to
document all the things kvm_vgic_create() is trying to guard against
too.
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/r/20250523194722.4066715-6-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
KVM's interrupt infrastructure is dodgy at best, allowing for some ugly
'off label' usage of the various UAPIs. In one example, userspace can
change the routing entry of a particular "GSI" after configuring
irqbypass with KVM_IRQFD. KVM/arm64 is oblivious to this, and winds up
preserving the stale translation in cases where vLPIs are configured.
Honor userspace's intentions and tear down the vLPI mapping if affected
by a "GSI" routing change. Make no attempt to reconstruct vLPIs if the
new target is an MSI and just fall back to software injection.
Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250523194722.4066715-5-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The virtual mapping and "GSI" routing of a particular vLPI is subject to
change in response to the guest / userspace. This can be pretty annoying
to deal with when KVM needs to track the physical state that's managed
for vLPI direct injection.
Make vgic_v4_unset_forwarding() resilient by using the host IRQ to
resolve the vgic IRQ. Since this uses the LPI xarray directly, finding
the ITS by doorbell address + grabbing it's its_lock is no longer
necessary. Note that matching the right ITS / ITE is already handled in
vgic_v4_set_forwarding(), and unless there's a bug in KVM's VGIC ITS
emulation the virtual mapping that should remain stable for the lifetime
of the vLPI mapping.
Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250523194722.4066715-4-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Though undocumented, KVM generally protects the translation of a vLPI
with the its_lock. While this makes perfectly good sense, as the ITS
itself contains the guest translation, an upcoming change will require
twiddling the vLPI mapping in an atomic context.
Switch to using the vIRQ's irq_lock to protect the translation. Use of
the its_lock in vgic_v4_unset_forwarding() is preserved for now as it
still needs to walk the ITS.
Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250523194722.4066715-3-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The locking dance is about to get more interesting, switch the its_lock
over to a lock guard to make it a bit easier to handle.
Tested-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250523194722.4066715-2-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When handling a TLBI VA* instruction that potentially targets a
VNCR page mapping, we fail to mask out the top bits that contain
the ASID and TTL fields, hence potentially failing the VA check
in the TLB code.
An additional wrinkle is that we fail to sign extend the VA,
again leading to failed VA checks.
Fix both in one go by sign-extending the VA from bit 48, making
it comparable to the way we interpret VNCR_EL2.BADDR.
Fixes: 4ffa72ad8f37e ("KVM: arm64: nv: Add S1 TLB invalidation primitive for VNCR_EL2")
Link: https://lore.kernel.org/r/20250525175759.780891-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Broonie reports that fed55f49fad18 ("arm64: errata: Work around
AmpereOne's erratum AC04_CPU_23") breaks one of the vdso selftests
(vdso_test_chacha) as it indirectly drags asm/sysreg.h.
It is rather unfortunate (and worrying) that userspace gets built
with non-UAPI headers. In any case, paper over the issue by dragging
linux/kconfig.h in asm/sysreg.h. It is the right thing to do, at
least from the kernel perspective.
Reported-by: Mark Brown <broonie@kernel.org>
Fixes: fed55f49fad18 ("arm64: errata: Work around AmpereOne's erratum AC04_CPU_23")
Link: https://lore.kernel.org/r/aDCDGZ-G-nCP3hJI@finisterre.sirena.org.uk
Cc: D Scott Phillips <scott@os.amperecomputing.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250523170208.530818-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
All callers pass in '-1' for 'slot', hence it can be removed.
Link: https://lore.kernel.org/linux-raid/20250524061320.370630-6-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
|
|
bitmap_startwrite() always return 0, and the caller doesn't check return
value as well, hence change the method to void.
Also rename startwrite/endwrite to start_write/end_write, which is more in
line with the usual naming convention.
Link: https://lore.kernel.org/linux-raid/20250524061320.370630-4-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
|
|
The comments said 'vaule in kB', while the value actually means the
number of write_behind IOs. And since md-bitmap will automatically
adjust the value to max COUNTER_MAX / 2, there is no need to fail
early.
Also move some macros that is only used md-bitmap.c.
Link: https://lore.kernel.org/linux-raid/20250524061320.370630-15-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Xiao Ni <xni@redhat.com>
|
|
It's supposed to be COUNTER_MAX / 2, not COUNTER_MAX.
Link: https://lore.kernel.org/linux-raid/20250524061320.370630-14-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
|
|
IO with REQ_RAHEAD or REQ_NOWAIT can fail early, even if the storage medium
is fine, hence record badblocks or remove the disk from array does not
make sense.
This problem if found by lvm2 test lvcreate-large-raid, where dm-zero
will fail read ahead IO directly.
Fixes: e879a0d9cb08 ("md/raid1,raid10: don't ignore IO flags")
Reported-and-tested-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/all/34fa755d-62c8-4588-8ee1-33cb1249bdf2@redhat.com/
Link: https://lore.kernel.org/linux-raid/20250527081407.3004055-1-yukuai1@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt
Renesas DTS updates for v6.16 (take five)
- Reduce I2C2 clock frequency on the RZ/G3E SMARC SoM.
* tag 'renesas-dts-for-v6.16-tag5' of https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: dts: renesas: rzg3e-smarc-som: Reduce I2C2 clock frequency
Link: https://lore.kernel.org/r/cover.1748355530.git.geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Update my mail address to my new @kernel.org one and also add a mailmap
entry to make sure everything gets sent there for easier filtering.
Signed-off-by: Sven Peter <sven@kernel.org>
Link: https://lore.kernel.org/r/20250528221718.45204-1-sven@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
When user requests a connectable file handle explicitly with the
AT_HANDLE_CONNECTABLE flag, fail the request if filesystem (e.g. nfs)
does not know how to decode a connected non-dir dentry.
Fixes: c374196b2b9f ("fs: name_to_handle_at() support for "explicit connectable" file handles")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/20250525104731.1461704-1-amir73il@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This is straightforward enough: check_fix_ptrs() currently only runs
before we go RW, so updating the btree root pointer in c->btree_roots
suffices - it'll be written out in the first journal write we do.
For that, do_bch2_trans_commit_to_journal_replay() now handles
JSET_ENTRY_btree_root entries.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We have a bug report that looks like we might be leaking open buckets -
let's check if they got left attached to the cached btree node.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More stack usage work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
with typical config options, variables in different inline functions
aren't sharing stack space - and these are slowpaths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Specialize the .to_text() for alloc_v4, to avoid the temporary on the
stack for conversion from old versions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- Separate out a slowpath for bkey_nocow_lock()
- Don't call bch2_bkey_ptrs_c() or loop over pointers more than
necessary
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More stack usage work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out an error path for a small stack usage improvement.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Allocate some (smaller) temporary storage in btree_trans for this -
btree_path_down() is in our max-stack call stack.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an assertion pop in the tiering_misaligned test: rounding down to
bucket size at the end of the journal space calculations leaves
cur_entry_sectors == 0, which is incorrect with !cur_entry_err.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's uncomon to have multiple devices with journalling only on a subset,
but can be specified with the 'data_allowed' option. We need to know if
we're doing data/metadata writes to multiple devices, as that requires
issuing flushes before the journal writes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an infinite loop when bkey_i->k.u64s is 0.
This only happens in userspace, where 'bcachefs list_journal' can print
the entire contents of the journal, and non-dirty entries aren't
validated.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
- Don't print a checksum error when we first read a journal entry: we
print a checksum error later if we'll be using the journal entry.
- Continuing with the theme of of improving error messages and grouping
errors into a single log message per error, print a single 'checksum
error' message per journal entry, and use bch2_journal_ptr_to_text()
to print out where on the device it was.
- Factor out checksum error messages and checking for missing journal
entries into helpers, bch2_journal_read() has gotten obnoxiously big.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (smartpqi, ufs, lpfc, scsi_debug, target,
hisi_sas) with the only substantive core change being the removal of
the stream_status member from the scsi_stream_status_header (to get
rid of flex array members)"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (77 commits)
scsi: target: core: Constify struct target_opcode_descriptor
scsi: target: core: Constify enabled() in struct target_opcode_descriptor
scsi: hisi_sas: Fix warning detected by sparse
scsi: mpt3sas: Fix _ctl_get_mpt_mctp_passthru_adapter() to return IOC pointer
scsi: sg: Remove unnecessary NULL check before unregister_sysctl_table()
scsi: ufs: mcq: Delete ufshcd_release_scsi_cmd() in ufshcd_mcq_abort()
scsi: ufs: qcom: dt-bindings: Document the SM8750 UFS Controller
scsi: mvsas: Fix typos in SAS/SATA VSP register comments
scsi: fnic: Replace memset() with eth_zero_addr()
scsi: ufs: core: Support updating device command timeout
scsi: ufs: core: Change hwq_id type and value
scsi: ufs: core: Increase the UIC command timeout further
scsi: zfcp: Simplify workqueue allocation
scsi: ufs: core: Print error value as hex format in ufshcd_err_handler()
scsi: sd: Remove the stream_status member from scsi_stream_status_header
scsi: docs: Clean up some style in scsi_mid_low_api
scsi: core: Remove unused scsi_dev_info_list_del_keyed()
scsi: isci: Remove unused sci_remote_device_reset()
scsi: scsi_debug: Reduce DEF_ATOMIC_WR_MAX_LENGTH
scsi: smartpqi: Delete a stray tab in pqi_is_parity_write_stream()
...
|
|
Mark files as repr(transparent) to ensure identical layout between C and Rust.
* patches from https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com:
rust: file: improve safety comments
rust: file: mark `LocalFile` as `repr(transparent)`
Link: https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Some of the safety comments in `LocalFile`'s methods incorrectly refer to
the `File` type instead of `LocalFile`, so fix them to use the correct
type.
Also add missing Markdown code spans around lifetimes in the safety
comments, i.e. change 'a to `'a`.
Link: https://github.com/Rust-for-Linux/linux/issues/1165
Signed-off-by: Pekka Ristola <pekkarr@protonmail.com>
Link: https://lore.kernel.org/20250527204636.12573-2-pekkarr@protonmail.com
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Unsafe code in `LocalFile`'s methods assumes that the type has the same
layout as the inner `bindings::file`. This is not guaranteed by the default
struct representation in Rust, but requires specifying the `transparent`
representation.
The `File` struct (which also wraps `bindings::file`) is already marked as
`repr(transparent)`, so this change makes their layouts equivalent.
Fixes: 851849824bb5 ("rust: file: add Rust abstraction for `struct file`")
Closes: https://github.com/Rust-for-Linux/linux/issues/1165
Signed-off-by: Pekka Ristola <pekkarr@protonmail.com>
Link: https://lore.kernel.org/20250527204636.12573-1-pekkarr@protonmail.com
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Commit 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning
entries") introduced a new function, wait_entry_unlocked_exclusive(),
which waits for the current entry to become unlocked without advancing
the XArray iterator state.
Waiting for the entry to become unlocked requires dropping the XArray
lock. This requires calling xas_pause() prior to dropping the lock
which leaves the xas in a suitable state for the next iteration. However
this has the side-effect of advancing the xas state to the next index.
Normally this isn't an issue because xas_for_each() contains code to
detect this state and thus avoid advancing the index a second time on
the next loop iteration.
However both callers of and wait_entry_unlocked_exclusive() itself
subsequently use the xas state to reload the entry. As xas_pause()
updated the state to the next index this will cause the current entry
which is being waited on to be skipped. This caused the following
warning to fire intermittently when running xftest generic/068 on an XFS
filesystem with FS DAX enabled:
[ 35.067397] ------------[ cut here ]------------
[ 35.068229] WARNING: CPU: 21 PID: 1640 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xd8/0x1e0
[ 35.069717] Modules linked in: nd_pmem dax_pmem nd_btt nd_e820 libnvdimm
[ 35.071006] CPU: 21 UID: 0 PID: 1640 Comm: fstest Not tainted 6.15.0-rc7+ #77 PREEMPT(voluntary)
[ 35.072613] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/204
[ 35.074845] RIP: 0010:truncate_folio_batch_exceptionals+0xd8/0x1e0
[ 35.075962] Code: a1 00 00 00 f6 47 0d 20 0f 84 97 00 00 00 4c 63 e8 41 39 c4 7f 0b eb 61 49 83 c5 01 45 39 ec 7e 58 42 f68
[ 35.079522] RSP: 0018:ffffb04e426c7850 EFLAGS: 00010202
[ 35.080359] RAX: 0000000000000000 RBX: ffff9d21e3481908 RCX: ffffb04e426c77f4
[ 35.081477] RDX: ffffb04e426c79e8 RSI: ffffb04e426c79e0 RDI: ffff9d21e34816e8
[ 35.082590] RBP: ffffb04e426c79e0 R08: 0000000000000001 R09: 0000000000000003
[ 35.083733] R10: 0000000000000000 R11: 822b53c0f7a49868 R12: 000000000000001f
[ 35.084850] R13: 0000000000000000 R14: ffffb04e426c78e8 R15: fffffffffffffffe
[ 35.085953] FS: 00007f9134c87740(0000) GS:ffff9d22abba0000(0000) knlGS:0000000000000000
[ 35.087346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.088244] CR2: 00007f9134c86000 CR3: 000000040afff000 CR4: 00000000000006f0
[ 35.089354] Call Trace:
[ 35.089749] <TASK>
[ 35.090168] truncate_inode_pages_range+0xfc/0x4d0
[ 35.091078] truncate_pagecache+0x47/0x60
[ 35.091735] xfs_setattr_size+0xc7/0x3e0
[ 35.092648] xfs_vn_setattr+0x1ea/0x270
[ 35.093437] notify_change+0x1f4/0x510
[ 35.094219] ? do_truncate+0x97/0xe0
[ 35.094879] do_truncate+0x97/0xe0
[ 35.095640] path_openat+0xabd/0xca0
[ 35.096278] do_filp_open+0xd7/0x190
[ 35.096860] do_sys_openat2+0x8a/0xe0
[ 35.097459] __x64_sys_openat+0x6d/0xa0
[ 35.098076] do_syscall_64+0xbb/0x1d0
[ 35.098647] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 35.099444] RIP: 0033:0x7f9134d81fc1
[ 35.100033] Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 2a 26 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff5
[ 35.102993] RSP: 002b:00007ffcd41e0d10 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[ 35.104263] RAX: ffffffffffffffda RBX: 0000000000000242 RCX: 00007f9134d81fc1
[ 35.105452] RDX: 0000000000000242 RSI: 00007ffcd41e1200 RDI: 00000000ffffff9c
[ 35.106663] RBP: 00007ffcd41e1200 R08: 0000000000000000 R09: 0000000000000064
[ 35.107923] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000066
[ 35.109112] R13: 0000000000100000 R14: 0000000000100000 R15: 0000000000000400
[ 35.110357] </TASK>
[ 35.110769] irq event stamp: 8415587
[ 35.111486] hardirqs last enabled at (8415599): [<ffffffff8d74b562>] __up_console_sem+0x52/0x60
[ 35.113067] hardirqs last disabled at (8415610): [<ffffffff8d74b547>] __up_console_sem+0x37/0x60
[ 35.114575] softirqs last enabled at (8415300): [<ffffffff8d6ac625>] handle_softirqs+0x315/0x3f0
[ 35.115933] softirqs last disabled at (8415291): [<ffffffff8d6ac811>] __irq_exit_rcu+0xa1/0xc0
[ 35.117316] ---[ end trace 0000000000000000 ]---
Fix this by using xas_reset() instead, which is equivalent in
implementation to xas_pause() but does not advance the XArray state.
Fixes: 6be3e21d25ca ("fs/dax: don't skip locked entries when scanning entries")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Link: https://lore.kernel.org/20250523043749.1460780-1-apopple@nvidia.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: "Matthew Wilcow (Oracle)" <willy@infradead.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pull VFIO updates from Alex Williamson:
- Remove an outdated DMA unmap optimization that relies on a feature
only implemented in AMDv1 page tables. (Jason Gunthorpe)
- Fix various migration issues in the hisi_acc_vfio_pci variant driver,
including use of a wrong DMA address requiring an update to the
migration data structure, resending task completion interrupt after
migration to re-sync queues, fixing a write-back cache sequencing
issue, fixing a driver unload issue, behaving correctly when the
guest driver is not loaded, and avoiding to squash errors from
sub-functions. (Longfang Liu)
- mlx5-vfio-pci variant driver update to make use of the new two-step
DMA API for migration, using a page array directly rather than using
a page list mapped across a scatter list. (Leon Romanovsky)
- Fix an incorrect loop index used when unwinding allocation of dirty
page bitmaps on error, resulting in temporary failure in freeing
unused bitmaps. (Li RongQing)
* tag 'vfio-v6.16-rc1' of https://github.com/awilliam/linux-vfio:
vfio/type1: Fix error unwind in migration dirty bitmap allocation
vfio/mlx5: Enable the DMA link API
vfio/mlx5: Rewrite create mkey flow to allow better code reuse
vfio/mlx5: Explicitly use number of pages instead of allocated length
hisi_acc_vfio_pci: update function return values.
hisi_acc_vfio_pci: bugfix live migration function without VF device driver
hisi_acc_vfio_pci: bugfix the problem of uninstalling driver
hisi_acc_vfio_pci: bugfix cache write-back issue
hisi_acc_vfio_pci: add eq and aeq interruption restore
hisi_acc_vfio_pci: fix XQE dma address error
vfio/type1: Remove Fine Grained Superpages detection
|
|
Pull IPMI updates from Corey Minyard:
"Restructure the IPMI driver.
This is a restructure of the IPMI driver, mostly to remove SRCU. The
locking had issues, and they were not going to be straightforward to
fix. Plus it used tons of memory and was generally a pain.
Most of this moves handling of messages out of bh and interrupt
context and runs it in thread context. Then getting rid of SRCU is
easy.
This also has a minor cleanup to remove a warning on newer GCCs and to
fix some documentation"
* tag 'for-linus-6.16-1' of https://github.com/cminyard/linux-ipmi: (26 commits)
docs: ipmi: fix spelling and grammar mistakes
ipmi:msghandler: Fix potential memory corruption in ipmi_create_user()
ipmi:watchdog: Use the new interface for panic messages
ipmi:msghandler: Export and fix panic messaging capability
Documentation:ipmi: Remove comments about interrupt level
ipmi:ssif: Fix a shutdown race
ipmi:msghandler: Don't deliver messages to deleted users
ipmi:si: Rework startup of IPMI devices
ipmi:msghandler: Add a error return from unhandle LAN cmds
ipmi:msghandler: Shut down lower layer first at unregister
ipmi:msghandler: Remove proc_fs.h
ipmi:msghandler: Don't check for shutdown when returning responses
ipmi:msghandler: Don't acquire a user refcount for queued messages
ipmi:msghandler: Fix locking around users and interfaces
ipmi:msghandler: Remove some user level processing in panic mode
ipmi: Add a note about the pretimeout callback
ipmi:watchdog: Change lock to mutex
ipmi:msghandler: Remove srcu for the ipmi_interfaces list
ipmi:msghandler: Remove srcu from the ipmi user structure
ipmi:msghandler: Use the system_wq, not system_bh_wq
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm
Pull trusted security manager (TSM) updates from Dan Williams:
- Add a general sysfs scheme for publishing "Measurement" values
provided by the architecture's TEE Security Manager. Use it to
publish TDX "Runtime Measurement Registers" ("RTMRs") that either
maintain a hash of stored values (similar to a TPM PCR) or provide
statically provisioned data. These measurements are validated by a
relying party.
- Reorganize the drivers/virt/coco/ directory for "host" and "guest"
shared infrastructure.
- Fix a configfs-tsm-report unregister bug
- With CONFIG_TSM_MEASUREMENTS joining CONFIG_TSM_REPORTS and in
anticipation of more shared "TSM" infrastructure arriving, rename the
maintainer entry to "TRUSTED SECURITY MODULE (TSM) INFRASTRUCTURE".
* tag 'tsm-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm:
tsm-mr: Fix init breakage after bin_attrs constification by scoping non-const pointers to init phase
sample/tsm-mr: Fix missing static for sample_report
virt: tdx-guest: Transition to scoped_cond_guard for mutex operations
virt: tdx-guest: Refactor and streamline TDREPORT generation
virt: tdx-guest: Expose TDX MRs as sysfs attributes
x86/tdx: tdx_mcall_get_report0: Return -EBUSY on TDCALL_OPERAND_BUSY error
x86/tdx: Add tdx_mcall_extend_rtmr() interface
tsm-mr: Add tsm-mr sample code
tsm-mr: Add TVM Measurement Register support
configfs-tsm-report: Fix NULL dereference of tsm_ops
coco/guest: Move shared guest CC infrastructure to drivers/virt/coco/guest/
configfs-tsm: Namespace TSM report symbols
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Intel software guard extension (SGX) updates from Dave Hansen:
"A couple of x86/sgx changes.
The first one is a no-brainer to use the (simple) SHA-256 library.
For the second one, some folks doing testing noticed that SGX systems
under memory pressure were inducing fatal machine checks at pretty
unnerving rates, despite the SGX code having _some_ awareness of
memory poison.
It turns out that the SGX reclaim path was not checking for poison
_and_ it always accesses memory to copy it around. Make sure that
poisoned pages are not reclaimed"
* tag 'x86_sgx_for_6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Prevent attempts to reclaim poisoned pages
x86/sgx: Use SHA-256 library API instead of crypto_shash API
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Have module addresses get updated in the persistent ring buffer
The addresses of the modules from the previous boot are saved in the
persistent ring buffer. If the same modules are loaded and an address
is in the old buffer points to an address that was both saved in the
persistent ring buffer and is loaded in memory, shift the address to
point to the address that is loaded in memory in the trace event.
- Print function names for irqs off and preempt off callsites
When ignoring the print fmt of a trace event and just printing the
fields directly, have the fields for preempt off and irqs off events
still show the function name (via kallsyms) instead of just showing
the raw address.
- Clean ups of the histogram code
The histogram functions saved over 800 bytes on the stack to process
events as they come in. Instead, create per-cpu buffers that can hold
this information and have a separate location for each context level
(thread, softirq, IRQ and NMI).
Also add some more comments to the code.
- Add "common_comm" field for histograms
Add "common_comm" that uses the current->comm as a field in an event
histogram and acts like any of the other fields of the event.
- Show "subops" in the enabled_functions file
When the function graph infrastructure is used, a subsystem has a
"subops" that it attaches its callback function to. Instead of the
enabled_functions just showing a function calling the function that
calls the subops functions, also show the subops functions that will
get called for that function too.
- Add "copy_trace_marker" option to instances
There are cases where an instance is created for tooling to write
into, but the old tooling has the top level instance hardcoded into
the application. New tools want to consume the data from an instance
and not the top level buffer. By adding a copy_trace_marker option,
whenever the top instance trace_marker is written into, a copy of it
is also written into the instance with this option set. This allows
new tools to read what old tools are writing into the top buffer.
If this option is cleared by the top instance, then what is written
into the trace_marker is not written into the top instance. This is a
way to redirect the trace_marker writes into another instance.
- Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()
If a tracepoint is created by DECLARE_TRACE() instead of
TRACE_EVENT(), then it will not be exposed via tracefs. Currently
there's no way to differentiate in the kernel the tracepoint
functions between those that are exposed via tracefs or not. A
calling convention has been made manually to append a "_tp" prefix
for events created by DECLARE_TRACE(). Instead of doing this
manually, force it so that all DECLARE_TRACE() events have this
notation.
- Use __string() for task->comm in some sched events
Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
scheduler events use __string() which makes it dynamic. Note, if
these events are parsed by user space it they may break, and the
event may have to be converted back to the hardcoded size.
- Have function graph "depth" be unsigned to the user
Internally to the kernel, the "depth" field of the function graph
event is signed due to -1 being used for end of boundary. What
actually gets recorded in the event itself is zero or positive.
Reflect this to user space by showing "depth" as unsigned int and be
consistent across all events.
- Allow an arbitrary long CPU string to osnoise_cpus_write()
The filtering of which CPUs to write to can exceed 256 bytes. If a
machine has 256 CPUs, and the filter is to filter every other CPU,
the write would take a string larger than 256 bytes. Instead of using
a fixed size buffer on the stack that is 256 bytes, allocate it to
handle what is passed in.
- Stop having ftrace check the per-cpu data "disabled" flag
The "disabled" flag in the data structure passed to most ftrace
functions is checked to know if tracing has been disabled or not.
This flag was added back in 2008 before the ring buffer had its own
way to disable tracing. The "disable" flag is now not always set when
needed, and the ring buffer flag should be used in all locations
where the disabled is needed. Since the "disable" flag is redundant
and incorrect, stop using it. Fix up some locations that use the
"disable" flag to use the ring buffer info.
- Use a new tracer_tracing_disable/enable() instead of data->disable
flag
There's a few cases that set the data->disable flag to stop tracing,
but this flag is not consistently used. It is also an on/off switch
where if a function set it and calls another function that sets it,
the called function may incorrectly enable it.
Use a new trace_tracing_disable() and tracer_tracing_enable() that
uses a counter and can be nested. These use the ring buffer flags
which are always checked making the disabling more consistent.
- Save the trace clock in the persistent ring buffer
Save what clock was used for tracing in the persistent ring buffer
and set it back to that clock after a reboot.
- Remove unused reference to a per CPU data pointer in mmiotrace
functions
- Remove unused buffer_page field from trace_array_cpu structure
- Remove more strncpy() instances
- Other minor clean ups and fixes
* tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits)
tracing: Fix compilation warning on arm32
tracing: Record trace_clock and recover when reboot
tracing/sched: Use __string() instead of fixed lengths for task->comm
tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
tracing: Cleanup upper_empty() in pid_list
tracing: Allow the top level trace_marker to write into another instances
tracing: Add a helper function to handle the dereference arg in verifier
tracing: Remove unnecessary "goto out" that simply returns ret is trigger code
tracing: Fix error handling in event_trigger_parse()
tracing: Rename event_trigger_alloc() to trigger_data_alloc()
tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf
tracing: Remove unused buffer_page field from trace_array_cpu structure
tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer
tracing: Convert the per CPU "disabled" counter to local from atomic
tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field
ring-buffer: Add ring_buffer_record_is_on_cpu()
tracing: Do not use per CPU array_buffer.data->disabled for cpumask
ftrace: Do not disabled function graph based on "disabled" field
tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled
tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
- Set distinctive value for failed tests
When running "make check" that performs tests on rtla the failure is
checked by examining the output. Instead have the tool return an
error status if it exceeds the threadhold.
- Define __NR_sched_setattr for LoongArch
Define __NR_sched_setattr to allow this to build for LoongArch.
- Define _GNU_SOURCE for timerlat_bpf.c
Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE
is not defined, this can cause errors for timerlat_bpf_init() and
breakage in BPF sample collection mode.
* tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Define _GNU_SOURCE in timerlat_bpf.c
rtla: Define __NR_sched_setattr for LoongArch
rtla: Set distinctive exit value for failed tests
|