Age | Commit message (Collapse) | Author |
|
Convert kvm_unmap_gfn_range(), which is the helper that zaps rmap SPTEs in
response to an mmu_notifier invalidation, to use __kvm_rmap_zap_gfn_range()
and feed in range->may_block. In other words, honor NEED_RESCHED by way of
cond_resched() when zapping rmaps. This fixes a long-standing issue where
KVM could process an absurd number of rmap entries without ever yielding,
e.g. if an mmu_notifier fired on a PUD (or larger) range.
Opportunistically rename __kvm_zap_rmap() to kvm_zap_rmap(), and drop the
old kvm_zap_rmap(). Ideally, the shuffling would be done in a different
patch, but that just makes the compiler unhappy, e.g.
arch/x86/kvm/mmu/mmu.c:1462:13: error: ‘kvm_zap_rmap’ defined but not used
Reported-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240809194335.1726916-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a dedicated helper to walk and zap rmaps for a given memslot so that
the code can be shared between KVM-initiated zaps and mmu_notifier
invalidations.
No functional change intended.
Link: https://lore.kernel.org/r/20240809194335.1726916-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a @can_yield param to __walk_slot_rmaps() to control whether or not
dropping mmu_lock and conditionally rescheduling is allowed. This will
allow using __walk_slot_rmaps() and thus cond_resched() to handle
mmu_notifier invalidations, which usually allow blocking/yielding, but not
when invoked by the OOM killer.
Link: https://lore.kernel.org/r/20240809194335.1726916-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move walk_slot_rmaps() and friends up near for_each_slot_rmap_range() so
that the walkers can be used to handle mmu_notifier invalidations, and so
that similar function has some amount of locality in code.
No functional change intended.
Link: https://lore.kernel.org/r/20240809194335.1726916-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
WARN if KVM gets an MMIO cache hit on a RET_PF_WRITE_PROTECTED fault, as
KVM should return RET_PF_WRITE_PROTECTED if and only if there is a memslot,
and creating a memslot is supposed to invalidate the MMIO cache by virtue
of changing the memslot generation.
Keep the code around mainly to provide a convenient location to document
why emulated MMIO should be impossible.
Suggested-by: Yuan Yao <yuan.yao@linux.intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-23-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Explicitly query the list of to-be-zapped shadow pages when checking to
see if unprotecting a gfn for retry has succeeded, i.e. if KVM should
retry the faulting instruction.
Add a comment to explain why the list needs to be checked before zapping,
which is the primary motivation for this change.
No functional change intended.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-22-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Fold kvm_mmu_unprotect_page() into kvm_mmu_unprotect_gfn_and_retry() now
that all other direct usage is gone.
No functional change intended.
Link: https://lore.kernel.org/r/20240831001538.336683-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rename reexecute_instruction() to kvm_unprotect_and_retry_on_failure() to
make the intent and purpose of the helper much more obvious.
No functional change intended.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When retrying the faulting instruction after emulation failure, refresh
the infinite loop protection fields even if no shadow pages were zapped,
i.e. avoid hitting an infinite loop even when retrying the instruction as
a last-ditch effort to avoid terminating the guest.
Link: https://lore.kernel.org/r/20240831001538.336683-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Use kvm_mmu_unprotect_gfn_and_retry() in reexecute_instruction() to pick
up protection against infinite loops, e.g. if KVM somehow manages to
encounter an unsupported instruction and unprotecting the gfn doesn't
allow the vCPU to make forward progress. Other than that, the retry-on-
failure logic is a functionally equivalent, open coded version of
kvm_mmu_unprotect_gfn_and_retry().
Note, the emulation failure path still isn't fully protected, as KVM
won't update the retry protection fields if no shadow pages are zapped
(but this change is still a step forward). That flaw will be addressed
in a future patch.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Don't bother unprotecting the target gfn if EMULTYPE_WRITE_PF_TO_SP is
set, as KVM will simply report the emulation failure to userspace. This
will allow converting reexecute_instruction() to use
kvm_mmu_unprotect_gfn_instead_retry() instead of kvm_mmu_unprotect_page().
Link: https://lore.kernel.org/r/20240831001538.336683-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop the manual pfn look when retrying an instruction that KVM failed to
emulation in response to a #PF due to a write-protected gfn. Now that KVM
sets EMULTYPE_ALLOW_RETRY_PF if and only if the page fault hit a write-
protected gfn, i.e. if and only if there's a writable memslot, there's no
need to redo the lookup to avoid retrying an instruction that failed on
emulated MMIO (no slot, or a write to a read-only slot).
I.e. KVM will never attempt to retry an instruction that failed on
emulated MMIO, whereas that was not the case prior to the introduction of
RET_PF_WRITE_PROTECTED.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the event re-injection unprotect+retry logic into
kvm_mmu_write_protect_fault(), i.e. unprotect and retry if and only if
the #PF actually hit a write-protected gfn. Note, there is a small
possibility that the gfn was unprotected by a different tasking between
hitting the #PF and acquiring mmu_lock, but in that case, KVM will resume
the guest immediately anyways because KVM will treat the fault as spurious.
As a bonus, unprotecting _after_ handling the page fault also addresses the
case where the installing a SPTE to handle fault encounters a shadowed PTE,
i.e. *creates* a read-only SPTE.
Opportunstically add a comment explaining what on earth the intent of the
code is, as based on the changelog from commit 577bdc496614 ("KVM: Avoid
instruction emulation when event delivery is pending").
Link: https://lore.kernel.org/r/20240831001538.336683-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When getting a gpa from a gva to unprotect the associated gfn when an
event is awating reinjection, walk the guest PTEs for WRITE as there's no
point in unprotecting the gfn if the guest is unable to write the page,
i.e. if write-protection can't trigger emulation.
Note, the entire flow should be guarded on the access being a write, and
even better should be conditioned on actually triggering a write-protect
fault. This will be addressed in a future commit.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
If getting the gpa for a gva fails, e.g. because the gva isn't mapped in
the guest page tables, don't try to unprotect the invalid gfn. This is
mostly a performance fix (avoids unnecessarily taking mmu_lock), as
for_each_gfn_valid_sp_with_gptes() won't explode on garbage input, it's
simply pointless.
Link: https://lore.kernel.org/r/20240831001538.336683-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that retry_instruction() is reasonably tiny, fold it into its sole
caller, x86_emulate_instruction(). In addition to getting rid of the
absurdly confusing retry_instruction() name, handling the retry in
x86_emulate_instruction() pairs it back up with the code that resets
last_retry_{eip,address}.
No functional change intended.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the sanity checks for EMULTYPE_ALLOW_RETRY_PF to the top of
x86_emulate_instruction(). In addition to deduplicating a small amount
of code, this makes the connection between EMULTYPE_ALLOW_RETRY_PF and
EMULTYPE_PF even more explicit, and will allow dropping retry_instruction()
entirely.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Try to unprotect shadow pages if and only if indirect_shadow_pages is non-
zero, i.e. iff there is at least one protected such shadow page. Pre-
checking indirect_shadow_pages avoids taking mmu_lock for write when the
gfn is write-protected by a third party, i.e. not for KVM shadow paging,
and in the *extremely* unlikely case that a different task has already
unprotected the last shadow page.
Link: https://lore.kernel.org/r/20240831001538.336683-10-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the anti-infinite-loop protection provided by last_retry_{eip,addr}
into kvm_mmu_write_protect_fault() so that it guards unprotect+retry that
never hits the emulator, as well as reexecute_instruction(), which is the
last ditch "might as well try it" logic that kicks in when emulation fails
on an instruction that faulted on a write-protected gfn.
Add a new helper, kvm_mmu_unprotect_gfn_and_retry(), to set the retry
fields and deduplicate other code (with more to come).
Link: https://lore.kernel.org/r/20240831001538.336683-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Store the gpa used to unprotect the faulting gfn for retry as a gpa_t, not
an unsigned long. This fixes a bug where 32-bit KVM would unprotect and
retry the wrong gfn if the gpa had bits 63:32!=0. In practice, this bug
is functionally benign, as unprotecting the wrong gfn is purely a
performance issue (thanks to the anti-infinite-loop logic). And of course,
almost no one runs 32-bit KVM these days.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Read RIP from vCPU state instead of pulling it from the emulation context
when filling last_retry_eip, which is part of the anti-infinite-loop
protection used when unprotecting and retrying instructions that hit a
write-protected gfn.
This will allow reusing the anti-infinite-loop protection in flows that
never make it into the emulator.
No functional change intended, as ctxt->eip is set to kvm_rip_read() in
init_emulate_ctxt(), and EMULTYPE_PF emulation is mutually exclusive with
EMULTYPE_NO_DECODE and EMULTYPE_SKIP, i.e. always goes through
x86_decode_emulated_instruction() and hasn't advanced ctxt->eip (yet).
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Resume the guest and thus skip emulation of a non-PTE-writing instruction
if and only if unprotecting the gfn actually zapped at least one shadow
page. If the gfn is write-protected for some reason other than shadow
paging, attempting to unprotect the gfn will effectively fail, and thus
retrying the instruction is all but guaranteed to be pointless. This bug
has existed for a long time, but was effectively fudged around by the
retry RIP+address anti-loop detection.
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When doing "fast unprotection" of nested TDP page tables, skip emulation
if and only if at least one gfn was unprotected, i.e. continue with
emulation if simply resuming is likely to hit the same fault and risk
putting the vCPU into an infinite loop.
Note, it's entirely possible to get a false negative, e.g. if a different
vCPU faults on the same gfn and unprotects the gfn first, but that's a
relatively rare edge case, and emulating is still functionally ok, i.e.
saving a few cycles by avoiding emulation isn't worth the risk of putting
the vCPU into an infinite loop.
Opportunistically rewrite the relevant comment to document in gory detail
exactly what scenario the "fast unprotect" logic is handling.
Fixes: 147277540bbc ("kvm: svm: Add support for additional SVM NPF error codes")
Cc: Yuan Yao <yuan.yao@intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Trigger KVM's various "unprotect gfn" paths if and only if the page fault
was a write to a write-protected gfn. To do so, add a new page fault
return code, RET_PF_WRITE_PROTECTED, to explicitly and precisely track
such page faults.
If a page fault requires emulation for any MMIO (or any reason besides
write-protection), trying to unprotect the gfn is pointless and risks
putting the vCPU into an infinite loop. E.g. KVM will put the vCPU into
an infinite loop if the vCPU manages to trigger MMIO on a page table walk.
Fixes: 147277540bbc ("kvm: svm: Add support for additional SVM NPF error codes")
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Link: https://lore.kernel.org/r/20240831001538.336683-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop the globally visible PFERR_NESTED_GUEST_PAGE and replace it with a
more appropriately named is_write_to_guest_page_table(). The macro name
is misleading, because while all nNPT walks match PAGE|WRITE|PRESENT, the
reverse is not true.
No functional change intended.
Link: https://lore.kernel.org/r/20240831001538.336683-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rewrite the comment in FNAME(fetch) to explain why KVM needs to check that
the gPTE is still fresh before continuing the shadow page walk, even if
KVM already has a linked shadow page for the gPTE in question.
No functional change intended.
Link: https://lore.kernel.org/r/20240802203900.348808-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop the pointless and poorly named "out_gpte_changed" label, in
FNAME(fetch), and instead return RET_PF_RETRY directly.
No functional change intended.
Link: https://lore.kernel.org/r/20240802203900.348808-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Combine the back-to-back if-statements for synchronizing children when
linking a new indirect shadow page in order to decrease the indentation,
and to make it easier to "see" the logic in its entirety.
No functional change intended.
Link: https://lore.kernel.org/r/20240802203900.348808-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rework the function comment for kvm_arch_mmu_enable_log_dirty_pt_masked()
into the body of the function, as it has gotten a bit stale, is harder to
read without the code context, and is the last source of warnings for W=1
builds in KVM x86 due to using a kernel-doc comment without documenting
all parameters.
Opportunistically subsume the functions comments for
kvm_mmu_write_protect_pt_masked() and kvm_mmu_clear_dirty_pt_masked(), as
there is no value in regurgitating similar information at a higher level,
and capturing the differences between write-protection and PML-based dirty
logging is best done in a common location.
No functional change intended.
Cc: David Matlack <dmatlack@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Link: https://lore.kernel.org/r/20240802202006.340854-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver fixes for regressions from 6.11-rc1 due to the
driver core change making a structure in a driver core callback const.
These were missed by all testing EXCEPT for what Bart happened to be
running, so I appreciate the fixes provided here for some
odd/not-often-used driver subsystems that nothing else happened to
catch.
Both of these fixes have been in linux-next all week with no reported
issues"
* tag 'driver-core-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
mips: sgi-ip22: Fix the build
ARM: riscpc: ecard: Fix the build
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc fixes from Greg KH:
"Here are some small char/misc fixes for 6.11-rc4 to resolve reported
problems. Included in here are:
- fastrpc revert of a change that broke userspace
- xillybus fixes for reported issues
Half of these have been in linux-next this week with no reported
problems, I don't know if the last bit of xillybus driver changes made
it in, but they are 'obviously correct' so will be safe :)"
* tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
char: xillybus: Check USB endpoints when probing device
char: xillybus: Refine workqueue handling
Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD"
char: xillybus: Don't destroy workqueue from work item running on it
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.11-rc4 to
resolve some reported problems. Included in here are:
- conmakehash.c userspace build issues
- fsl_lpuart driver fix
- 8250_omap revert for reported regression
- atmel_serial rts flag fix
All of these have been in linux-next this week with no reported
issues"
* tag 'tty-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250_omap: Set the console genpd always on if no console suspend"
tty: atmel_serial: use the correct RTS flag.
tty: vt: conmakehash: remove non-portable code printing comment header
tty: serial: fsl_lpuart: mark last busy before uart_add_one_port
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes for 6.11-rc4 to
resolve some reported issues. Included in here are:
- thunderbolt driver fixes for reported problems
- typec driver fixes
- xhci fixes
- new device id for ljca usb driver
All of these have been in linux-next this week with no reported
issues"
* tag 'usb-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: Fix Panther point NULL pointer deref at full-speed re-enumeration
usb: misc: ljca: Add Lunar Lake ljca GPIO HID to ljca_gpio_hids[]
Revert "usb: typec: tcpm: clear pd_event queue in PORT_RESET"
usb: typec: ucsi: Fix the return value of ucsi_run_command()
usb: xhci: fix duplicate stall handling in handle_tx_event()
usb: xhci: Check for xhci->interrupters being allocated in xhci_mem_clearup()
thunderbolt: Mark XDomain as unplugged when router is removed
thunderbolt: Fix memory leaks in {port|retimer}_sb_regs_write()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull more btrfs fixes from David Sterba:
"A more fixes. We got reports that shrinker added in 6.10 still causes
latency spikes and the fixes don't handle all corner cases. Due to
summer holidays we're taking a shortcut to disable it for release
builds and will fix it in the near future.
- only enable extent map shrinker for DEBUG builds, temporary quick
fix to avoid latency spikes for regular builds
- update target inode's ctime on unlink, mandated by POSIX
- properly take lock to read/update block group's zoned variables
- add counted_by() annotations"
* tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: only enable extent map shrinker for DEBUG builds
btrfs: zoned: properly take lock to read/update block group's zoned variables
btrfs: tree-checker: add dev extent item checks
btrfs: update target inode's ctime on unlink
btrfs: send: annotate struct name_cache_entry with __counted_by()
|
|
fuse_notify_store(), unlike fuse_do_readpage(), does not enable page
zeroing (because it can be used to change partial page contents).
So fuse_notify_store() must be more careful to fully initialize page
contents (including parts of the page that are beyond end-of-file)
before marking the page uptodate.
The current code can leave beyond-EOF page contents uninitialized, which
makes these uninitialized page contents visible to userspace via mmap().
This is an information leak, but only affects systems which do not
enable init-on-alloc (via CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y or the
corresponding kernel command line parameter).
Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2574
Cc: stable@kernel.org
Fixes: a1d75f258230 ("fuse: add store request")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes. All except one are for MM. 10 of these are cc:stable and
the others pertain to post-6.10 issues.
As usual with these merges, singletons and doubletons all over the
place, no identifiable-by-me theme. Please see the lovingly curated
changelogs to get the skinny"
* tag 'mm-hotfixes-stable-2024-08-17-19-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/migrate: fix deadlock in migrate_pages_batch() on large folios
alloc_tag: mark pages reserved during CMA activation as not tagged
alloc_tag: introduce clear_page_tag_ref() helper function
crash: fix riscv64 crash memory reserve dead loop
selftests: memfd_secret: don't build memfd_secret test on unsupported arches
mm: fix endless reclaim on machines with unaccepted memory
selftests/mm: compaction_test: fix off by one in check_compaction()
mm/numa: no task_numa_fault() call if PMD is changed
mm/numa: no task_numa_fault() call if PTE is changed
mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu
mm: don't account memmap per-node
mm: add system wide stats items category
mm: don't account memmap on failure
mm/hugetlb: fix hugetlb vs. core-mm PT locking
mseal: fix is_madv_discard()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes on 85xx with some configs since the recent hugepd rework.
- Fix boot warning with hugepages and CONFIG_DEBUG_VIRTUAL on some
platforms.
- Don't enable offline cores when changing SMT modes, to match existing
userspace behaviour.
Thanks to Christophe Leroy, Dr. David Alan Gilbert, Guenter Roeck, Nysal
Jan K.A, Shrikanth Hegde, Thomas Gleixner, and Tyrel Datwyler.
* tag 'powerpc-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/topology: Check if a core is online
cpu/SMT: Enable SMT only if a core is online
powerpc/mm: Fix boot warning with hugepages and CONFIG_DEBUG_VIRTUAL
powerpc/mm: Fix size of allocated PGDIR
soc: fsl: qbman: remove unused struct 'cgr_comp'
|
|
Pull smb client fixes from Steve French:
- fix for clang warning - additional null check
- fix for cached write with posix locks
- flexible structure fix
* tag 'v6.11-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: smb2pdu.h: Use static_assert() to check struct sizes
smb3: fix lock breakage for cached writes
smb/client: avoid possible NULL dereference in cifs_free_subrequest()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C core fix replacing IS_ENABLED() with IS_REACHABLE()
For host drivers, there are two fixes:
- Tegra I2C Controller: Addresses a potential double-locking issue
during probe. ACPI devices are not IRQ-safe when invoking runtime
suspend and resume functions, so the irq_safe flag should not be
set.
- Qualcomm GENI I2C Controller: Fixes an oversight in the exit path
of the runtime_resume() function, which was missed in the previous
release"
* tag 'i2c-for-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: tegra: Do not mark ACPI devices as irq safe
i2c: Use IS_REACHABLE() for substituting empty ACPI functions
i2c: qcom-geni: Add missing geni_icc_disable in geni_i2c_runtime_resume
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes to the mpi3mr driver. One to avoid oversize
allocations in tracing and the other to fix an uninitialized spinlock
in the user to driver feature request code (used to trigger dumps and
the like)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpi3mr: Avoid MAX_PAGE_ORDER WARNING for buffer allocations
scsi: mpi3mr: Add missing spin_lock_init() for mrioc->trigger_lock
|
|
Pull xfs fixes from Chandan Babu:
- Check for presence of only 'attr' feature before scrubbing an inode's
attribute fork.
- Restore the behaviour of setting AIL thread to TASK_INTERRUPTIBLE for
long (i.e. 50ms) sleep durations to prevent high load averages.
- Do not allow users to change the realtime flag of a file unless the
datadev and rtdev both support fsdax access modes.
* tag 'xfs-6.11-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: conditionally allow FS_XFLAG_REALTIME changes if S_DAX is set
xfs: revert AIL TASK_KILLABLE threshold
xfs: attr forks require attr, not attr2
|
|
Pull bcachefs fixes from Kent OverstreetL
- New on disk format version, bcachefs_metadata_version_disk_accounting_inum
This adds one more disk accounting counter, which counts disk usage
and number of extents per inode number. This lets us track
fragmentation, for implementing defragmentation later, and it also
counts disk usage per inode in all snapshots, which will be a useful
thing to expose to users.
- One performance issue we've observed is threads spinning when they
should be waiting for dirty keys in the key cache to be flushed by
journal reclaim, so we now have hysteresis for the waiting thread, as
well as improving the tracepoint and a new time_stat, for tracking
time blocked waiting on key cache flushing.
... and various assorted smaller fixes.
* tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs:
bcachefs: Fix locking in __bch2_trans_mark_dev_sb()
bcachefs: fix incorrect i_state usage
bcachefs: avoid overflowing LRU_TIME_BITS for cached data lru
bcachefs: Fix forgetting to pass trans to fsck_err()
bcachefs: Increase size of cuckoo hash table on too many rehashes
bcachefs: bcachefs_metadata_version_disk_accounting_inum
bcachefs: Kill __bch2_accounting_mem_mod()
bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
bcachefs: Fix warning in __bch2_fsck_err() for trans not passed in
bcachefs: Add a time_stat for blocked on key cache flush
bcachefs: Improve trans_blocked_journal_reclaim tracepoint
bcachefs: Add hysteresis to waiting on btree key cache flush
lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
bcachefs: Convert for_each_btree_node() to lockrestart_do()
bcachefs: Add missing downgrade table entry
bcachefs: disk accounting: ignore unknown types
bcachefs: bch2_accounting_invalid() fixup
bcachefs: Fix bch2_trigger_alloc when upgrading from old versions
bcachefs: delete faulty fastpath in bch2_btree_path_traverse_cached()
|
|
We run this in full RW mode now, so we have to guard against the
superblock buffer being reallocated.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Pull memcg-v1 fix from Al Viro:
"memcg_write_event_control() oops fix"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
memcg_write_event_control(): fix a user-triggerable oops
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix the arm64 __get_mem_asm() to use the _ASM_EXTABLE_##type##ACCESS()
macro instead of the *_ERR() one in order to avoid writing -EFAULT to
the value register in case of a fault
- Initialise all elements of the acpi_early_node_map[] to NUMA_NO_NODE.
Prior to this fix, only the first element was initialised
- Move the KASAN random tag seed initialisation after the per-CPU areas
have been initialised (prng_state is __percpu)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix KASAN random tag seed initialization
arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
arm64: uaccess: correct thinko in __get_mem_asm()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One fix for the new T-Head TH1520 clk driver that marks a bus clk
critical so that it isn't turned off during late init which breaks
emmc-sdio"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: thead: fix dependency on clk_ignore_unused
|
|
Pull block fixes from Jens Axboe:
- Fix corruption issues with s390/dasd (Eric, Stefan)
- Fix a misuse of non irq locking grab of a lock (Li)
- MD pull request with a single data corruption fix for raid1 (Yu)
* tag 'block-6.11-20240824' of git://git.kernel.dk/linux:
block: Fix lockdep warning in blk_mq_mark_tag_wait
md/raid1: Fix data corruption for degraded array with slow disk
s390/dasd: fix error recovery leading to data corruption on ESE devices
s390/dasd: Remove DMA alignment
|
|
Pull io_uring fixes from Jens Axboe:
- Fix a comment in the uapi header using the wrong member name (Caleb)
- Fix KCSAN warning for a debug check in sqpoll (me)
- Two more NAPI tweaks (Olivier)
* tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux:
io_uring: fix user_data field name in comment
io_uring/sqpoll: annotate debug task == current with data_race()
io_uring/napi: remove duplicate io_napi_entry timeout assignation
io_uring/napi: check napi_enabled in io_napi_add() before proceeding
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix a possible (but unlikely) out-of-bounds read in interrupts
parsing code
- Add AT25 EEPROM "fujitsu,mb85rs256" compatible
- Update Konrad Dybcio's email
* tag 'devicetree-fixes-for-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of/irq: Prevent device address out-of-bounds read in interrupt map walk
dt-bindings: eeprom: at25: add fujitsu,mb85rs256 compatible
dt-bindings: Batch-update Konrad Dybcio's email
|