Age | Commit message (Collapse) | Author |
|
Ido Schimmel says:
====================
ipv6: Multipath routing fixes
This patchset contains two fixes for IPv6 multipath routing. See the
commit messages for more details.
====================
Link: https://patch.msgid.link/20250402114224.293392-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Nexthops whose link is down are not supposed to be considered during
path selection when the "ignore_routes_with_linkdown" sysctl is set.
This is done by assigning them a negative region boundary.
However, when comparing the computed hash (unsigned) with the region
boundary (signed), the negative region boundary is treated as unsigned,
resulting in incorrect nexthop selection.
Fix by treating the computed hash as signed. Note that the computed hash
is always in range of [0, 2^31 - 1].
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250402114224.293392-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cited commit transitioned IPv6 path selection to use hash-threshold
instead of modulo-N. With hash-threshold, each nexthop is assigned a
region boundary in the multipath hash function's output space and a
nexthop is chosen if the calculated hash is smaller than the nexthop's
region boundary.
Hash-threshold does not work correctly if path selection does not start
with the first nexthop. For example, if fib6_select_path() is always
passed the last nexthop in the group, then it will always be chosen
because its region boundary covers the entire hash function's output
space.
Fix this by starting the selection process from the first nexthop and do
not consider nexthops for which rt6_score_route() provided a negative
score.
Fixes: 3d709f69a3e7 ("ipv6: Use hash-threshold instead of modulo-N")
Reported-by: Stanislav Fomichev <stfomichev@gmail.com>
Closes: https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250402114224.293392-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Missing usbnet_going_away Check in Critical Path.
The usb_submit_urb function lacks a usbnet_going_away
validation, whereas __usbnet_queue_skb includes this check.
This inconsistency creates a race condition where:
A URB request may succeed, but the corresponding SKB data
fails to be queued.
Subsequent processes:
(e.g., rx_complete → defer_bh → __skb_unlink(skb, list))
attempt to access skb->next, triggering a NULL pointer
dereference (Kernel Panic).
Fixes: 04e906839a05 ("usbnet: fix cyclical race on disconnect with work queue")
Cc: stable@vger.kernel.org
Signed-off-by: Ying Lu <luying1@xiaomi.com>
Link: https://patch.msgid.link/4c9ef2efaa07eb7f9a5042b74348a67e5a3a7aea.1743584159.git.luying1@xiaomi.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the current implementation octeontx2 manages XDP_ABORTED and XDP
invalid as XDP_PASS forwarding the skb to the networking stack.
Align the behaviour to other XDP drivers handling XDP_ABORTED and XDP
invalid as XDP_DROP.
Please note this patch has just compile tested.
Fixes: 06059a1a9a4a5 ("octeontx2-pf: Add XDP support to netdev PF")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250401-octeontx2-xdp-abort-fix-v1-1-f0587c35a0b9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a performance regression on AMD iGPU and dGPU drivers, related to
the unintended activation of DMA bounce buffers that regressed game
performance if KASLR disturbed things just enough
- Fix a copy_user_generic() performance regression on certain older
non-FSRM/ERMS CPUs
- Fix a Clang build warning due to a semantic merge conflict the Kunit
tree generated with the x86 tree
- Fix FRED related system hang during S4 resume
- Remove an unused API
* tag 'x86-urgent-2025-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fred: Fix system hang during S4 resume with FRED enabled
x86/platform/iosf_mbi: Remove unused iosf_mbi_unregister_pmic_bus_access_notifier()
x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers
x86/tools: Drop duplicate unlikely() definition in insn_decoder_test.c
x86/uaccess: Improve performance by aligning writes to 8 bytes in copy_user_generic(), on non-FSRM/ERMS CPUs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of device-specific fixes that have been gathered since
the previous pull:
- A few more HD-audio quirks and fixups
- A series of Qualcomm AudioReach fixes
- Various small fixes for ASoC rt5665, WSA, SOF and Cirrus"
* tag 'sound-fix-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Fix built-in mic on another ASUS VivoBook model
ALSA: hda/realtek - Support mute led function for HP platform
ASoC: imx-card: Add NULL check in imx_card_probe()
ASoC: codecs: rt5665: Fix some error handling paths in rt5665_probe()
ASoC: q6apm-dai: make use of q6apm_get_hw_pointer
ASoC: qdsp6: q6apm-dai: fix capture pipeline overruns.
ASoC: qdsp6: q6apm-dai: set 10 ms period and buffer alignment.
ASoC: q6apm: add q6apm_get_hw_pointer helper
ASoC: q6apm-dai: schedule all available frames to avoid dsp under-runs
ASoC: SOF: hda/ptl: Move mic privacy change notification sending to a work
ALSA/hda: intel-sdw-acpi: Remove (explicitly) unused header
ALSA: hda/realtek: Enable Mute LED on HP OMEN 16 Laptop xd000xx
ALSA: hda/tas2781: Upgrade calibratd-data writing code to support Alpha and Beta dsp firmware
ASoC: qdsp6: q6asm-dai: fix q6asm_dai_compr_set_params error path
ALSA: hda/realtek: Fix built-in mic breakage on ASUS VivoBook X515JA
ASoC: sma1307: Fix error handling in sma1307_setting_loaded()
ASoC: codecs: wsa884x: Correct VI sense channel mask
ASoC: codecs: wsa883x: Correct VI sense channel mask
firmware: cs_dsp: Ensure cs_dsp_load[_coeff]() returns 0 on success
|
|
The rz-du driver uses GEM DMA helpers, but does not implement the
drm_driver .gem_prime_import_sg_table operation. This prevents
importing dmabufs. Fix it by implementing the missing operation using
the DRM_GEM_DMA_DRIVER_OPS_WITH_DUMB_CREATE() helper macro.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> # RZ/V2H + DSI
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250321104615.31809-1-laurent.pinchart+renesas@ideasonboard.com
|
|
Add Kconfig dependency between RZG2L_DU and RZG2L_MIPI_DSI, so that
DSI module has functional dependency on DU. It is similar way that
the R-Car MIPI DSI encoder is handled.
While at it drop ARCH_RENESAS dependency as DRM_RZG2L_DU depend on
ARCH_RZG2L.
Suggested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240827163727.108405-1-biju.das.jz@bp.renesas.com
|
|
We switched to use refcount_t for vmaps and missed to change the vunmap
code to properly unset the vmap pointer, which is now cleared while vmap's
refcount > 0. Clear the cached vmap pointer only when refcounting drops to
zero to fix the bug.
Fixes: e1fc39a92332 ("drm/shmem-helper: Use refcount_t for vmap_use_count")
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Closes: https://lore.kernel.org/dri-devel/20250403105053.788b0f6e@collabora.com/T/#m3dca6d81bedc8d6146a56b82694624fbc6fa4c96
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250403142633.484660-1-dmitry.osipenko@collabora.com
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap into soc/drivers-2
arm/omap: drivers: updates for v6.14
* tag 'omap-for-v6.14/drivers-signed' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap:
Input: tsc2007 - accept standard properties
|
|
soc/drivers-2
FSL SOC Changes for 6.15:
- irqdomain cleanups from Jiry
- Add Ioana as Maintainer of fsl-mc bus and remove Laurentiu and Stuart
- Remove deadcode from fsl-mc bus
* tag 'soc_fsl-6.15-1' of https://github.com/chleroy/linux:
bus: fsl-mc: Remove deadcode
MAINTAINERS: add the linuppc-dev list to the fsl-mc bus entry
MAINTAINERS: fix nonexistent dtbinding file name
MAINTAINERS: add myself as maintainer for the fsl-mc bus
irqdomain: soc: Switch to irq_find_mapping()
|
|
|
|
|
|
|
|
Use a separate subclass when acquiring KVM's per-CPU posted interrupts
wakeup lock in the scheduled out path, i.e. when adding a vCPU on the list
of vCPUs to wake, to workaround a false positive deadlock.
Chain exists of:
&p->pi_lock --> &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu)
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
lock(&rq->__lock);
lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
lock(&p->pi_lock);
*** DEADLOCK ***
In the wakeup handler, the callchain is *always*:
sysvec_kvm_posted_intr_wakeup_ipi()
|
--> pi_wakeup_handler()
|
--> kvm_vcpu_wake_up()
|
--> try_to_wake_up(),
and the lock order is:
&per_cpu(wakeup_vcpus_on_cpu_lock, cpu) --> &p->pi_lock.
For the schedule out path, the callchain is always (for all intents and
purposes; if the kernel is preemptible, kvm_sched_out() can be called from
something other than schedule(), but the beginning of the callchain will
be the same point in vcpu_block()):
vcpu_block()
|
--> schedule()
|
--> kvm_sched_out()
|
--> vmx_vcpu_put()
|
--> vmx_vcpu_pi_put()
|
--> pi_enable_wakeup_handler()
and the lock order is:
&rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu)
I.e. lockdep sees AB+BC ordering for schedule out, and CA ordering for
wakeup, and complains about the A=>C versus C=>A inversion. In practice,
deadlock can't occur between schedule out and the wakeup handler as they
are mutually exclusive. The entirely of the schedule out code that runs
with the problematic scheduler locks held, does so with IRQs disabled,
i.e. can't run concurrently with the wakeup handler.
Use a subclass instead disabling lockdep entirely, and tell lockdep that
both subclasses are being acquired when loading a vCPU, as the sched_out
and sched_in paths are NOT mutually exclusive, e.g.
CPU 0 CPU 1
--------------- ---------------
vCPU0 sched_out
vCPU1 sched_in
vCPU1 sched_out vCPU 0 sched_in
where vCPU0's sched_in may race with vCPU1's sched_out, on CPU 0's wakeup
list+lock.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-ID: <20250401154727.835231-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Assert that IRQs are already disabled when putting a vCPU on a CPU's PI
wakeup list, as opposed to saving/disabling+restoring IRQs. KVM relies on
IRQs being disabled until the vCPU task is fully scheduled out, i.e. until
the scheduler has dropped all of its per-CPU locks (e.g. for the runqueue),
as attempting to wake the task while it's being scheduled out could lead
to deadlock.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20250401154727.835231-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Explicitly zero/empty-initialize the unions used for PMU related CPUID
entries, instead of manually zeroing all fields (hopefully), or in the
case of 0x80000022, relying on the compiler to clobber the uninitialized
bitfields.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-ID: <20250315024102.2361628-1-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Convert HAVE_KVM_IRQ_BYPASS into a tristate so that selecting
IRQ_BYPASS_MANAGER follows KVM={m,y}, i.e. doesn't force irqbypass.ko to
be built-in.
Note, PPC allows building KVM as a module, but selects HAVE_KVM_IRQ_BYPASS
from a boolean Kconfig, i.e. KVM PPC unnecessarily forces irqbpass.ko to
be built-in. But that flaw is a longstanding PPC specific issue.
Fixes: 61df71ee992d ("kvm: move "select IRQ_BYPASS_MANAGER" to common code")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250315024623.2363994-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Wrap the TDP MMU page counter in CONFIG_KVM_PROVE_MMU so that the sanity
check is omitted from production builds, and more importantly to remove
the atomic accesses to account pages. A one-off memory leak in production
is relatively uninteresting, and a WARN_ON won't help mitigate a systemic
issue; it's as much about helping triage memory leaks as it is about
detecting them in the first place, and doesn't magically stop the leaks.
I.e. production environments will be quite sad if a severe KVM bug escapes,
regardless of whether or not KVM WARNs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250315023448.2358456-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a "-l <latency>" param to the rseq test so that the user can override
/dev/cpu_dma_latency, as described by the test's suggested workaround for
not being able to complete enough migrations.
cpu_dma_latency is not a normal file, even as far as procfs files go.
Writes to cpu_dma_latency only persist so long as the file is open, e.g.
so that the kernel automatically reverts back to a power-optimized state
once the sensitive workload completes. Provide the necessary functionality
instead of effectively forcing the user to write a non-obvious wrapper.
Cc: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250401142238.819487-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Acquire a lock on kvm->srcu when userspace is getting MP state to handle a
rather extreme edge case where "accepting" APIC events, i.e. processing
pending INIT or SIPI, can trigger accesses to guest memory. If the vCPU
is in L2 with INIT *and* a TRIPLE_FAULT request pending, then getting MP
state will trigger a nested VM-Exit by way of ->check_nested_events(), and
emuating the nested VM-Exit can access guest memory.
The splat was originally hit by syzkaller on a Google-internal kernel, and
reproduced on an upstream kernel by hacking the triple_fault_event_test
selftest to stuff a pending INIT, store an MSR on VM-Exit (to generate a
memory access on VMX), and do vcpu_mp_state_get() to trigger the scenario.
=============================
WARNING: suspicious RCU usage
6.14.0-rc3-b112d356288b-vmx/pi_lockdep_false_pos-lock #3 Not tainted
-----------------------------
include/linux/kvm_host.h:1058 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by triple_fault_ev/1256:
#0: ffff88810df5a330 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x8b/0x9a0 [kvm]
stack backtrace:
CPU: 11 UID: 1000 PID: 1256 Comm: triple_fault_ev Not tainted 6.14.0-rc3-b112d356288b-vmx #3
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
<TASK>
dump_stack_lvl+0x7f/0x90
lockdep_rcu_suspicious+0x144/0x190
kvm_vcpu_gfn_to_memslot+0x156/0x180 [kvm]
kvm_vcpu_read_guest+0x3e/0x90 [kvm]
read_and_check_msr_entry+0x2e/0x180 [kvm_intel]
__nested_vmx_vmexit+0x550/0xde0 [kvm_intel]
kvm_check_nested_events+0x1b/0x30 [kvm]
kvm_apic_accept_events+0x33/0x100 [kvm]
kvm_arch_vcpu_ioctl_get_mpstate+0x30/0x1d0 [kvm]
kvm_vcpu_ioctl+0x33e/0x9a0 [kvm]
__x64_sys_ioctl+0x8b/0xb0
do_syscall_64+0x6c/0x170
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250401150504.829812-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently we can run into issues with provisioning VRAM region, due to
requiring contig VRAM BO underneath. We sometimes see that allocation
(multiple GB) can fail even when there is enough free space. We don't
need CPU access to the buffer in the first place, so can forgo pin_map
and therefore also the contig requirement. Keep the same behavior with
save and restore during suspend/resume (which can now be done with
blitter). We also need the VRAM to occupy the same pages so we don't
need to re-program the LMTT, so should still remain pinned (also we
don't want something to try evict it). With that covert over to plain
pinned kernel object.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-16-matthew.auld@intel.com
|
|
If the kernel bo doesn't care about vmap(), either directly or
indirectly with save/restore then we don't need to force contig for such
buffers.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-15-matthew.auld@intel.com
|
|
Some users apply PINNED and some don't when using pin_map(). The pin in
pin_map() should imply PINNED so just unconditionally apply it and clean
up all users.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-14-matthew.auld@intel.com
|
|
With the idea of having more pinned objects using the blitter engine
where possible, during suspend/resume, mark the pinned objects which
can be done during the late phase once submission/migration has been
setup. Start out simple with lrc and page-tables from userspace.
v2:
- s/early_restore/late_restore; early restore was way too bold with too
many places being impacted at once.
v3:
- Split late vs early into separate lists, to align with newly added
apply-to-pinned infra.
v4:
- Rebase.
v5:
- Make sure we restore the late phase kernel_bo_present in igpu.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-13-matthew.auld@intel.com
|
|
For kernel BOs we don't clear the CCS state on creation, therefore we
should be careful to ignore it when copying pages. In a future patch we
opt for using the copy path here for kernel BOs, so this now needs to be
considered.
v2:
- Drop bogus asserts (CI)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-12-matthew.auld@intel.com
|
|
Not all BOs need to be restored on resume / d3cold exit, add
XE_BO_FLAG_PINNED_NO_RESTORE which skips restoring of BOs rather just
allocates VRAM for the BO. This should slightly speedup resume / d3cold
exit flows.
Marking GuC ADS, GuC CT, GuC log, GuC PC, and SA as NORESTORE.
v2:
- s/WONTNEED/NORESTORE (Vivi)
- Rebase on newly added g2g and backup object flow
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-11-matthew.auld@intel.com
|
|
Currently we move pinned objects, relying on the fact that the lpfn/fpfn
will force the placement to occupy the same pages when restoring.
However this then limits all such pinned objects to be contig
underneath. In addition it is likely a little fragile moving pinned
objects in the first place. Rather than moving such objects rather copy
the page contents to a secondary system memory object, that way the VRAM
pages never move and remain pinned. This also opens the door for
eventually having non-contig pinned objects that can also be
saved/restored using blitter.
v2:
- Make sure to drop the fence ref.
- Handle NULL bo->migrate.
v3:
- Ensure we add the copy fence to the BOs, otherwise backup_obj can
be freed before pipelined copy finishes.
v4:
- Rebase on newly added apply-to-pinned infra.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1182
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250403102440.266113-10-matthew.auld@intel.com
|
|
Trap and emulate virtualization is not available anymore for MIPS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Categorize the capabilities correctly. Section 6 is for enabled vCPU
capabilities; section 7 is for enabled VM capabilities; section 8 is
for informational ones.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Ensure that they have a ":" in front of the defined item.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
It is redundant, and sometimes wrong.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The capability is incorrectly called KVM_CAP_PPC_MULTITCE in the documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
TSC_DEADLINE is now advertised unconditionally by KVM_GET_SUPPORTED_CPUID,
since commit 9be4ec35d668 ("KVM: x86: Advertise TSC_DEADLINE_TIMER in
KVM_GET_SUPPORTED_CPUID", 2024-12-18). Adjust the documentation to
reflect the new behavior.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250331150550.510320-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Several tests cover infrastructure from virt/kvm/ and userspace APIs that have
only minimal requirements from architecture-specific code. As such, they are
available on all architectures that have libkvm support, and this presumably
will apply also in the future (for example if loongarch gets selftests support).
Put them in a separate variable and list them only once.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20250401141327.785520-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20250331221851.614582-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Run each testcase in a separate VMs to cover more possibilities;
move WRMSR close to MONITOR/MWAIT to test updating CPUID bits
while in the VM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Considering that the driver doesn't enable the used clocks (and also
that clk_get_rate() returns 0 if CONFIG_HAVE_CLK is unset) better check
the return value of clk_get_rate() for being non-zero before dividing by
it.
Fixes: 3479bbd1e1f8 ("pwm: fsl-ftm: More relaxed permissions for updating period")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/b68351a51017035651bc62ad3146afcb706874f0.1743501688.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
There were several issues in the function rcar_pwm_set_counter():
- The u64 values period_ns and duty_ns were cast to int on function
call which might loose bits on 32 bit architectures.
Fix: Make parameters to rcar_pwm_set_counter() u64
- The algorithm divided by the result of a division which looses
precision.
Fix: Make use of mul_u64_u64_div_u64()
- The calculated values were just masked to fit the respective register
fields which again might loose bits.
Fix: Explicitly check for overlow
Implement the respective fixes.
A side effect of fixing the 2nd issue is that there is no division by 0
if clk_get_rate() returns 0.
Fixes: ed6c1476bf7f ("pwm: Add support for R-Car PWM Timer")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/ab3dac794b2216cc1cc56d65c93dd164f8bd461b.1743501688.git.u.kleine-koenig@baylibre.com
[ukleinek: Added an explicit #include <linux/bitfield.h> to please the
0day build bot]
Link: https://lore.kernel.org/oe-kbuild-all/202504031354.VJtxScP5-lkp@intel.com/
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
Add Wa_16025250150 for the Xe2_HPG (graphics version: 20.01) platforms.
It is a permanent workaround, and applicable on all the steppings.
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250325134421.1489416-1-aradhya.bhatia@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
|
|
Pull dcache fixes from Al Viro:
"Fixes for bugs caught as part of tree-in-dcache work.
Mostly dentry refcount mishandling"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
hypfs_create_cpu_files(): add missing check for hypfs_mkdir() failure
qibfs: fix _another_ leak
spufs: fix a leak in spufs_create_context()
spufs: fix gang directory lifetimes
spufs: fix a leak on spufs_new_file() failure
|
|
Commit 57ed58c13256 ("selftests: ublk: enable zero copy for stripe target")
added test entry of test_stripe_04, but forgot to add the test script.
So fix the test by adding the script file.
Reported-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Uday Shankar <ushankar@purestorage.com>
Link: https://lore.kernel.org/r/20250404001849.1443064-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
1) conncount incorrectly removes element for non-dynamic sets,
these elements represent a static control plane configuration,
leave them in place.
2) syzbot found a way to unregister a basechain that has been never
registered from the chain update path, fix from Florian Westphal.
3) Fix incorrect pointer arithmetics in geneve support for tunnel,
from Lin Ma.
* tag 'nf-25-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_tunnel: fix geneve_opt type confusion addition
netfilter: nf_tables: don't unregister hook when table is dormant
netfilter: nft_set_hash: GC reaps elements with conncount for dynamic sets only
====================
Link: https://patch.msgid.link/20250403115752.19608-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull smb server fixes from Steve French:
"Four ksmbd SMB3 server fixes, all also for stable"
* tag 'v6.15rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix null pointer dereference in alloc_preauth_hash()
ksmbd: validate zero num_subauth before sub_auth is accessed
ksmbd: fix overflow in dacloffset bounds check
ksmbd: fix session use-after-free in multichannel connection
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer updates from Steven Rostedt:
"Persistent buffer cleanups and simplifications.
It was mistaken that the physical memory returned from "reserve_mem"
had to be vmap()'d to get to it from a virtual address. But
reserve_mem already maps the memory to the virtual address of the
kernel so a simple phys_to_virt() can be used to get to the virtual
address from the physical memory returned by "reserve_mem". With this
new found knowledge, the code can be cleaned up and simplified.
- Enforce that the persistent memory is page aligned
As the buffers using the persistent memory are all going to be
mapped via pages, make sure that the memory given to the tracing
infrastructure is page aligned. If it is not, it will print a
warning and fail to map the buffer.
- Use phys_to_virt() to get the virtual address from reserve_mem
Instead of calling vmap() on the physical memory returned from
"reserve_mem", use phys_to_virt() instead.
As the memory returned by "memmap" or any other means where a
physical address is given to the tracing infrastructure, it still
needs to be vmap(). Since this memory can never be returned back to
the buddy allocator nor should it ever be memmory mapped to user
space, flag this buffer and up the ref count. The ref count will
keep it from ever being freed, and the flag will prevent it from
ever being memory mapped to user space.
- Use vmap_page_range() for memmap virtual address mapping
For the memmap buffer, instead of allocating an array of struct
pages, assigning them to the contiguous phsycial memory and then
passing that to vmap(), use vmap_page_range() instead
- Replace flush_dcache_folio() with flush_kernel_vmap_range()
Instead of calling virt_to_folio() and passing that to
flush_dcache_folio(), just call flush_kernel_vmap_range() directly.
This also fixes a bug where if a subbuffer was bigger than
PAGE_SIZE only the PAGE_SIZE portion would be flushed"
* tag 'trace-ringbuffer-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Use flush_kernel_vmap_range() over flush_dcache_folio()
tracing: Use vmap_page_range() to map memmap ring buffer
tracing: Have reserve_mem use phys_to_virt() and separate from memmap buffer
tracing: Enforce the persistent ring buffer to be page aligned
|
|
Pull more block updates from Jens Axboe:
- NVMe pull request via Keith:
- PCI endpoint target cleanup (Damien)
- Early import for uring_cmd fixed buffer (Caleb)
- Multipath documentation and notification improvements (John)
- Invalid pci sq doorbell write fix (Maurizio)
- Queue init locking fix
- Remove dead nsegs parameter from blk_mq_get_new_requests()
* tag 'block-6.15-20250403' of git://git.kernel.dk/linux:
block: don't grab elevator lock during queue initialization
nvme-pci: skip nvme_write_sq_db on empty rqlist
nvme-multipath: change the NVME_MULTIPATH config option
nvme: update the multipath warning in nvme_init_ns_head
nvme/ioctl: move fixed buffer lookup to nvme_uring_cmd_io()
nvme/ioctl: move blk_mq_free_request() out of nvme_map_user_request()
nvme/ioctl: don't warn on vectorized uring_cmd with fixed buffer
nvmet: pci-epf: Keep completion queues mapped
block: remove unused nseg parameter
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-04-02 (igc, e1000e, ixgbe, idpf)
For igc:
Joe Damato removes unmapping of XSK queues from NAPI instance.
Zdenek Bouska swaps condition checks/call to prevent AF_XDP Tx drops
with low budget value.
For e1000e:
Vitaly adjusts Kumeran interface configuration to prevent MDI errors.
For ixgbe:
Piotr clears PHY high values on media type detection to ensure stale
values are not used.
For idpf:
Emil adjusts shutdown calls to prevent NULL pointer dereference.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: fix adapter NULL pointer dereference on reboot
ixgbe: fix media type detection for E610 device
e1000e: change k1 configuration on MTP and later platforms
igc: Fix TX drops in XDP ZC
igc: Fix XSK queue NAPI ID mapping
====================
Link: https://patch.msgid.link/20250402173900.1957261-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull more io_uring updates from Jens Axboe:
"Set of fixes/updates for io_uring that should go into this release.
The ublk bits could've gone via either tree - usually I put them in
block, but they got a bit mixed this series with the zero-copy
supported that ended up dipping into both trees.
This contains:
- Fix for sendmsg zc, include in pinned pages accounting like we do
for the other zc types
- Series for ublk fixing request aborting, doing various little
cleanups, fixing some zc issues, and adding queue_rqs support
- Another ublk series doing some code cleanups
- Series cleaning up the io_uring send path, mostly in preparation
for registered buffers
- Series doing little MSG_RING cleanups
- Fix for the newly added zc rx, fixing len being 0 for the last
invocation of the callback
- Add vectored registered buffer support for ublk. With that, then
ublk also supports this feature in the kernel revision where it
could generically introduced for rw/net
- A bunch of selftest additions for ublk. This is the majority of the
diffstat
- Silence a KCSAN data race warning for io-wq
- Various little cleanups and fixes"
* tag 'io_uring-6.15-20250403' of git://git.kernel.dk/linux: (44 commits)
io_uring: always do atomic put from iowq
selftests: ublk: enable zero copy for stripe target
io_uring: support vectored kernel fixed buffer
block: add for_each_mp_bvec()
io_uring: add validate_fixed_range() for validate fixed buffer
selftests: ublk: kublk: fix an error log line
selftests: ublk: kublk: use ioctl-encoded opcodes
io_uring/zcrx: return early from io_zcrx_recv_skb if readlen is 0
io_uring/net: avoid import_ubuf for regvec send
io_uring/rsrc: check size when importing reg buffer
io_uring: cleanup {g,s]etsockopt sqe reading
io_uring: hide caches sqes from drivers
io_uring: make zcrx depend on CONFIG_IO_URING
io_uring: add req flag invariant build assertion
Documentation: ublk: remove dead footnote
selftests: ublk: specify io_cmd_buf pointer type
ublk: specify io_cmd_buf pointer type
io_uring: don't pass ctx to tw add remote helper
io_uring/msg: initialise msg request opcode
io_uring/msg: rename io_double_lock_ctx()
...
|
|
struct geneve_opt uses 5 bit length for each single option, which
means every vary size option should be smaller than 128 bytes.
However, all current related Netlink policies cannot promise this
length condition and the attacker can exploit a exact 128-byte size
option to *fake* a zero length option and confuse the parsing logic,
further achieve heap out-of-bounds read.
One example crash log is like below:
[ 3.905425] ==================================================================
[ 3.905925] BUG: KASAN: slab-out-of-bounds in nla_put+0xa9/0xe0
[ 3.906255] Read of size 124 at addr ffff888005f291cc by task poc/177
[ 3.906646]
[ 3.906775] CPU: 0 PID: 177 Comm: poc-oob-read Not tainted 6.1.132 #1
[ 3.907131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 3.907784] Call Trace:
[ 3.907925] <TASK>
[ 3.908048] dump_stack_lvl+0x44/0x5c
[ 3.908258] print_report+0x184/0x4be
[ 3.909151] kasan_report+0xc5/0x100
[ 3.909539] kasan_check_range+0xf3/0x1a0
[ 3.909794] memcpy+0x1f/0x60
[ 3.909968] nla_put+0xa9/0xe0
[ 3.910147] tunnel_key_dump+0x945/0xba0
[ 3.911536] tcf_action_dump_1+0x1c1/0x340
[ 3.912436] tcf_action_dump+0x101/0x180
[ 3.912689] tcf_exts_dump+0x164/0x1e0
[ 3.912905] fw_dump+0x18b/0x2d0
[ 3.913483] tcf_fill_node+0x2ee/0x460
[ 3.914778] tfilter_notify+0xf4/0x180
[ 3.915208] tc_new_tfilter+0xd51/0x10d0
[ 3.918615] rtnetlink_rcv_msg+0x4a2/0x560
[ 3.919118] netlink_rcv_skb+0xcd/0x200
[ 3.919787] netlink_unicast+0x395/0x530
[ 3.921032] netlink_sendmsg+0x3d0/0x6d0
[ 3.921987] __sock_sendmsg+0x99/0xa0
[ 3.922220] __sys_sendto+0x1b7/0x240
[ 3.922682] __x64_sys_sendto+0x72/0x90
[ 3.922906] do_syscall_64+0x5e/0x90
[ 3.923814] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 3.924122] RIP: 0033:0x7e83eab84407
[ 3.924331] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf
[ 3.925330] RSP: 002b:00007ffff505e370 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 3.925752] RAX: ffffffffffffffda RBX: 00007e83eaafa740 RCX: 00007e83eab84407
[ 3.926173] RDX: 00000000000001a8 RSI: 00007ffff505e3c0 RDI: 0000000000000003
[ 3.926587] RBP: 00007ffff505f460 R08: 00007e83eace1000 R09: 000000000000000c
[ 3.926977] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffff505f3c0
[ 3.927367] R13: 00007ffff505f5c8 R14: 00007e83ead1b000 R15: 00005d4fbbe6dcb8
Fix these issues by enforing correct length condition in related
policies.
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts")
Fixes: 4ece47787077 ("lwtunnel: add options setting and dumping for geneve")
Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key")
Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250402165632.6958-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|