Age | Commit message (Collapse) | Author |
|
addrconf
Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf.
Commit under Fixes added a new flag change that was not made
to hv_netvsc resulting in the VF being assinged an IPv6.
Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf")
Suggested-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A new warning in clang [1] points out a place in pep_sock_accept() where
dst is uninitialized then passed as a const pointer to pep_find_pipe():
net/phonet/pep.c:829:37: error: variable 'dst' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
829 | newsk = pep_find_pipe(&pn->hlist, &dst, pipe_handle);
| ^~~:
Move the call to pn_skb_get_dst_sockaddr(), which initializes dst, to
before the call to pep_find_pipe(), so that dst is consistently used
initialized throughout the function.
Cc: stable@vger.kernel.org
Fixes: f7ae8d59f661 ("Phonet: allocate sock from accept syscall rather than soft IRQ")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e [1]
Closes: https://github.com/ClangBuiltLinux/linux/issues/2101
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20250715-net-phonet-fix-uninit-const-pointer-v1-1-8efd1bd188b3@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Configuration request only configure the incoming direction of the peer
initiating the request, so using the MTU is the other direction shall
not be used, that said the spec allows the peer responding to adjust:
Bluetooth Core 6.1, Vol 3, Part A, Section 4.5
'Each configuration parameter value (if any is present) in an
L2CAP_CONFIGURATION_RSP packet reflects an ‘adjustment’ to a
configuration parameter value that has been sent (or, in case of
default values, implied) in the corresponding
L2CAP_CONFIGURATION_REQ packet.'
That said adjusting the MTU in the response shall be limited to ERTM
channels only as for older modes the remote stack may not be able to
detect the adjustment causing it to silently drop packets.
Link: https://github.com/bluez/bluez/issues/1422
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/149
Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4793
Fixes: 042bb9603c44 ("Bluetooth: L2CAP: Fix L2CAP MTU negotiation")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Remove TDVMCALLINFO_GET_QUOTE from user_tdvmcallinfo_1_r11 reported to
userspace to align with the direction of the GHCI spec.
Recently, concern was raised about a gap in the GHCI spec that left
ambiguity in how to expose to the guest that only a subset of GHCI
TDVMCalls were supported. During the back and forth on the spec details[0],
<GetQuote> was moved from an individually enumerable TDVMCall, to one that
is part of the 'base spec', meaning it doesn't have a specific bit in the
<GetTDVMCallInfo> return values. Although the spec[1] is still in draft
form, the GetQoute part has been agreed by the major TDX VMMs.
Unfortunately the commits that were upstreamed still treat <GetQuote> as
individually enumerable. They set bit 0 in the user_tdvmcallinfo_1_r11
which is reported to userspace to tell supported optional TDVMCalls,
intending to say that <GetQuote> is supported.
So stop reporting <GetQute> in user_tdvmcallinfo_1_r11 to align with
the direction of the spec, and allow some future TDVMCall to use that bit.
[0] https://lore.kernel.org/all/aEmuKII8FGU4eQZz@google.com/
[1] https://cdrdv2.intel.com/v1/dl/getContent/858626
Fixes: 28224ef02b56 ("KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20250717022010.677645-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.16
A relatively large set of changes, but most of them are quirk
information for various x86 systems. There is one more substantial fix
for a NULL pointer dereference when removing the AVS driver, plus one
for Kconfig dependencies.
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Couple of fixes:
- ath12k performance regression from -rc1
- cfg80211 counted_by() removal for scan request
as it doesn't match usage and keeps complaining
- iwlwifi crash with certain older devices
- iwlwifi missing an error path unlock
- iwlwifi compatibility with certain BIOS updates
* tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: Fix botched indexing conversion
wifi: cfg80211: remove scan request n_channels counted_by
wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled
wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap
wifi: iwlwifi: pcie: fix locking on invalid TOP reset
====================
Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If we reload the GuC due to suspend/resume or GT reset then we
have to resend not only any VFs provisioning data, but also PF
configuration, like scheduling parameters (EQ, PT), as otherwise
GuC will continue to use default values.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-3-michal.wajdeczko@intel.com
(cherry picked from commit 1c38dd6afa4a8ecce28e94da794fd1d205c30f51)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.
However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:
$ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset
this PF worker might be still busy, which lead to errors due to
just stopped or disabled GuC CTB communication:
[ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed
[ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe]
[ ] xe 0000:00:02.0: [drm] GT0: reset queued
[ ] xe 0000:00:02.0: [drm] GT0: reset started
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped
[ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled!
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED)
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations
[ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed
While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com
(cherry picked from commit 9f50b729dd61dfb9f4d7c66900d22a7c7353a8c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The check would fail if the address is unaligned, but not when
accounting the offset. Instead of `buf | offset` it should have
been `buf + offset`. To make it more readable and also drop the
uintptr_t, just use the IS_ALIGNED() macro.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 81e139db6900503a2e68009764054fad128fbf95)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
We need the topology to determine GT page fault queue size, move page
fault init after topology init.
Cc: stable@vger.kernel.org
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
(cherry picked from commit beb72acb5b38dbe670d8eb752d1ad7a32f9c4119)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
MOCS uc_index is used even before it is initialized in the following
callstack
guc_prepare_xfer()
__xe_guc_upload()
xe_guc_min_load_for_hwconfig()
xe_uc_init_hwconfig()
xe_gt_init_hwconfig()
Do MOCS index initialization earlier in the device probe.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 241cc827c0987d7173714fc5a95a7c8fc9bf15c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
1) Three patches to enhance conntrack selftests for resize and clash
resolution, from Florian Westphal.
2) Expand nft_concat_range.sh selftest to improve coverage from error
path, from Florian Westphal.
3) Hide clash bit to userspace from netlink dumps until there is a
good reason to expose, from Florian Westphal.
4) Revert notification for device registration/unregistration for
nftables basechains and flowtables, we decided to go for a better
way to handle this through the nfnetlink_hook infrastructure which
will come via nf-next, patch from Phil Sutter.
5) Fix crash in conntrack due to race related to SLAB_TYPESAFE_BY_RCU
that results in removing a recycled object that is not yet in the
hashes. Move IPS_CONFIRM setting after the object is in the hashes.
From Florian Westphal.
netfilter pull request 25-07-17
* tag 'nf-25-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
Revert "netfilter: nf_tables: Add notifications for hook changes"
netfilter: nf_tables: hide clash bit from userspace
selftests: netfilter: nft_concat_range.sh: send packets to empty set
selftests: netfilter: conntrack_resize.sh: also use udpclash tool
selftests: netfilter: add conntrack clash resolution test case
selftests: netfilter: conntrack_resize.sh: extend resize test
====================
Link: https://patch.msgid.link/20250717095808.41725-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
After we do the modification on the host side, ensure we write the
result back to VRAM and not the other way around, otherwise the
modification will be lost if treated like a read.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com
(cherry picked from commit c12fe703cab93f9d8bfe0ff32b58e7b1fd52be1f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Skipping TLB invalidations on VF causing unrecoverable
faults. Probable reason for skipping TLB invalidations
on SRIOV could be lack of support for instruction
MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some
additional handling.
Helps in resolving,
[ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]]
ASID: 0
VFID: 0
PDATA: 0x0d92
Faulted Address: 0x0000000002fa0000
FaultType: 0
AccessType: 1
FaultLevel: 0
EngineClass: 3 bcs
EngineInstance: 8
[ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
V2:
- Use Xmas tree (MichalW)
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 97515d0b3ed92 ("drm/xe/vf: Don't emit access to Global HWSP if VF")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit b528e896fa570844d654b5a4617a97fa770a1030)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Pull NVMe fixes from Christoph:
"- revert the cross-controller atomic write size validation that caused
regressions (Christoph Hellwig)
- fix endianness of command word prints in nvme_log_err_passthru()
(John Garry)
- fix callback lock for TLS handshake (Maurizio Lombardi)
- fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
- fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
(Zheng Qixing)"
* tag 'nvme-6.16-2025-07-17' of git://git.infradead.org/nvme:
nvmet-tcp: fix callback lock for TLS handshake
nvme: fix misaccounting of nvme-mpath inflight I/O
nvme: revert the cross-controller atomic write size validation
nvme: fix endianness of command word prints in nvme_log_err_passthru()
nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
|
|
This reverts commit e8afa1557f4f963c9a511bd2c6074a941c308685.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-8-tzimmermann@suse.de
|
|
This reverts commit 1a148af06000e545e714fe3210af3d77ff903c11.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-7-tzimmermann@suse.de
|
|
This reverts commit cce16fcd7446dcff7480cd9d2b6417075ed81065.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-6-tzimmermann@suse.de
|
|
This reverts commit f83a9b8c7fd0557b0c50784bfdc1bbe9140c9bf8.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-5-tzimmermann@suse.de
|
|
This reverts commit e91eb3ae415472b28211d7fed07fa283845b311e.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-4-tzimmermann@suse.de
|
|
This reverts commit aec8a40228acb385d60feec59b54573d307e60f3.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
|
|
This reverts commit 415cb45895f43015515473fbc40563ca5eec9a7c.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
|
|
When application A submits jobs and application B submits a job with a
dependency on A's fence, the normal flow wakes up the scheduler after
processing each job. However, the optimization in
drm_sched_entity_add_dependency_cb() uses a callback that only clears
dependencies without waking up the scheduler.
When application A is killed before its jobs can run, the callback gets
triggered but only clears the dependency without waking up the scheduler,
causing the scheduler to enter sleep state and application B to hang.
Remove the optimization by deleting drm_sched_entity_clear_dep() and its
usage, ensuring the scheduler is always woken up when dependencies are
cleared.
Fixes: 777dbd458c89 ("drm/amdgpu: drop a dummy wakeup scheduler")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250717084453.921097-1-lincao12@amd.com
|
|
A crash in conntrack was reported while trying to unlink the conntrack
entry from the hash bucket list:
[exception RIP: __nf_ct_delete_from_lists+172]
[..]
#7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
#8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
#9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
[..]
The nf_conn struct is marked as allocated from slab but appears to be in
a partially initialised state:
ct hlist pointer is garbage; looks like the ct hash value
(hence crash).
ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
ct->timeout is 30000 (=30s), which is unexpected.
Everything else looks like normal udp conntrack entry. If we ignore
ct->status and pretend its 0, the entry matches those that are newly
allocated but not yet inserted into the hash:
- ct hlist pointers are overloaded and store/cache the raw tuple hash
- ct->timeout matches the relative time expected for a new udp flow
rather than the absolute 'jiffies' value.
If it were not for the presence of IPS_CONFIRMED,
__nf_conntrack_find_get() would have skipped the entry.
Theory is that we did hit following race:
cpu x cpu y cpu z
found entry E found entry E
E is expired <preemption>
nf_ct_delete()
return E to rcu slab
init_conntrack
E is re-inited,
ct->status set to 0
reply tuplehash hnnode.pprev
stores hash value.
cpu y found E right before it was deleted on cpu x.
E is now re-inited on cpu z. cpu y was preempted before
checking for expiry and/or confirm bit.
->refcnt set to 1
E now owned by skb
->timeout set to 30000
If cpu y were to resume now, it would observe E as
expired but would skip E due to missing CONFIRMED bit.
nf_conntrack_confirm gets called
sets: ct->status |= CONFIRMED
This is wrong: E is not yet added
to hashtable.
cpu y resumes, it observes E as expired but CONFIRMED:
<resumes>
nf_ct_expired()
-> yes (ct->timeout is 30s)
confirmed bit set.
cpu y will try to delete E from the hashtable:
nf_ct_delete() -> set DYING bit
__nf_ct_delete_from_lists
Even this scenario doesn't guarantee a crash:
cpu z still holds the table bucket lock(s) so y blocks:
wait for spinlock held by z
CONFIRMED is set but there is no
guarantee ct will be added to hash:
"chaintoolong" or "clash resolution"
logic both skip the insert step.
reply hnnode.pprev still stores the
hash value.
unlocks spinlock
return NF_DROP
<unblocks, then
crashes on hlist_nulls_del_rcu pprev>
In case CPU z does insert the entry into the hashtable, cpu y will unlink
E again right away but no crash occurs.
Without 'cpu y' race, 'garbage' hlist is of no consequence:
ct refcnt remains at 1, eventually skb will be free'd and E gets
destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.
To resolve this, move the IPS_CONFIRMED assignment after the table
insertion but before the unlock.
Pablo points out that the confirm-bit-store could be reordered to happen
before hlist add resp. the timeout fixup, so switch to set_bit and
before_atomic memory barrier to prevent this.
It doesn't matter if other CPUs can observe a newly inserted entry right
before the CONFIRMED bit was set:
Such event cannot be distinguished from above "E is the old incarnation"
case: the entry will be skipped.
Also change nf_ct_should_gc() to first check the confirmed bit.
The gc sequence is:
1. Check if entry has expired, if not skip to next entry
2. Obtain a reference to the expired entry.
3. Call nf_ct_should_gc() to double-check step 1.
nf_ct_should_gc() is thus called only for entries that already failed an
expiry check. After this patch, once the confirmed bit check passes
ct->timeout has been altered to reflect the absolute 'best before' date
instead of a relative time. Step 3 will therefore not remove the entry.
Without this change to nf_ct_should_gc() we could still get this sequence:
1. Check if entry has expired.
2. Obtain a reference.
3. Call nf_ct_should_gc() to double-check step 1:
4 - entry is still observed as expired
5 - meanwhile, ct->timeout is corrected to absolute value on other CPU
and confirm bit gets set
6 - confirm bit is seen
7 - valid entry is removed again
First do check 6), then 4) so the gc expiry check always picks up either
confirmed bit unset (entry gets skipped) or expiry re-check failure for
re-inited conntrack objects.
This change cannot be backported to releases before 5.19. Without
commit 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list")
|= IPS_CONFIRMED line cannot be moved without further changes.
Cc: Razvan Cojocaru <rzvncj@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/
Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/
Fixes: 1397af5bfd7d ("netfilter: conntrack: remove the percpu dying list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Since "net: gro: use cb instead of skb->network_header", the skb network
header is no longer set in the GRO path.
This breaks fraglist segmentation, which relies on ip_hdr()/tcp_hdr()
to check for address/port changes.
Fix this regression by selectively setting the network header for merged
segment skbs.
Fixes: 186b1ea73ad8 ("net: gro: use cb instead of skb->network_header")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250705150622.10699-1-nbd@nbd.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
devm_gpiod_put_array() is meant to undo the effects of
devm_gpiod_get_array() - in particular, it should release the GPIOs
contained in the array acquired with the latter. It is meant to be the
resource-managed version of gpiod_put_array(), and it should behave
similar to the non-array version devm_gpiod_put().
Since commit d1d52c6622a6 ("gpiolib: devres: Finish the conversion to
use devm_add_action()") it doesn't do that anymore, it just removes the
devres action and frees associated memory, but it doesn't actually
release the GPIOs.
Fix by switching from devm_remove_action() to devm_release_action(),
which will in addition invoke the action to release the GPIOs.
Fixes: d1d52c6622a6 ("gpiolib: devres: Finish the conversion to use devm_add_action()")
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Reported-by: Wattson CI <wattson-external@google.com>
Reported-by: Samuel Wu <wusamuel@google.com>
Link: https://lore.kernel.org/r/20250715-gpiolib-devres-put-array-fix-v1-1-970d82a8c887@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
pmc->idev is still used in ip6_mc_clear_src(), so as mld_clear_delrec()
does, the reference should be put after ip6_mc_clear_src() return.
Fixes: 63ed8de4be81 ("mld: add mc_lock for protecting per-interface mld data")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://patch.msgid.link/20250714141957.3301871-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ilya Leoshkevich says:
====================
s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL again
This series fixes a regression causing perf on s390 to trigger a kernel
panic.
Patch 1 fixes the issue, patch 2 adds a test to make sure this doesn't
happen again.
====================
Link: https://patch.msgid.link/20250716194524.48109-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a test that invokes a BPF prog in a loop, while concurrently
attaching and detaching another BPF prog to and from it. This helps
identifying race conditions in bpf_arch_text_poke().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250716194524.48109-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has
accidentally removed the critical piece of commit c730fce7c70c
("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing
intermittent kernel panics in e.g. perf's on_switch() prog to reappear.
Restore the fix and add a comment.
Fixes: 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic")
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250716194524.48109-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Avoid invoking update_locked_rq() when the runqueue (rq) pointer is NULL
in the SCX_CALL_OP and SCX_CALL_OP_RET macros.
Previously, calling update_locked_rq(NULL) with preemption enabled could
trigger the following warning:
BUG: using __this_cpu_write() in preemptible [00000000]
This happens because __this_cpu_write() is unsafe to use in preemptible
context.
rq is NULL when an ops invoked from an unlocked context. In such cases, we
don't need to store any rq, since the value should already be NULL
(unlocked). Ensure that update_locked_rq() is only called when rq is
non-NULL, preventing calling __this_cpu_write() on preemptible context.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Fixes: 18853ba782bef ("sched_ext: Track currently locked rq")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v6.15
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice)
Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.
For ice:
Dave adds a NULL check for LAG netdev.
Michal corrects a pointer check in debugfs initialization.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: check correct pointer in fwlog debugfs
ice: add NULL check in eswitch lag check
ethernet: intel: fix building with large NR_CPUS
====================
Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.
This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.
Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.
For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.
This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but. rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.
So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.
v2:
- Use "(unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
(Tariq Toukan <tariqt@nvidia.com>)
- Improve commit-message (Gal Pressman <gal@nvidia.com>)
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
maybe_casefold() shouldn't have been nooped, just bch2_casefold().
Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
94426e4201fb, which added the killswitch for casefolding, accidentally
removed some of the ifdefs we need to avoid build errors.
It appears we need better build testing for different configurations, it
took two weeks for the robots to catch this one.
Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Use bch2_dev_bucket_tryget() instead of bch2_dev_tryget() before
checking the bucket bitmap.
Reported-by: syzbot+3168625f36f4a539237e@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_btree_node_drop_keys_outside_node() will (re)build aux search
trees, because it's also called by topology repair.
bch2_btree_node_read_done() was calling it before validating individual
keys; invalid ones have to be dropped.
If we call drop_keys_outside_node() first, then
bch2_bset_build_aux_tree() doesn't run because the node already has an
aux search tree - which was invalidated by the repair.
Reported-by: syzbot+c5e7a66b3b23ae65d44f@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The allocator path has a "if we're really low on free buckets, check if
we should issue discards" - tweak this to also trigger discards if more
than 1/128th of the device is in need_discard state.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It becomes possible to do discards after a journal flush, which
naturally the journal code is reponsible for.
A prior refactoring seems to have broken this - which went unnoticed
because the foreground allocator path can also trigger discards.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Commit 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs") made the
ring pointer always to be reset on resume from suspend. This caused compute
rings to fail since the reset was done without also resetting it for the
firmware. Reset wptr on the GPU to avoid a disconnect between the driver
and firmware wptr.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3911
Fixes: 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs")
Signed-off-by: Eeli Haapalainen <eeli.haapalainen@protonmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2becafc319db3d96205320f31cc0de4ee5a93747)
Cc: stable@vger.kernel.org
|
|
Increment the reset counter only if soft recovery succeeded. This is
consistent with a ring hard reset behaviour where counter gets
incremented only if hard reset succeeded.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 25c314aa3ec3d30e4ee282540e2096b5c66a2437)
Cc: stable@vger.kernel.org
|
|
The function radeon_resume_kms() acquires the console lock. It is
inconsistent, as it depends on the notify_client argument. That
lock then covers a number of suspend operations that are unrelated
to the console.
Remove the calls to console_lock() and console_unlock() from the
radeon function. The console lock is only required by DRM's fbdev
emulation, which acquires it as necessary.
Also fixes a possible circular dependency between the console lock
and the client-list mutex, where the mutex is supposed to be taken
first.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fff8e0504499a929f26e2fb7cf7e2c9854e37b91)
|
|
The radeon driver holds the console lock while suspending in-kernel
DRM clients. This creates a circular dependency with the client-list
mutex, which is supposed to be acquired first. Reported when combining
radeon with another DRM driver.
Therefore, do not take the console lock in radeon, but let the fbdev
DRM client acquire the lock when needed. This is what all other DRM
drivers so.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Closes: https://lore.kernel.org/dri-devel/0a087cfd-bd4c-48f1-aa2f-4a3b12593935@oss.qualcomm.com/
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 612ec7c69d04cb58beb1332c2806da9f2f47a3ae)
|
|
In DCN401 pre-blending degamma LUT isn't affecting cursor as in previous
DCN version. As this is not the behavior close to what is expected for
CRTC degamma LUT, disable CRTC degamma LUT property in this HW.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4176
---
When enabling HDR on KDE, it takes the first CRTC 1D LUT available and
apply a color transformation (Gamma 2.2 -> PQ). AMD driver usually
advertises a CRTC degamma LUT as the first CRTC 1D LUT, but it's
actually applied pre-blending. In previous HW version, it seems to work
fine because the 1D LUT was applied to cursor too, but DCN401 presents a
different behavior and the 1D LUT isn't affecting the hardware cursor.
To address the wrong gamma on cursor with HDR (see the link), I came up
with this patch that disables CRTC degamma LUT in this hw, since it
presents a different behavior than others. With this KDE sees CRTC
regamma LUT as the first post-blending 1D LUT available. This is
actually more consistent with AMD color pipeline. It was tested by the
reporter, since I don't have the HW available for local testing and
debugging.
Melissa
---
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 340231cdceec2c45995d773a358ca3c341f151aa)
Cc: stable@vger.kernel.org
|
|
[WHY]
Free memory to avoid memory leak
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Clayton King <clayton.king@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fa699acb8e9be2341ee318077fa119acc7d5f329)
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- fprobe-event: The @params variable was being used in an error path
without being initialized. The fix to return an error code.
* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Avoid using params uninitialized in parse_btf_arg()
|
|
without board ID
For GF variant of WCN6855 without board ID programmed
btusb_generate_qca_nvm_name() will chose wrong NVM
'qca/nvm_usb_00130201.bin' to download.
Fix by choosing right NVM 'qca/nvm_usb_00130201_gf.bin'.
Also simplify NVM choice logic of btusb_generate_qca_nvm_name().
Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller")
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The 'quirks' member already ran out of bits on some platforms some time
ago. Replace the integer member by a bitmap in order to have enough bits
in future. Replace raw bit operations by accessor macros.
Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING")
Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE")
Suggested-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Macro parameters should always be put into braces when accessing it.
Fixes: 4fc9857ab8c6 ("Bluetooth: hci_sync: Add check simultaneous roles support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|