Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq issues, one in the core and one in the
intel_pstate driver:
- Fix CPU device node reference counting in the cpufreq core (Miquel
Sabaté Solà)
- Turn the spinlock used by the intel_pstate driver in hard IRQ
context into a raw one to prevent the driver from crashing when
PREEMPT_RT is enabled (Uwe Kleine-König)"
* tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Avoid a bad reference count on CPU node
cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
|
|
Andy Roulin says:
====================
netfilter: br_netfilter: fix panic with metadata_dst skb
There's a kernel panic possible in the br_netfilter module when sending
untagged traffic via a VxLAN device. Traceback is included below.
This happens during the check for fragmentation in br_nf_dev_queue_xmit
if the MTU on the VxLAN device is not big enough.
It is dependent on:
1) the br_netfilter module being loaded;
2) net.bridge.bridge-nf-call-iptables set to 1;
3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port;
4) untagged frames with size higher than the VxLAN MTU forwarded/flooded
This case was never supported in the first place, so the first patch drops
such packets.
A regression selftest is added as part of the second patch.
PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data.
[ 176.291791] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000110
[ 176.292101] Mem abort info:
[ 176.292184] ESR = 0x0000000096000004
[ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits
[ 176.292530] SET = 0, FnV = 0
[ 176.292709] EA = 0, S1PTW = 0
[ 176.292862] FSC = 0x04: level 0 translation fault
[ 176.293013] Data abort info:
[ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000
[ 176.294166] [0000000000000110] pgd=0000000000000000,
p4d=0000000000000000
[ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth
br_netfilter bridge stp llc ipv6 crct10dif_ce
[ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted
6.8.0-rc3-g5b3fbd61b9d1 #2
[ 176.296314] Hardware name: linux,dummy-virt (DT)
[ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter]
[ 176.297636] sp : ffff800080003630
[ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27:
ffff6828c49ad9f8
[ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24:
00000000000003e8
[ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21:
ffff6828c3b16d28
[ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18:
0000000000000014
[ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15:
0000000095744632
[ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12:
ffffb7e137926a70
[ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 :
0000000000000000
[ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 :
f20e0100bebafeca
[ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 :
0000000000000000
[ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 :
ffff6828c7f918f0
[ 176.300889] Call trace:
[ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter]
[ 176.301703] nf_hook_slow+0x48/0x124
[ 176.302060] br_forward_finish+0xc8/0xe8 [bridge]
[ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter]
[ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter]
[ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter]
[ 176.303359] nf_hook_slow+0x48/0x124
[ 176.303803] __br_forward+0xc4/0x194 [bridge]
[ 176.304013] br_flood+0xd4/0x168 [bridge]
[ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge]
[ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter]
[ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter]
[ 176.305428] br_handle_frame+0x244/0x3cc [bridge]
[ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc
[ 176.306080] __netif_receive_skb_one_core+0x40/0x8c
[ 176.306197] __netif_receive_skb+0x18/0x64
[ 176.306369] process_backlog+0x80/0x124
[ 176.306540] __napi_poll+0x38/0x17c
[ 176.306636] net_rx_action+0x124/0x26c
[ 176.306758] __do_softirq+0x100/0x26c
[ 176.307051] ____do_softirq+0x10/0x1c
[ 176.307162] call_on_irq_stack+0x24/0x4c
[ 176.307289] do_softirq_own_stack+0x1c/0x2c
[ 176.307396] do_softirq+0x54/0x6c
[ 176.307485] __local_bh_enable_ip+0x8c/0x98
[ 176.307637] __dev_queue_xmit+0x22c/0xd28
[ 176.307775] neigh_resolve_output+0xf4/0x1a0
[ 176.308018] ip_finish_output2+0x1c8/0x628
[ 176.308137] ip_do_fragment+0x5b4/0x658
[ 176.308279] ip_fragment.constprop.0+0x48/0xec
[ 176.308420] __ip_finish_output+0xa4/0x254
[ 176.308593] ip_finish_output+0x34/0x130
[ 176.308814] ip_output+0x6c/0x108
[ 176.308929] ip_send_skb+0x50/0xf0
[ 176.309095] ip_push_pending_frames+0x30/0x54
[ 176.309254] raw_sendmsg+0x758/0xaec
[ 176.309568] inet_sendmsg+0x44/0x70
[ 176.309667] __sys_sendto+0x110/0x178
[ 176.309758] __arm64_sys_sendto+0x28/0x38
[ 176.309918] invoke_syscall+0x48/0x110
[ 176.310211] el0_svc_common.constprop.0+0x40/0xe0
[ 176.310353] do_el0_svc+0x1c/0x28
[ 176.310434] el0_svc+0x34/0xb4
[ 176.310551] el0t_64_sync_handler+0x120/0x12c
[ 176.310690] el0t_64_sync+0x190/0x194
[ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860)
[ 176.315743] ---[ end trace 0000000000000000 ]---
[ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in
interrupt
[ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000
[ 176.316564] PHYS_OFFSET: 0xffff97d780000000
[ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b
[ 176.317210] Memory Limit: none
[ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal
Exception in interrupt ]---\
====================
Link: https://patch.msgid.link/20241001154400.22787-1-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new netfilter selftests to test against br_netfilter panics when
VxLAN single-device is used together with untagged traffic and high MTU.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241001154400.22787-3-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a kernel panic in the br_netfilter module when sending untagged
traffic via a VxLAN device.
This happens during the check for fragmentation in br_nf_dev_queue_xmit.
It is dependent on:
1) the br_netfilter module being loaded;
2) net.bridge.bridge-nf-call-iptables set to 1;
3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port;
4) untagged frames with size higher than the VxLAN MTU forwarded/flooded
When forwarding the untagged packet to the VxLAN bridge port, before
the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and
changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type
of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL.
Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check
for frames that needs to be fragmented: frames with higher MTU than the
VxLAN device end up calling br_nf_ip_fragment, which in turns call
ip_skb_dst_mtu.
The ip_dst_mtu tries to use the skb_dst(skb) as if it was a valid dst
with valid dst->dev, thus the crash.
This case was never supported in the first place, so drop the packet
instead.
PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data.
[ 176.291791] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000110
[ 176.292101] Mem abort info:
[ 176.292184] ESR = 0x0000000096000004
[ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits
[ 176.292530] SET = 0, FnV = 0
[ 176.292709] EA = 0, S1PTW = 0
[ 176.292862] FSC = 0x04: level 0 translation fault
[ 176.293013] Data abort info:
[ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000
[ 176.294166] [0000000000000110] pgd=0000000000000000,
p4d=0000000000000000
[ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth
br_netfilter bridge stp llc ipv6 crct10dif_ce
[ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted
6.8.0-rc3-g5b3fbd61b9d1 #2
[ 176.296314] Hardware name: linux,dummy-virt (DT)
[ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter]
[ 176.297636] sp : ffff800080003630
[ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27:
ffff6828c49ad9f8
[ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24:
00000000000003e8
[ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21:
ffff6828c3b16d28
[ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18:
0000000000000014
[ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15:
0000000095744632
[ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12:
ffffb7e137926a70
[ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 :
0000000000000000
[ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 :
f20e0100bebafeca
[ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 :
0000000000000000
[ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 :
ffff6828c7f918f0
[ 176.300889] Call trace:
[ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
[ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter]
[ 176.301703] nf_hook_slow+0x48/0x124
[ 176.302060] br_forward_finish+0xc8/0xe8 [bridge]
[ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter]
[ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter]
[ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter]
[ 176.303359] nf_hook_slow+0x48/0x124
[ 176.303803] __br_forward+0xc4/0x194 [bridge]
[ 176.304013] br_flood+0xd4/0x168 [bridge]
[ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge]
[ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter]
[ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter]
[ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter]
[ 176.305428] br_handle_frame+0x244/0x3cc [bridge]
[ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc
[ 176.306080] __netif_receive_skb_one_core+0x40/0x8c
[ 176.306197] __netif_receive_skb+0x18/0x64
[ 176.306369] process_backlog+0x80/0x124
[ 176.306540] __napi_poll+0x38/0x17c
[ 176.306636] net_rx_action+0x124/0x26c
[ 176.306758] __do_softirq+0x100/0x26c
[ 176.307051] ____do_softirq+0x10/0x1c
[ 176.307162] call_on_irq_stack+0x24/0x4c
[ 176.307289] do_softirq_own_stack+0x1c/0x2c
[ 176.307396] do_softirq+0x54/0x6c
[ 176.307485] __local_bh_enable_ip+0x8c/0x98
[ 176.307637] __dev_queue_xmit+0x22c/0xd28
[ 176.307775] neigh_resolve_output+0xf4/0x1a0
[ 176.308018] ip_finish_output2+0x1c8/0x628
[ 176.308137] ip_do_fragment+0x5b4/0x658
[ 176.308279] ip_fragment.constprop.0+0x48/0xec
[ 176.308420] __ip_finish_output+0xa4/0x254
[ 176.308593] ip_finish_output+0x34/0x130
[ 176.308814] ip_output+0x6c/0x108
[ 176.308929] ip_send_skb+0x50/0xf0
[ 176.309095] ip_push_pending_frames+0x30/0x54
[ 176.309254] raw_sendmsg+0x758/0xaec
[ 176.309568] inet_sendmsg+0x44/0x70
[ 176.309667] __sys_sendto+0x110/0x178
[ 176.309758] __arm64_sys_sendto+0x28/0x38
[ 176.309918] invoke_syscall+0x48/0x110
[ 176.310211] el0_svc_common.constprop.0+0x40/0xe0
[ 176.310353] do_el0_svc+0x1c/0x28
[ 176.310434] el0_svc+0x34/0xb4
[ 176.310551] el0t_64_sync_handler+0x120/0x12c
[ 176.310690] el0t_64_sync+0x190/0x194
[ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860)
[ 176.315743] ---[ end trace 0000000000000000 ]---
[ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in
interrupt
[ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000
[ 176.316564] PHYS_OFFSET: 0xffff97d780000000
[ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b
[ 176.317210] Memory Limit: none
[ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal
Exception in interrupt ]---\
Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241001154400.22787-2-aroulin@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Otherwise nfsd_file_acquire, nfsd_file_insert_err, and
nfsd_file_cons_err will hit a NULL pointer when they are enabled and
LOCALIO used.
Example trace output (note xid is 0x0 and LOCALIO flag set):
nfsd_file_acquire: xid=0x0 inode=0000000069a1b2e7
may_flags=WRITE|LOCALIO ref=1 nf_flags=HASHED|GC nf_may=WRITE
nf_file=0000000070123234 status=0
Fixes: c63f0e48febf ("nfsd: add nfsd_file_acquire_local()")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix a potential NULL-pointer dereference in gpiolib core
- fix a probe() regression from the v6.12 merge window and an older bug
leading to missed interrupts in gpio-davinci
* tag 'gpio-fixes-for-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()
gpio: davinci: Fix condition for irqchip registration
gpio: davinci: fix lazy disable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Slightly high amount of changes in this round, partly because of my
vacation in the last weeks. But all changes are small and nothing
looks worrisome.
The biggest LOCs is MAINTAINERS updates, and there is a core change
for card-ID string creation for non-ASCII inputs. Others are rather
device-specific, such as new quirks and device IDs for ASoC, usual
HD-audio and USB-audio quirks and fixes, as well as regression fixes
in HD-audio HDMI audio and Conexant codec"
* tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (39 commits)
ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
ALSA: line6: add hw monitor volume control to POD HD500X
ALSA: gus: Fix some error handling paths related to get_bpos() usage
ALSA: hda: Add missing parameter description for snd_hdac_stream_timecounter_init()
ALSA: usb-audio: Add native DSD support for Luxman D-08u
ALSA: core: add isascii() check to card ID generator
MAINTAINERS: ALSA: use linux-sound@vger.kernel.org list
Revert "ALSA: hda: Conditionally use snooping for AMD HDMI"
ASoC: intel: sof_sdw: Add check devm_kasprintf() returned value
ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
ASoC: dt-bindings: davinci-mcasp: Fix interrupts property
ASoC: qcom: sm8250: add qrb4210-rb2-sndcard compatible string
ASoC: dt-bindings: qcom,sm8250: add qrb4210-rb2-sndcard
ALSA: hda: fix trigger_tstamp_latched
ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
ALSA: hda/generic: Drop obsoleted obey_preferred_dacs flag
ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
ALSA: silence integer wrapping warning
ASoC: Intel: soc-acpi: arl: Fix some missing empty terminators
ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item
...
|
|
Pull drm fixes from Dave Airlie:
"Weekly fixes, xe and amdgpu lead the way, with panthor, and few core
components getting various fixes. Nothing seems too out of the
ordinary.
atomic:
- Use correct type when reading damage rectangles
display:
- Fix kernel docs
dp-mst:
- Fix DSC decompression detection
hdmi:
- Fix infoframe size
sched:
- Update maintainers
- Fix race condition whne queueing up jobs
- Fix locking in drm_sched_entity_modify_sched()
- Fix pointer deref if entity queue changes
sysfb:
- Disable sysfb if framebuffer parent device is unknown
amdgpu:
- DML2 fix
- DSC fix
- Dispclk fix
- eDP HDR fix
- IPS fix
- TBT fix
i915:
- One fix for bitwise and logical "and" mixup in PM code
xe:
- Restore pci state on resume
- Fix locking on submission, queue and vm
- Fix UAF on queue destruction
- Fix resource release on freq init error path
- Use rw_semaphore to reduce contention on ASID->VM lookup
- Fix steering for media on Xe2_HPM
- Tuning updates to Xe2
- Resume TDR after GT reset to prevent jobs running forever
- Move id allocation to avoid userspace using a guessed number to
trigger UAF
- Fix OA stream close preventing pbatch buffers to complete
- Fix NPD when migrating memory on LNL
- Fix memory leak when aborting binds
panthor:
- Fix locking
- Set FOP_UNSIGNED_OFFSET in fops instance
- Acquire lock in panthor_vm_prepare_map_op_ctx()
- Avoid uninitialized variable in tick_ctx_cleanup()
- Do not block scheduler queue if work is pending
- Do not add write fences to the shared BOs
vbox:
- Fix VLA handling"
* tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits)
drm/xe: Fix memory leak when aborting binds
drm/xe: Prevent null pointer access in xe_migrate_copy
drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
drm/xe/queue: move xa_alloc to prevent UAF
drm/xe/vm: move xa_alloc to prevent UAF
drm/xe: Clean up VM / exec queue file lock usage.
drm/xe: Resume TDR after GT reset
drm/xe/xe2: Add performance tuning for L3 cache flushing
drm/xe/xe2: Extend performance tuning to media GT
drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM
drm/xe: Use helper for ASID -> VM in GPU faults and access counters
drm/xe: Convert to USM lock to rwsem
drm/xe: use devm_add_action_or_reset() helper
drm/xe: fix UAF around queue destruction
drm/xe/guc_submit: add missing locking in wedged_fini
drm/xe: Restore pci state upon resume
drm/amd/display: Fix system hang while resume with TBT monitor
drm/amd/display: Enable idle workqueue for more IPS modes
drm/amd/display: Add HDR workaround for specific eDP
drm/amd/display: avoid set dispclk to 0
...
|
|
The blamed commit introduced an unexpected regression in the sja1105
driver. Packets from VLAN-unaware bridge ports get received correctly,
but the protocol stack can't seem to decode them properly.
For ds->untag_bridge_pvid users (thus also sja1105), the blamed commit
did introduce a functional change: dsa_switch_rcv() used to call
dsa_untag_bridge_pvid(), which looked like this:
err = br_vlan_get_proto(br, &proto);
if (err)
return skb;
/* Move VLAN tag from data to hwaccel */
if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
skb = skb_vlan_untag(skb);
if (!skb)
return NULL;
}
and now it calls dsa_software_vlan_untag() which has just this:
/* Move VLAN tag from data to hwaccel */
if (!skb_vlan_tag_present(skb)) {
skb = skb_vlan_untag(skb);
if (!skb)
return NULL;
}
thus lacks any skb->protocol == bridge VLAN protocol check. That check
is deferred until a later check for skb->vlan_proto (in the hwaccel area).
The new code is problematic because, for VLAN-untagged packets,
skb_vlan_untag() blindly takes the 4 bytes starting with the EtherType
and turns them into a hwaccel VLAN tag. This is what breaks the protocol
stack.
It would be tempting to "make it work as before" and only call
skb_vlan_untag() for those packets with the skb->protocol actually
representing a VLAN.
But the premise of the newly introduced dsa_software_vlan_untag() core
function is not wrong. Drivers set ds->untag_bridge_pvid or
ds->untag_vlan_aware_bridge_pvid presumably because they send all
traffic to the CPU reception path as VLAN-tagged. So why should we spend
any additional CPU cycles assuming that the packet may be VLAN-untagged?
And why does the sja1105 driver opt into ds->untag_bridge_pvid if it
doesn't always deliver packets to the CPU as VLAN-tagged?
The answer to the latter question is indeed more interesting: it doesn't
need to. This got done in commit 884be12f8566 ("net: dsa: sja1105: add
support for imprecise RX"), because I thought it would be needed, but I
didn't realize that it doesn't actually make a difference.
As explained in the commit message of the blamed patch, ds->untag_bridge_pvid
only makes a difference in the VLAN-untagged receive path of a bridge port.
However, in that operating mode, tag_sja1105.c makes use of VLAN tags
with the ETH_P_SJA1105 TPID, and it decodes and consumes these VLAN tags
as if they were DSA tags (aka tag_8021q operation). Even if commit
884be12f8566 ("net: dsa: sja1105: add support for imprecise RX") added
this logic in sja1105_bridge_vlan_add():
/* Always install bridge VLANs as egress-tagged on the CPU port. */
if (dsa_is_cpu_port(ds, port))
flags = 0;
that was for _bridge_ VLANs, which are _not_ committed to hardware
in VLAN-unaware mode (aka the mode where ds->untag_bridge_pvid does
anything at all). Even prior to that change, the tag_8021q VLANs
were always installed as egress-tagged on the CPU port, see
dsa_switch_tag_8021q_vlan_add():
u16 flags = 0; // egress-tagged, non-PVID
if (dsa_port_is_user(dp))
flags |= BRIDGE_VLAN_INFO_UNTAGGED |
BRIDGE_VLAN_INFO_PVID;
err = dsa_port_do_tag_8021q_vlan_add(dp, info->vid,
flags);
if (err)
return err;
Whether the sja1105 driver needs the new flag, ds->untag_vlan_aware_bridge_pvid,
rather than ds->untag_bridge_pvid, is a separate discussion. To fix the
current bug in VLAN-unaware bridge mode, I would argue that the sja1105
driver should not request something it doesn't need, rather than
complicating the core DSA helper. Whereas before the blamed commit, this
setting was harmless, now it has caused breakage.
Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241001140206.50933-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull block fixes from Jens Axboe:
- Fix another use-after-free in aoe
- Fixup wrong nested non-saving irq disable/restore in blk-iocost
- Fixup a kerneldoc complaint introduced by a merge window patch
* tag 'block-6.12-20241004' of git://git.kernel.dk/linux:
aoe: fix the potential use-after-free problem in more places
blk_iocost: remove some duplicate irq disable/enables
block: fix blk_rq_map_integrity_sg kernel-doc
|
|
Pull io_uring fixes from Jens Axboe:
- Fix an error path memory leak, if one part fails to allocate.
Obviously not something that'll generally hit without error
injection.
- Fix an io_req_flags_t cast to make sparse happier.
- Improve the recv multishot termination. Not a bug now, but could be
one in the future. This makes it do the same thing that recvmsg does
in terms of when to terminate a request or not.
* tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux:
io_uring/net: harden multishot termination case for recv
io_uring: fix casts to io_req_flags_t
io_uring: fix memory leak when cache init fail
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
"Fixes for an inotify deadlock and a data race in fsnotify"
* tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
inotify: Fix possible deadlock in fsnotify_destroy_mark
fsnotify: Avoid data race between fsnotify_recalc_mask() and fsnotify_object_watched()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes from Jan Kara:
"A couple of UDF error handling fixes for issues spotted by syzbot"
* tag 'fs_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: fix uninit-value use in udf_get_fileshortad
udf: refactor inode_bmap() to handle error
udf: refactor udf_next_aext() to handle error
udf: refactor udf_current_aext() to handle error
|
|
Pull ceph fixes from Ilya Dryomov:
"A fix from Patrick for a variety of CephFS lockup scenarios caused by
a regression in cap handling which sneaked in through the netfs helper
library in 5.18 (marked for stable) and an unrelated one-line cleanup"
* tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client:
ceph: fix cap ref leak via netfs init_request
ceph: use struct_size() helper in __ceph_pool_perm_get()
|
|
Merge an ACPI backlight (video) quirk and ACPI battery driver fix and
cleanup for 6.12-rc2:
- Add a quirk for Dell OptiPlex 5480 AIO to the ACPI backlight (video)
driver (Hans de Goede).
- Prevent the ACPI battery driver from crashing when unregistering a
battery hook and simplify battery hook locking in it (Armin Wolf).
* acpi-video:
ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
* acpi-battery:
ACPI: battery: Fix possible crash when unregistering a battery hook
ACPI: battery: Simplify battery hook locking
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- in incremental send, fix invalid clone operation for file that got
its size decreased
- fix __counted_by() annotation of send path cache entries, we do not
store the terminating NUL
- fix a longstanding bug in relocation (and quite hard to hit by
chance), drop back reference cache that can get out of sync after
transaction commit
- wait for fixup worker kthread before finishing umount
- add missing raid-stripe-tree extent for NOCOW files, zoned mode
cannot have NOCOW files but RST is meant to be a standalone feature
- handle transaction start error during relocation, avoid potential
NULL pointer dereference of relocation control structure (reported by
syzbot)
- disable module-wide rate limiting of debug level messages
- minor fix to tracepoint definition (reported by checkpatch.pl)
* tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: disable rate limiting when debug enabled
btrfs: wait for fixup workers before stopping cleaner kthread during umount
btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
btrfs: send: fix invalid clone operation for file that got its size decreased
btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent event class
btrfs: drop the backref cache during relocation if we commit
btrfs: also add stripe entries for NOCOW writes
btrfs: send: fix buffer overflow detection when copying path to cache entry
|
|
The object pointed to by tz->tzp may still be accessed after being
freed in thermal_zone_device_unregister(), so move the freeing of it
to the point after the removal completion has been completed at which
it cannot be accessed any more.
Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/4623516.LvFx2qVVIh@rjwysocki.net
|
|
There are places in the thermal netlink code where nothing prevents
the thermal zone object from going away while being accessed after it
has been returned by thermal_zone_get_by_id().
To address this, make thermal_zone_get_by_id() get a reference on the
thermal zone device object to be returned with the help of get_device(),
under thermal_list_lock, and adjust all of its callers to this change
with the help of the cleanup.h infrastructure.
Fixes: 1ce50e7d408e ("thermal: core: genetlink support for events/cmd/sampling")
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/6112242.lOV4Wx5bFT@rjwysocki.net
|
|
Pull smb client fixes from Steve French:
- statfs fix (e.g. when limited access to root directory of share)
- special file handling fixes: fix packet validation to avoid buffer
overflow for reparse points, fixes for symlink path parsing (one for
reparse points, and one for SFU use case), and fix for cleanup after
failed SET_REPARSE operation.
- fix for SMB2.1 signing bug introduced by recent patch to NFS symlink
path, and NFS reparse point validation
- comment cleanup
* tag 'v6.12-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Do not convert delimiter when parsing NFS-style symlinks
cifs: Validate content of NFS reparse point buffer
cifs: Fix buffer overflow when parsing NFS reparse points
smb: client: Correct typos in multiple comments across various files
smb: client: use actual path when queryfs
cifs: Remove intermediate object of failed create reparse call
Revert "smb: client: make SHA-512 TFM ephemeral"
smb: Update comments about some reparse point tags
cifs: Check for UTF-16 null codepoint in SFU symlink target location
|
|
Pull close_range() fix from Al Viro:
"Fix the logic in descriptor table trimming"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
close_range(): fix the logics in descriptor table trimming
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
i2c-host fixes for v6.12-rc2
In the stm32f7 a potential deadlock is fixed during runtime
suspend and resume.
|
|
This patch reverts two TOMOYO patches that were merged into Linus' tree
during the v6.12 merge window:
8b985bbfabbe ("tomoyo: allow building as a loadable LSM module")
268225a1de1a ("tomoyo: preparation step for building as a loadable LSM module")
Together these two patches introduced the CONFIG_SECURITY_TOMOYO_LKM
Kconfig build option which enabled a TOMOYO specific dynamic LSM loading
mechanism (see the original commits for more details). Unfortunately,
this approach was widely rejected by the LSM community as well as some
members of the general kernel community. Objections included concerns
over setting a bad precedent regarding individual LSMs managing their
LSM callback registrations as well as general kernel symbol exporting
practices. With little to no support for the CONFIG_SECURITY_TOMOYO_LKM
approach outside of Tetsuo, and multiple objections, we need to revert
these changes.
Link: https://lore.kernel.org/all/0c4b443a-9c72-4800-97e8-a3816b6a9ae2@I-love.SAKURA.ne.jp
Link: https://lore.kernel.org/all/CAHC9VhR=QjdoHG3wJgHFJkKYBg7vkQH2MpffgVzQ0tAByo_wRg@mail.gmail.com
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
I have a ASUS PN51 S mini pc that has two xhci devices. One from AMD,
and other from ASMEDIA. The one from ASMEDIA have problems when resume
from suspend, and keep broken until unplug the power cord. I use this
kernel parameter: xhci-hcd.quirks=128 and then it works ok. I make a
path to reset only the ASMEDIA xhci.
Signed-off-by: Jose Alberto Reguero <jose.alberto.reguero@gmail.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20240919184202.22249-1-jose.alberto.reguero@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
support
Introduce new kernel config symbol for Microchip usb5744 SMBus programming
support. Since usb5744 i2c initialization routine uses i2c SMBus APIs these
APIs should only be invoked when kernel has I2C support. This new kernel
config describes the dependency on I2C kernel support and fix the below
build issues when USB_ONBOARD_DEV=y and CONFIG_I2C=m.
riscv64-linux-ld: drivers/usb/misc/onboard_usb_dev.o:
undefined reference to `i2c_find_device_by_fwnode'
drivers/usb/misc/onboard_usb_dev.c:408:(.text+0xb24): undefined
reference to `i2c_smbus_write_block_data'
<snip>
Parsing of the i2c-bus bus handle is not put under usb5744 kernel config
check as the intention is to report an error when DT is configured for
usb5744 SMBus support and kernel has USB_ONBOARD_DEV_USB5744 disabled.
Fixes: 6782311d04df ("usb: misc: onboard_usb_dev: add Microchip usb5744 SMBus programming support")
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Suggested-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409140539.3Axwv38m-lkp@intel.com/
Acked-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Link: https://lore.kernel.org/r/1727529992-476088-1-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit addresses an issue where events were being processed when
the controller was in a halted state. To fix this issue by stop
processing the events as the event count was considered stale or
invalid when the controller was halted.
Fixes: fc8bb91bc83e ("usb: dwc3: implement runtime PM")
Cc: stable@kernel.org
Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
Suggested-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240916231813.206-1-selvarasu.g@samsung.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When dwc3_resume_common() returns an error, runtime pm is left in
suspended and disabled state in dwc3_resume(). Since the device
is suspended, its parent devices (like the power domain or glue
driver) could also be suspended and may have released resources
that dwc requires. Consequently, calling dwc3_suspend_common() in
this situation could result in attempts to access unclocked or
unpowered registers.
To prevent these problems, runtime PM should always be re-enabled,
even after failed resume attempts. This ensures that
dwc3_suspend_common() is skipped in such cases.
Fixes: 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
Cc: stable@vger.kernel.org
Signed-off-by: Roy Luo <royluo@google.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240913232145.3507723-1-royluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
JieLi tends to use SCSI via USB Mass Storage to implement their own
proprietary commands instead of implementing another USB interface.
Enumerating it as a generic mass storage device will lead to a Hardware
Error sense key get reported.
Ignore this bogus device to prevent appearing a unusable sdX device
file.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Cc: stable <stable@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20241001083407.8336-1-uwu@icenowy.me
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Registering a gadget driver is expected to complete synchronously and
immediately after calling driver_register() this function checks that
the driver has bound so as to return an error.
Set PROBE_FORCE_SYNCHRONOUS to ensure this is the case even when
asynchronous probing is set as the default.
Fixes: fc274c1e99731 ("USB: gadget: Add a new bus for gadgets")
Cc: stable@vger.kernel.org
Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
Link: https://lore.kernel.org/r/20240913102325.2826261-1-jkeeping@inmusicbrands.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering
from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata:
Expand speculative SSBS workaround")
CC: Mark Rutland <mark.rutland@arm.com>
CC: James More <james.morse@arm.com>
CC: Will Deacon <will@kernel.org>
CC: stable@vger.kernel.org # 6.6+
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/20241003225239.321774-1-eahariha@linux.microsoft.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
We received a regression report for System76 Pangolin (pang14) due to
the recent fix for Tuxedo Sirius devices to support the top speaker.
The reason was the conflicting PCI SSID, as often seen.
As a workaround, now the codec SSID is checked and the quirk is
applied conditionally only to Sirius devices.
Fixes: 4178d78cd7a8 ("ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices")
Reported-by: Christian Heusel <christian@heusel.eu>
Reported-by: Jerry <jerryluo225@gmail.com>
Closes: https://lore.kernel.org/c930b6a6-64e5-498f-b65a-1cd5e0a1d733@heusel.eu
Link: https://patch.msgid.link/20241004082602.29016-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Add hw monitor volume control for POD HD500X. This is done adding
LINE6_CAP_HWMON_CTL to the capabilities
Signed-off-by: Hans P. Moller <hmoller@uc.cl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20241003232828.5819-1-hmoller@uc.cl
|
|
If get_bpos() fails, it is likely that the corresponding error code should
be returned.
Fixes: a6970bb1dd99 ("ALSA: gus: Convert to the new PCM ops")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://patch.msgid.link/d9ca841edad697154afa97c73a5d7a14919330d9.1727984008.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The only input fc_rport_set_marginal_state() currently accepts is
"Marginal" when port_state is "Online", and "Online" when the port_state
is "Marginal". It should also allow setting port_state to its current
state, either "Marginal or "Online".
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Link: https://lore.kernel.org/r/20240917230643.966768-1-bmarzins@redhat.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A regression was introduced with commit dbb2da557a6a ("scsi: wd33c93:
Move the SCSI pointer to private command data") which results in an oops
in wd33c93_intr(). That commit added the scsi_pointer variable and
initialized it from hostdata->connected. However, during selection,
hostdata->connected is not yet valid. Fix this by getting the current
scsi_pointer from hostdata->selecting.
Cc: Daniel Palmer <daniel@0x0f.com>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@kernel.org
Fixes: dbb2da557a6a ("scsi: wd33c93: Move the SCSI pointer to private command data")
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Co-developed-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/09e11a0a54e6aa2a88bd214526d305aaf018f523.1727926187.git.fthain@linux-m68k.org
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a
work queue"), it can happen that a work item is sent to an uninitialized
work queue. This may has the effect that the item being queued is never
actually queued, and any further actions depending on it will not
proceed.
The following warning is observed while the fnic driver is loaded:
kernel: WARNING: CPU: 11 PID: 0 at ../kernel/workqueue.c:1524 __queue_work+0x373/0x410
kernel: <IRQ>
kernel: queue_work_on+0x3a/0x50
kernel: fnic_wq_copy_cmpl_handler+0x54a/0x730 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel: fnic_isr_msix_wq_copy+0x2d/0x60 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel: __handle_irq_event_percpu+0x36/0x1a0
kernel: handle_irq_event_percpu+0x30/0x70
kernel: handle_irq_event+0x34/0x60
kernel: handle_edge_irq+0x7e/0x1a0
kernel: __common_interrupt+0x3b/0xb0
kernel: common_interrupt+0x58/0xa0
kernel: </IRQ>
It has been observed that this may break the rediscovery of Fibre
Channel devices after a temporary fabric failure.
This patch fixes it by moving the work queue initialization out of
an if block in fnic_probe().
Signed-off-by: Martin Wilck <mwilck@suse.com>
Fixes: 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240930133014.71615-1-mwilck@suse.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Replace manual offset calculations for response_upiu and prd_table in
ufshcd_init_lrb() with pre-calculated offsets already stored in the
utp_transfer_req_desc structure. The pre-calculated offsets are set
differently in ufshcd_host_memory_configure() based on the
UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk, ensuring correct alignment and
access.
Fixes: 26f968d7de82 ("scsi: ufs: Introduce UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk")
Cc: stable@vger.kernel.org
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20240910044543.3812642-1-avri.altman@wdc.com
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Restore pci state on resume (Rodrigo Vivi)
- Fix locking on submission, queue and vm (Matthew Auld, Matthew Brost)
- Fix UAF on queue destruction (Matthew Auld)
- Fix resource release on freq init error path (He Lugang)
- Use rw_semaphore to reduce contention on ASID->VM lookup (Matthew Brost)
- Fix steering for media on Xe2_HPM (Gustavo Sousa)
- Tuning updates to Xe2 (Gustavo Sousa)
- Resume TDR after GT reset to prevent jobs running forever (Matthew Brost)
- Move id allocation to avoid userspace using a guessed number
to trigger UAF (Matthew Auld, Matthew Brost)
- Fix OA stream close preventing pbatch buffers to complete (José)
- Fix NPD when migrating memory on LNL (Zhanjun Dong)
- Fix memory leak when aborting binds (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2fiv63yanlal5mpw3mxtotte6yvkvtex74c7mkjxca4bazlyja@o4iejcfragxy
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-09-30 (ice, idpf)
This series contains updates to ice and idpf drivers:
For ice:
Michal corrects setting of dst VSI on LAN filters and adds clearing of
port VLAN configuration during reset.
Gui-Dong Han corrects failures to decrement refcount in some error
paths.
Przemek resolves a memory leak in ice_init_tx_topology().
Arkadiusz prevents setting of DPLL_PIN_STATE_SELECTABLE to an improper
value.
Dave stops clearing of VLAN tracking bit to allow for VLANs to be properly
restored after reset.
For idpf:
Ahmed sets uninitialized dyn_ctl_intrvl_s value.
Josh corrects use and reporting of mailbox size.
Larysa corrects order of function calls during de-initialization.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: deinit virtchnl transaction manager after vport and vectors
idpf: use actual mbx receive payload length
idpf: fix VF dynamic interrupt ctl register initialization
ice: fix VLAN replay after reset
ice: disallow DPLL_PIN_STATE_SELECTABLE for dpll output pins
ice: fix memleak in ice_init_tx_topology()
ice: clear port vlan config during reset
ice: Fix improper handling of refcount in ice_sriov_set_msix_vec_count()
ice: Fix improper handling of refcount in ice_dpll_init_rclk_pins()
ice: set correct dst VSI in only LAN filters
====================
Link: https://patch.msgid.link/20240930223601.3137464-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull Rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Fix/improve a couple 'depends on' on the newly added CFI/KASAN
suppport to avoid build errors/warnings
- Fix ARCH_SLAB_MINALIGN multiple definition error for RISC-V under
!CONFIG_MMU
- Clean upcoming (Rust 1.83.0) Clippy warnings
'kernel' crate:
- 'sync' module: fix soundness issue by requiring 'T: Sync' for
'LockedBy::access'; and fix helpers build error under PREEMPT_RT
- Fix trivial sorting issue ('rustfmtcheck') on the v6.12 Rust merge"
* tag 'rust-fixes-6.12' of https://github.com/Rust-for-Linux/linux:
rust: kunit: use C-string literals to clean warning
cfi: encode cfi normalized integers + kasan/gcov bug in Kconfig
rust: KASAN+RETHUNK requires rustc 1.83.0
rust: cfi: fix `patchable-function-entry` starting version
rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT
rust: fix `ARCH_SLAB_MINALIGN` multiple definition error
rust: sync: require `T: Sync` for `LockedBy::access`
rust: kernel: sort Rust modules
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ufs fix from Al Viro:
"Fix ufs_rename() braino introduced this cycle.
The 'folio_release_kmap(dir_folio, new_dir)' in ufs_rename() part of
folio conversion should've been getting a pointer to ufs directory
entry within the page, rather than a pointer to directory struct
inode..."
* tag 'pull-fixes.ufs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs_rename(): fix bogus argument of folio_release_kmap()
|
|
Fix multiple grammatical issues and add a missing period to improve
readability.
Signed-off-by: Leo Stone <leocstone@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240929005001.370991-1-leocstone@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here some miscellaneous fixes for AF_RXRPC:
(1) Fix a race in the I/O thread vs UDP socket setup.
(2) Fix an uninitialised variable.
====================
Link: https://patch.msgid.link/20241001132702.3122709-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix the uninitialised txb variable in rxrpc_send_data() by moving the code
that loads it above all the jumps to maybe_error, txb being stored back
into call->tx_pending right before the normal return.
Fixes: b0f571ecd794 ("rxrpc: Fix locking in rxrpc's sendmsg")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lists.infradead.org/pipermail/linux-afs/2024-October/008896.html
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/20241001132702.3122709-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In rxrpc_open_socket(), it sets up the socket and then sets up the I/O
thread that will handle it. This is a problem, however, as there's a gap
between the two phases in which a packet may come into rxrpc_encap_rcv()
from the UDP packet but we oops when trying to wake the not-yet created I/O
thread.
As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
no I/O thread yet.
A better, but more intrusive fix would perhaps be to rearrange things such
that the socket creation is done by the I/O thread.
Fixes: a275da62e8c1 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: yuxuanzhe@outlook.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241001132702.3122709-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Neal Cardwell says:
====================
tcp: 3 fixes for retrans_stamp and undo logic
Geumhwan Yu <geumhwan.yu@samsung.com> recently reported and diagnosed
a regression in TCP loss recovery undo logic in the case where a TCP
connection enters fast recovery, is unable to retransmit anything due to
TSQ, and then receives an ACK allowing forward progress. The sender should
be able to undo the spurious loss recovery in this case, but was not doing
so. The first patch fixes this regression.
Running our suite of packetdrill tests with the first fix, the tests
highlighted two other small bugs in the way retrans_stamp is updated in
some rare corner cases. The second two patches fix those other two small
bugs.
Thanks to Geumhwan Yu for the bug report!
====================
Link: https://patch.msgid.link/20241001200517.2756803-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix tcp_rcv_synrecv_state_fastopen() to not zero retrans_stamp
if retransmits are outstanding.
tcp_fastopen_synack_timer() sets retrans_stamp, so typically we'll
need to zero retrans_stamp here to prevent spurious
retransmits_timed_out(). The logic to zero retrans_stamp is from this
2019 commit:
commit cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open")
However, in the corner case where the ACK of our TFO SYNACK carried
some SACK blocks that caused us to enter TCP_CA_Recovery then that
non-zero retrans_stamp corresponds to the active fast recovery, and we
need to leave retrans_stamp with its current non-zero value, for
correct ETIMEDOUT and undo behavior.
Fixes: cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241001200517.2756803-4-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix tcp_enter_recovery() so that if there are no retransmits out then
we zero retrans_stamp when entering fast recovery. This is necessary
to fix two buggy behaviors.
Currently a non-zero retrans_stamp value can persist across multiple
back-to-back loss recovery episodes. This is because we generally only
clears retrans_stamp if we are completely done with loss recoveries,
and get to tcp_try_to_open() and find !tcp_any_retrans_done(sk). This
behavior causes two bugs:
(1) When a loss recovery episode (CA_Loss or CA_Recovery) is followed
immediately by a new CA_Recovery, the retrans_stamp value can persist
and can be a time before this new CA_Recovery episode starts. That
means that timestamp-based undo will be using the wrong retrans_stamp
(a value that is too old) when comparing incoming TS ecr values to
retrans_stamp to see if the current fast recovery episode can be
undone.
(2) If there is a roughly minutes-long sequence of back-to-back fast
recovery episodes, one after another (e.g. in a shallow-buffered or
policed bottleneck), where each fast recovery successfully makes
forward progress and recovers one window of sequence space (but leaves
at least one retransmit in flight at the end of the recovery),
followed by several RTOs, then the ETIMEDOUT check may be using the
wrong retrans_stamp (a value set at the start of the first fast
recovery in the sequence). This can cause a very premature ETIMEDOUT,
killing the connection prematurely.
This commit changes the code to zero retrans_stamp when entering fast
recovery, when this is known to be safe (no retransmits are out in the
network). That ensures that when starting a fast recovery episode, and
it is safe to do so, retrans_stamp is set when we send the fast
retransmit packet. That addresses both bug (1) and bug (2) by ensuring
that (if no retransmits are out when we start a fast recovery) we use
the initial fast retransmit of this fast recovery as the time value
for undo and ETIMEDOUT calculations.
This makes intuitive sense, since the start of a new fast recovery
episode (in a scenario where no lost packets are out in the network)
means that the connection has made forward progress since the last RTO
or fast recovery, and we should thus "restart the clock" used for both
undo and ETIMEDOUT logic.
Note that if when we start fast recovery there *are* retransmits out
in the network, there can still be undesirable (1)/(2) issues. For
example, after this patch we can still have the (1) and (2) problems
in cases like this:
+ round 1: sender sends flight 1
+ round 2: sender receives SACKs and enters fast recovery 1,
retransmits some packets in flight 1 and then sends some new data as
flight 2
+ round 3: sender receives some SACKs for flight 2, notes losses, and
retransmits some packets to fill the holes in flight 2
+ fast recovery has some lost retransmits in flight 1 and continues
for one or more rounds sending retransmits for flight 1 and flight 2
+ fast recovery 1 completes when snd_una reaches high_seq at end of
flight 1
+ there are still holes in the SACK scoreboard in flight 2, so we
enter fast recovery 2, but some retransmits in the flight 2 sequence
range are still in flight (retrans_out > 0), so we can't execute the
new retrans_stamp=0 added here to clear retrans_stamp
It's not yet clear how to fix these remaining (1)/(2) issues in an
efficient way without breaking undo behavior, given that retrans_stamp
is currently used for undo and ETIMEDOUT. Perhaps the optimal (but
expensive) strategy would be to set retrans_stamp to the timestamp of
the earliest outstanding retransmit when entering fast recovery. But
at least this commit makes things better.
Note that this does not change the semantics of retrans_stamp; it
simply makes retrans_stamp accurate in some cases where it was not
before:
(1) Some loss recovery, followed by an immediate entry into a fast
recovery, where there are no retransmits out when entering the fast
recovery.
(2) When a TFO server has a SYNACK retransmit that sets retrans_stamp,
and then the ACK that completes the 3-way handshake has SACK blocks
that trigger a fast recovery. In this case when entering fast recovery
we want to zero out the retrans_stamp from the TFO SYNACK retransmit,
and set the retrans_stamp based on the timestamp of the fast recovery.
We introduce a tcp_retrans_stamp_cleanup() helper, because this
two-line sequence already appears in 3 places and is about to appear
in 2 more as a result of this bug fix patch series. Once this bug fix
patches series in the net branch makes it into the net-next branch
we'll update the 3 other call sites to use the new helper.
This is a long-standing issue. The Fixes tag below is chosen to be the
oldest commit at which the patch will apply cleanly, which is from
Linux v3.5 in 2012.
Fixes: 1fbc340514fc ("tcp: early retransmit: tcp_enter_recovery()")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241001200517.2756803-3-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix the TCP loss recovery undo logic in tcp_packet_delayed() so that
it can trigger undo even if TSQ prevents a fast recovery episode from
reaching tcp_retransmit_skb().
Geumhwan Yu <geumhwan.yu@samsung.com> recently reported that after
this commit from 2019:
commit bc9f38c8328e ("tcp: avoid unconditional congestion window undo
on SYN retransmit")
...and before this fix we could have buggy scenarios like the
following:
+ Due to reordering, a TCP connection receives some SACKs and enters a
spurious fast recovery.
+ TSQ prevents all invocations of tcp_retransmit_skb(), because many
skbs are queued in lower layers of the sending machine's network
stack; thus tp->retrans_stamp remains 0.
+ The connection receives a TCP timestamp ECR value echoing a
timestamp before the fast recovery, indicating that the fast
recovery was spurious.
+ The connection fails to undo the spurious fast recovery because
tp->retrans_stamp is 0, and thus tcp_packet_delayed() returns false,
due to the new logic in the 2019 commit: commit bc9f38c8328e ("tcp:
avoid unconditional congestion window undo on SYN retransmit")
This fix tweaks the logic to be more similar to the
tcp_packet_delayed() logic before bc9f38c8328e, except that we take
care not to be fooled by the FLAG_SYN_ACKED code path zeroing out
tp->retrans_stamp (the bug noted and fixed by Yuchung in
bc9f38c8328e).
Note that this returns the high-level behavior of tcp_packet_delayed()
to again match the comment for the function, which says: "Nothing was
retransmitted or returned timestamp is less than timestamp of the
first retransmission." Note that this comment is in the original
2005-04-16 Linux git commit, so this is evidently long-standing
behavior.
Fixes: bc9f38c8328e ("tcp: avoid unconditional congestion window undo on SYN retransmit")
Reported-by: Geumhwan Yu <geumhwan.yu@samsung.com>
Diagnosed-by: Geumhwan Yu <geumhwan.yu@samsung.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241001200517.2756803-2-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Abhishek Chauhan says:
====================
Fix AQR PMA capabilities
Patch 1:-
AQR115c reports incorrect PMA capabilities which includes
10G/5G and also incorrectly disables capabilities like autoneg
and 10Mbps support.
AQR115c as per the Marvell databook supports speeds up to 2.5Gbps
with autonegotiation.
Patch 2:-
Remove the use of phy_set_max_speed in phy driver as the
function is mainly used in MAC driver to set the max
speed.
Instead use get_features to fix up Phy PMA capabilities for
AQR111, AQR111B0, AQR114C and AQCS109
====================
Link: https://patch.msgid.link/20241001224626.2400222-1-quic_abchauha@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the use of phy_set_max_speed in phy driver as the
function is mainly used in MAC driver to set the max
speed.
Instead use get_features to fix up Phy PMA capabilities for
AQR111, AQR111B0, AQR114C and AQCS109
Fixes: 038ba1dc4e54 ("net: phy: aquantia: add AQR111 and AQR111B0 PHY ID")
Fixes: 0974f1f03b07 ("net: phy: aquantia: remove false 5G and 10G speed ability for AQCS109")
Fixes: c278ec644377 ("net: phy: aquantia: add support for AQR114C PHY ID")
Link: https://lore.kernel.org/all/20240913011635.1286027-1-quic_abchauha@quicinc.com/T/
Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20241001224626.2400222-3-quic_abchauha@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|