Age | Commit message (Collapse) | Author |
|
Attempt to enable IPsec packet offload in tunnel mode in debug kernel
generates the following kernel panic, which is happening due to two
issues:
1. In SA add section, the should be _bh() variant when marking SA mode.
2. There is not needed flush_workqueue in SA delete routine. It is not
needed as at this stage as it is removed from SADB and the running work
will be canceled later in SA free.
=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
6.12.0+ #4 Not tainted
-----------------------------------------------------
charon/1337 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
ffff88810f365020 (&xa->xa_lock#24){+.+.}-{3:3}, at: mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
and this task is already holding:
ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30
which would create a new lock dependency:
(&x->lock){+.-.}-{3:3} -> (&xa->xa_lock#24){+.+.}-{3:3}
but this new dependency connects a SOFTIRQ-irq-safe lock:
(&x->lock){+.-.}-{3:3}
... which became SOFTIRQ-irq-safe at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
xfrm_timer_handler+0x91/0xd70
__hrtimer_run_queues+0x1dd/0xa60
hrtimer_run_softirq+0x146/0x2e0
handle_softirqs+0x266/0x860
irq_exit_rcu+0x115/0x1a0
sysvec_apic_timer_interrupt+0x6e/0x90
asm_sysvec_apic_timer_interrupt+0x16/0x20
default_idle+0x13/0x20
default_idle_call+0x67/0xa0
do_idle+0x2da/0x320
cpu_startup_entry+0x50/0x60
start_secondary+0x213/0x2a0
common_startup_64+0x129/0x138
to a SOFTIRQ-irq-unsafe lock:
(&xa->xa_lock#24){+.+.}-{3:3}
... which became SOFTIRQ-irq-unsafe at:
...
lock_acquire+0x1be/0x520
_raw_spin_lock+0x2c/0x40
xa_set_mark+0x70/0x110
mlx5e_xfrm_add_state+0xe48/0x2290 [mlx5_core]
xfrm_dev_state_add+0x3bb/0xd70
xfrm_add_sa+0x2451/0x4a90
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&xa->xa_lock#24);
local_irq_disable();
lock(&x->lock);
lock(&xa->xa_lock#24);
<Interrupt>
lock(&x->lock);
*** DEADLOCK ***
2 locks held by charon/1337:
#0: ffffffff87f8f858 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x5e/0x90
#1: ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&x->lock){+.-.}-{3:3} ops: 29 {
HARDIRQ-ON-W at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
xfrm_alloc_spi+0xc0/0xe60
xfrm_alloc_userspi+0x5f6/0xbc0
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
IN-SOFTIRQ-W at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
xfrm_timer_handler+0x91/0xd70
__hrtimer_run_queues+0x1dd/0xa60
hrtimer_run_softirq+0x146/0x2e0
handle_softirqs+0x266/0x860
irq_exit_rcu+0x115/0x1a0
sysvec_apic_timer_interrupt+0x6e/0x90
asm_sysvec_apic_timer_interrupt+0x16/0x20
default_idle+0x13/0x20
default_idle_call+0x67/0xa0
do_idle+0x2da/0x320
cpu_startup_entry+0x50/0x60
start_secondary+0x213/0x2a0
common_startup_64+0x129/0x138
INITIAL USE at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
xfrm_alloc_spi+0xc0/0xe60
xfrm_alloc_userspi+0x5f6/0xbc0
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
}
... key at: [<ffffffff87f9cd20>] __key.18+0x0/0x40
the dependencies between the lock to be acquired
and SOFTIRQ-irq-unsafe lock:
-> (&xa->xa_lock#24){+.+.}-{3:3} ops: 9 {
HARDIRQ-ON-W at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
mlx5e_xfrm_add_state+0xc5b/0x2290 [mlx5_core]
xfrm_dev_state_add+0x3bb/0xd70
xfrm_add_sa+0x2451/0x4a90
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
SOFTIRQ-ON-W at:
lock_acquire+0x1be/0x520
_raw_spin_lock+0x2c/0x40
xa_set_mark+0x70/0x110
mlx5e_xfrm_add_state+0xe48/0x2290 [mlx5_core]
xfrm_dev_state_add+0x3bb/0xd70
xfrm_add_sa+0x2451/0x4a90
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
INITIAL USE at:
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
mlx5e_xfrm_add_state+0xc5b/0x2290 [mlx5_core]
xfrm_dev_state_add+0x3bb/0xd70
xfrm_add_sa+0x2451/0x4a90
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
}
... key at: [<ffffffffa078ff60>] __key.48+0x0/0xfffffffffff210a0 [mlx5_core]
... acquired at:
__lock_acquire+0x30a0/0x5040
lock_acquire+0x1be/0x520
_raw_spin_lock_bh+0x34/0x40
mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
xfrm_dev_state_delete+0x90/0x160
__xfrm_state_delete+0x662/0xae0
xfrm_state_delete+0x1e/0x30
xfrm_del_sa+0x1c2/0x340
xfrm_user_rcv_msg+0x493/0x880
netlink_rcv_skb+0x12e/0x380
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
netlink_sendmsg+0x745/0xbe0
__sock_sendmsg+0xc5/0x190
__sys_sendto+0x1fe/0x2c0
__x64_sys_sendto+0xdc/0x1b0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
stack backtrace:
CPU: 7 UID: 0 PID: 1337 Comm: charon Not tainted 6.12.0+ #4
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x74/0xd0
check_irq_usage+0x12e8/0x1d90
? print_shortest_lock_dependencies_backwards+0x1b0/0x1b0
? check_chain_key+0x1bb/0x4c0
? __lockdep_reset_lock+0x180/0x180
? check_path.constprop.0+0x24/0x50
? mark_lock+0x108/0x2fb0
? print_circular_bug+0x9b0/0x9b0
? mark_lock+0x108/0x2fb0
? print_usage_bug.part.0+0x670/0x670
? check_prev_add+0x1c4/0x2310
check_prev_add+0x1c4/0x2310
__lock_acquire+0x30a0/0x5040
? lockdep_set_lock_cmp_fn+0x190/0x190
? lockdep_set_lock_cmp_fn+0x190/0x190
lock_acquire+0x1be/0x520
? mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
? lockdep_hardirqs_on_prepare+0x400/0x400
? __xfrm_state_delete+0x5f0/0xae0
? lock_downgrade+0x6b0/0x6b0
_raw_spin_lock_bh+0x34/0x40
? mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
xfrm_dev_state_delete+0x90/0x160
__xfrm_state_delete+0x662/0xae0
xfrm_state_delete+0x1e/0x30
xfrm_del_sa+0x1c2/0x340
? xfrm_get_sa+0x250/0x250
? check_chain_key+0x1bb/0x4c0
xfrm_user_rcv_msg+0x493/0x880
? copy_sec_ctx+0x270/0x270
? check_chain_key+0x1bb/0x4c0
? lockdep_set_lock_cmp_fn+0x190/0x190
? lockdep_set_lock_cmp_fn+0x190/0x190
netlink_rcv_skb+0x12e/0x380
? copy_sec_ctx+0x270/0x270
? netlink_ack+0xd90/0xd90
? netlink_deliver_tap+0xcd/0xb60
xfrm_netlink_rcv+0x6d/0x90
netlink_unicast+0x42f/0x740
? netlink_attachskb+0x730/0x730
? lock_acquire+0x1be/0x520
netlink_sendmsg+0x745/0xbe0
? netlink_unicast+0x740/0x740
? __might_fault+0xbb/0x170
? netlink_unicast+0x740/0x740
__sock_sendmsg+0xc5/0x190
? fdget+0x163/0x1d0
__sys_sendto+0x1fe/0x2c0
? __x64_sys_getpeername+0xb0/0xb0
? do_user_addr_fault+0x856/0xe30
? lock_acquire+0x1be/0x520
? __task_pid_nr_ns+0x117/0x410
? lock_downgrade+0x6b0/0x6b0
__x64_sys_sendto+0xdc/0x1b0
? lockdep_hardirqs_on_prepare+0x284/0x400
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f7d31291ba4
Code: 7d e8 89 4d d4 e8 4c 42 f7 ff 44 8b 4d d0 4c 8b 45 c8 89 c3 44 8b 55 d4 8b 7d e8 b8 2c 00 00 00 48 8b 55 d8 48 8b 75 e0 0f 05 <48> 3d 00 f0 ff ff 77 34 89 df 48 89 45 e8 e8 99 42 f7 ff 48 8b 45
RSP: 002b:00007f7d2ccd94f0 EFLAGS: 00000297 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f7d31291ba4
RDX: 0000000000000028 RSI: 00007f7d2ccd96a0 RDI: 000000000000000a
RBP: 00007f7d2ccd9530 R08: 00007f7d2ccd9598 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000028
R13: 00007f7d2ccd9598 R14: 00007f7d2ccd96a0 R15: 00000000000000e1
</TASK>
Fixes: 4c24272b4e2b ("net/mlx5e: Listen to ARP events to update IPsec L2 headers in tunnel mode")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Clear the port select structure on error so no stale values left after
definers are destroyed. That's because the mlx5_lag_destroy_definers()
always try to destroy all lag definers in the tt_map, so in the flow
below lag definers get double-destroyed and cause kernel crash:
mlx5_lag_port_sel_create()
mlx5_lag_create_definers()
mlx5_lag_create_definer() <- Failed on tt 1
mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed
mlx5_lag_port_sel_create()
mlx5_lag_create_definers()
mlx5_lag_create_definer() <- Failed on tt 0
mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00
[0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)]
CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G OE 6.11.0+ #2
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
sp : ffff800085fafb00
x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000
x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000
x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000
x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350
x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0
x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c
x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190
x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000
x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000
Call trace:
mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core]
mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core]
mlx5_activate_lag+0x60c/0x6f8 [mlx5_core]
mlx5_do_bond_work+0x284/0x5c8 [mlx5_core]
process_one_work+0x170/0x3e0
worker_thread+0x2d8/0x3e0
kthread+0x11c/0x128
ret_from_fork+0x10/0x20
Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)
---[ end trace 0000000000000000 ]---
Fixes: dc48516ec7d3 ("net/mlx5: Lag, add support to create definers for LAG")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If failed to add SF, error handling doesn't delete the SF from the
SF table. But the hw resources are deleted. So when unload driver,
hw resources will be deleted again. Firmware will report syndrome
0x68def3 which means "SF is not allocated can not deallocate".
Fix it by delete SF from SF table if failed to add SF.
Fixes: 2597ee190b4e ("net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc()")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix a lockdep warning [1] observed during the write combining test.
The warning indicates a potential nested lock scenario that could lead
to a deadlock.
However, this is a false positive alarm because the SF lock and its
parent lock are distinct ones.
The lockdep confusion arises because the locks belong to the same object
class (i.e., struct mlx5_core_dev).
To resolve this, the code has been refactored to avoid taking both
locks. Instead, only the parent lock is acquired.
[1]
raw_ethernet_bw/2118 is trying to acquire lock:
[ 213.619032] ffff88811dd75e08 (&dev->wc_state_lock){+.+.}-{3:3}, at:
mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.620270]
[ 213.620270] but task is already holding lock:
[ 213.620943] ffff88810b585e08 (&dev->wc_state_lock){+.+.}-{3:3}, at:
mlx5_wc_support_get+0x10c/0x210 [mlx5_core]
[ 213.622045]
[ 213.622045] other info that might help us debug this:
[ 213.622778] Possible unsafe locking scenario:
[ 213.622778]
[ 213.623465] CPU0
[ 213.623815] ----
[ 213.624148] lock(&dev->wc_state_lock);
[ 213.624615] lock(&dev->wc_state_lock);
[ 213.625071]
[ 213.625071] *** DEADLOCK ***
[ 213.625071]
[ 213.625805] May be due to missing lock nesting notation
[ 213.625805]
[ 213.626522] 4 locks held by raw_ethernet_bw/2118:
[ 213.627019] #0: ffff88813f80d578 (&uverbs_dev->disassociate_srcu){.+.+}-{0:0},
at: ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
[ 213.628088] #1: ffff88810fb23930 (&file->hw_destroy_rwsem){.+.+}-{3:3},
at: ib_init_ucontext+0x2d/0xf0 [ib_uverbs]
[ 213.629094] #2: ffff88810fb23878 (&file->ucontext_lock){+.+.}-{3:3},
at: ib_init_ucontext+0x49/0xf0 [ib_uverbs]
[ 213.630106] #3: ffff88810b585e08 (&dev->wc_state_lock){+.+.}-{3:3},
at: mlx5_wc_support_get+0x10c/0x210 [mlx5_core]
[ 213.631185]
[ 213.631185] stack backtrace:
[ 213.631718] CPU: 1 UID: 0 PID: 2118 Comm: raw_ethernet_bw Not tainted
6.12.0-rc7_internal_net_next_mlx5_89a0ad0 #1
[ 213.632722] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 213.633785] Call Trace:
[ 213.634099]
[ 213.634393] dump_stack_lvl+0x7e/0xc0
[ 213.634806] print_deadlock_bug+0x278/0x3c0
[ 213.635265] __lock_acquire+0x15f4/0x2c40
[ 213.635712] lock_acquire+0xcd/0x2d0
[ 213.636120] ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.636722] ? mlx5_ib_enable_lb+0x24/0xa0 [mlx5_ib]
[ 213.637277] __mutex_lock+0x81/0xda0
[ 213.637697] ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.638305] ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.638902] ? rcu_read_lock_sched_held+0x3f/0x70
[ 213.639400] ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.640016] mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
[ 213.640615] set_ucontext_resp+0x68/0x2b0 [mlx5_ib]
[ 213.641144] ? debug_mutex_init+0x33/0x40
[ 213.641586] mlx5_ib_alloc_ucontext+0x18e/0x7b0 [mlx5_ib]
[ 213.642145] ib_init_ucontext+0xa0/0xf0 [ib_uverbs]
[ 213.642679] ib_uverbs_handler_UVERBS_METHOD_GET_CONTEXT+0x95/0xc0
[ib_uverbs]
[ 213.643426] ? _copy_from_user+0x46/0x80
[ 213.643878] ib_uverbs_cmd_verbs+0xa6b/0xc80 [ib_uverbs]
[ 213.644426] ? ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x130/0x130
[ib_uverbs]
[ 213.645213] ? __lock_acquire+0xa99/0x2c40
[ 213.645675] ? lock_acquire+0xcd/0x2d0
[ 213.646101] ? ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
[ 213.646625] ? reacquire_held_locks+0xcf/0x1f0
[ 213.647102] ? do_user_addr_fault+0x45d/0x770
[ 213.647586] ib_uverbs_ioctl+0xe0/0x170 [ib_uverbs]
[ 213.648102] ? ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
[ 213.648632] __x64_sys_ioctl+0x4d3/0xaa0
[ 213.649060] ? do_user_addr_fault+0x4a8/0x770
[ 213.649528] do_syscall_64+0x6d/0x140
[ 213.649947] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 213.650478] RIP: 0033:0x7fa179b0737b
[ 213.650893] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c
89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8
10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d
7d 2a 0f 00 f7 d8 64 89 01 48
[ 213.652619] RSP: 002b:00007ffd2e6d46e8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[ 213.653390] RAX: ffffffffffffffda RBX: 00007ffd2e6d47f8 RCX:
00007fa179b0737b
[ 213.654084] RDX: 00007ffd2e6d47e0 RSI: 00000000c0181b01 RDI:
0000000000000003
[ 213.654767] RBP: 00007ffd2e6d47c0 R08: 00007fa1799be010 R09:
0000000000000002
[ 213.655453] R10: 00007ffd2e6d4960 R11: 0000000000000246 R12:
00007ffd2e6d487c
[ 213.656170] R13: 0000000000000027 R14: 0000000000000001 R15:
00007ffd2e6d4f70
Fixes: d98995b4bf98 ("net/mlx5: Reimplement write combining test")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
User added steering rules at RDMA_TX were being added to the first prio,
which is the counters prio.
Fix that so that they are correctly added to the BYPASS_PRIO instead.
Fixes: 24670b1a3166 ("net/mlx5: Add support for RDMA TX steering")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix two kernel doc warnings introduced by the recent DP audio patch:
- Add a doc line for the new "audio" field
- Remove a reference to zynqmp_dpsub.c from zynqmp.rst, as the .c file
no longer has structured comments
Fixes: 3ec5c1579305 ("drm: xlnx: zynqmp_dpsub: Add DP audio support")
Closes: https://lore.kernel.org/all/20241220154208.720d990b@canb.auug.org.au/
Reviewed-by: Vishal Sagar <vishal.sagar@amd.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241220-xilinx-dp-audio-doc-fix-v1-1-cc488996e463@ideasonboard.com
(cherry picked from commit 96b5d2e807f667320c66f41ddc1c473023a73ab2)
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
The size of DMA descriptors is 32 bytes at most.
net_prefetch() for received frames, and keep prefetch() for descriptors.
This patch brings ~4.8% driver performance improvement in a TCP RX
throughput test with iPerf tool on a single isolated Cortex-A65 CPU
core, 2.92 Gbits/sec increased to 3.06 Gbits/sec.
Suggested-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Current code prefetches cache lines for the received frame first, and
then dma_sync_single_for_cpu() against this frame, this is wrong.
Cache prefetch should be triggered after dma_sync_single_for_cpu().
This patch brings ~2.8% driver performance improvement in a TCP RX
throughput test with iPerf tool on a single isolated Cortex-A65 CPU
core, 2.84 Gbits/sec increased to 2.92 Gbits/sec.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
DMA engine will always write no more than dma_buf_sz bytes of a received
frame into a page buffer, the remaining spaces are unused or used by CPU
exclusively.
Setting page_pool_params.max_len to almost the full size of page(s) helps
nothing more, but wastes more CPU cycles on cache maintenance.
For a standard MTU of 1500, then dma_buf_sz is assigned to 1536, and this
patch brings ~16.9% driver performance improvement in a TCP RX
throughput test with iPerf tool on a single isolated Cortex-A65 CPU
core, from 2.43 Gbits/sec increased to 2.84 Gbits/sec.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Avoid memcpy in non-XDP RX path by marking all allocated SKBs to
be recycled in the upper network stack.
This patch brings ~11.5% driver performance improvement in a TCP RX
throughput test with iPerf tool on a single isolated Cortex-A65 CPU
core, from 2.18 Gbits/sec increased to 2.43 Gbits/sec.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The new ASUS ProArt 16" laptop series come with their keyboards stuck in
an Out-Of-Box-Experience mode. While in this mode most functions will
not work such as LED control or Fn key combos. The correct init sequence
is now done to disable this OOBE.
This patch addresses only the ProArt series so far and it is unknown if
there may be others, in which case a new quirk may be required.
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Co-developed-by: Connor Belli <connorbelli2003@gmail.com>
Signed-off-by: Connor Belli <connorbelli2003@gmail.com>
Tested-by: Jan Schmidt <jan@centricular.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Remove unnecessary return in a void function.
Signed-off-by: Christian Mayer <git@mayer-bgk.de>
Reviewed-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Export model and manufacturer with the power supply properties.
This helps identifing the device in the battery overview.
In the case of the Arctis 9 headset, the manufacturer is prefixed twice in
the device name.
Signed-off-by: Christian Mayer <git@mayer-bgk.de>
Reviewed-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
The Arctis 9 headset provides the information if
the power cable is plugged in and charging via the battery report.
This information can be exported.
Signed-off-by: Christian Mayer <git@mayer-bgk.de>
Reviewed-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Add support for the SteelSeries Arctis 9 headset. This driver
will export the battery information like it already does for
the Arcits 1 headset.
Signed-off-by: Christian Mayer <git@mayer-bgk.de>
Reviewed-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Refactor code and add calls to hid_hw_open/hid_hw_closed in preparation
for adding support for the SteelSeries Arctis 9 headset.
Signed-off-by: Christian Mayer <git@mayer-bgk.de>
Reviewed-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.13-2025-01-15:
amdgpu:
- SMU 13 fix
- DP MST fixes
- DCN 3.5 fix
- PSR fixes
- eDP fix
- VRR fix
- Enforce isolation fixes
- GFX 12 fix
- PSP 14.x fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250115151602.210704-1-alexander.deucher@amd.com
|
|
This is modeled similar to how software steering works:
- a reference-counted matcher is maintained for each
combination of nat/no_nat x ipv4/ipv6 x tcp/udp/gre.
- adding a rule involves finding+referencing or creating a corresponding
matcher, then actually adding a rule.
- updating rules is implemented using the bwc_rule update API, which can
change a rule's actions without touching the match value.
By using a T-Rex traffic generator to initiate multi-million UDP flows
per second, a kernel running with these patches on the RX side was able
to offload ~600K flows per second, which is about ~7x larger than what
software steering could do on the same hardware (256-thread AMD EPYC,
512 GB RAM, ConnectX-7 b2b).
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250114130646.1937192-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This function checks whether a flow_rule has the right flow dissector
keys and masks used for a connection tracking flow offload. It is
currently used locally by the tc_ct smfs module, but is about to be used
from another place, so this commit moves it to a better place, renames
it to mlx5e_tc_ct_is_valid_flow_rule and drops the unused fs argument.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250114130646.1937192-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Connection tracking can offload tuple matches to the NIC either via
firmware commands (when the steering mode is dmfs or offload support is
disabled due to eswitch being set to legacy) or via software-managed
flow steering (smfs).
This commit adds stub operations for a third mode, hardware-managed flow
steering. This is enabled when both CONFIG_MLX5_TC_CT and
CONFIG_MLX5_HW_STEERING are enabled.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250114130646.1937192-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When checking if the matcher size can be increased, check both
match and action RTCs. Also, consider the increasing step - check
that it won't cause the new matcher size to become unsupported.
Additionally, since we're using '+ 1' for action RTC size yet
again, define it as macro and use in all the required places.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250114130646.1937192-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Wrap napi_enable() / napi_disable() with netdev_lock().
Provide the "already locked" flavor of the API.
iavf needs the usual adjustment. A number of drivers call
napi_enable() under a spin lock, so they have to be modified
to take netdev_lock() first, then spin lock then call
napi_enable_locked().
Protecting napi_enable() implies that napi->napi_id is protected
by netdev_lock().
Acked-by: Francois Romieu <romieu@fr.zoreil.com> # via-velocity
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115035319.559603-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hold netdev->lock when NAPIs are getting added or removed.
This will allow safe access to NAPI instances of a net_device
without rtnl_lock.
Create a family of helpers which assume the lock is already taken.
Switch iavf to them, as it makes extensive use of netdev->lock,
already.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250115035319.559603-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add helpers for locking the netdev instance, use it in drivers
and the shaper code. This will make grepping for the lock usage
much easier, as we extend the lock to cover more fields.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20250115035319.559603-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.13:
- itee-it6263 error handling fix.
- Fix warn when unloading v3d.
- Fix W=1 build for kunit tests.
- Fix backlight regression for macbooks 5,1 in nouveau.
- Handle YCbCr420 better in bridge code, with tests.
- Fix cross-device fence handling in nouveau.
- Fix BO reservation handling in vmwgfx.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a89adcd5-2042-4e7f-93f4-2b299bb1ef17@linux.intel.com
|
|
Currently, the driver is seriously broken with respect to the
hibernation (S4): after image restore the device is back into
IPC_MEM_EXEC_STAGE_BOOT (which AFAIK means bootloader stage) and needs
full re-launch of the rest of its firmware, but the driver restore
handler treats the device as merely sleeping and just sends it a
wake-up command.
This wake-up command times out but device nodes (/dev/wwan*) remain
accessible.
However attempting to use them causes the bootloader to crash and
enter IPC_MEM_EXEC_STAGE_CD_READY stage (which apparently means "a crash
dump is ready").
It seems that the device cannot be re-initialized from this crashed
stage without toggling some reset pin (on my test platform that's
apparently what the device _RST ACPI method does).
While it would theoretically be possible to rewrite the driver to tear
down the whole MUX / IPC layers on hibernation (so the bootloader does
not crash from improper access) and then re-launch the device on
restore this would require significant refactoring of the driver
(believe me, I've tried), since there are quite a few assumptions
hard-coded in the driver about the device never being partially
de-initialized (like channels other than devlink cannot be closed,
for example).
Probably this would also need some programming guide for this hardware.
Considering that the driver seems orphaned [1] and other people are
hitting this issue too [2] fix it by simply unbinding the PCI driver
before hibernation and re-binding it after restore, much like
USB_QUIRK_RESET_RESUME does for USB devices that exhibit a similar
problem.
Tested on XMM7360 in HP EliteBook 855 G7 both with s2idle (which uses
the existing suspend / resume handlers) and S4 (which uses the new code).
[1]: https://lore.kernel.org/all/c248f0b4-2114-4c61-905f-466a786bdebb@leemhuis.info/
[2]:
https://github.com/xmm7360/xmm7360-pci/issues/211#issuecomment-1804139413
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://patch.msgid.link/e60287ebdb0ab54c4075071b72568a40a75d0205.1736372610.git.mail@maciej.szmigiero.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-01-08 (ice)
This series contains updates to ice driver only.
Przemek reworks implementation so that ice_init_hw() is called before
ice_adapter initialization. The motivation is to have ability to act
on the number of PFs in ice_adapter initialization. This is not done
here but the code is also a bit cleaner.
Michal adds priority to be considered when matching recipes for proper
differentiation.
Konrad adds devlink health reporting for firmware generated events.
R Sundar utilizes string helpers over open coded versions.
Jake adds implementation to utilize a lower latency interface to program
PHY timer when supported.
Additional information can be found on the original cover letter:
https://lore.kernel.org/intel-wired-lan/20241216145453.333745-1-anton.nadezhdin@intel.com/
Karol adds and allows for different PTP delay values to be used per pin.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Add in/out PTP pin delays
ice: implement low latency PHY timer updates
ice: check low latency PHY timer update firmware capability
ice: add lock to protect low latency interface
ice: rename TS_LL_READ* macros to REG_LL_PROXY_H_*
ice: use read_poll_timeout_atomic in ice_read_phy_tstamp_ll_e810
ice: use string choice helpers
ice: add fw and port health reporters
ice: add recipe priority check in search
ice: ice_probe: init ice_adapter after HW init
ice: minor: rename goto labels from err to unroll
ice: split ice_init_hw() out from ice_init_dev()
ice: c827: move wait for FW to ice_init_hw()
====================
Link: https://patch.msgid.link/20250115000844.714530-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support spread spectrum clock generation for the main PLL, the only one
for which this functionality is available.
Tested on the STM32F469I-DISCO board.
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Link: https://lore.kernel.org/r/20250114182021.670435-5-dario.binacchi@amarulasolutions.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Use GENMASK() along with FIELD_GET() and FIELD_PREP() helpers to access
the PLLCFGR fields instead of manually masking and shifting.
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Link: https://lore.kernel.org/r/20250114182021.670435-4-dario.binacchi@amarulasolutions.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Following fields of 'struct mr_mfc' can be updated
concurrently (no lock protection) from ip_mr_forward()
and ip6_mr_forward()
- bytes
- pkt
- wrong_if
- lastuse
They also can be read from other functions.
Convert bytes, pkt and wrong_if to atomic_long_t,
and use READ_ONCE()/WRITE_ONCE() for lastuse.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250114221049.1190631-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
HDS options(tcp-data-split, hds-thresh) have dependencies between other
features like XDP. Basic dependencies are checked in the core API.
netdevsim is very useful to check basic dependencies.
The default tcp-data-split mode is UNKNOWN but netdevsim driver
returns ENABLED when ethtool dumps tcp-data-split mode.
The default value of HDS threshold is 0 and the maximum value is 1024.
ethtool shows like this.
ethtool -g eni1np1
Ring parameters for eni1np1:
Pre-set maximums:
...
HDS thresh: 1024
Current hardware settings:
...
TCP data split: on
HDS thresh: 0
ethtool -G eni1np1 tcp-data-split on hds-thresh 1024
ethtool -g eni1np1
Ring parameters for eni1np1:
Pre-set maximums:
...
HDS thresh: 1024
Current hardware settings:
...
TCP data split: on
HDS thresh: 1024
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-10-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bnxt_en driver has configured the hds_threshold value automatically
when TPA is enabled based on the rx-copybreak default value.
Now the hds-thresh ethtool command is added, so it adds an
implementation of hds-thresh option.
Configuration of the hds-thresh is applied only when
the tcp-data-split is enabled. The default value of
hds-thresh is 256, which is the default value of
rx-copybreak, which used to be the hds_thresh value.
The maximum hds-thresh is 1023.
# Example:
# ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256
# ethtool -g enp14s0f0np0
Ring parameters for enp14s0f0np0:
Pre-set maximums:
...
HDS thresh: 1023
Current hardware settings:
...
TCP data split: on
HDS thresh: 256
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250114142852.3364986-9-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
NICs that uses bnxt_en driver supports tcp-data-split feature by the
name of HDS(header-data-split).
But there is no implementation for the HDS to enable by ethtool.
Only getting the current HDS status is implemented and The HDS is just
automatically enabled only when either LRO, HW-GRO, or JUMBO is enabled.
The hds_threshold follows rx-copybreak value. and it was unchangeable.
This implements `ethtool -G <interface name> tcp-data-split <value>`
command option.
The value can be <on> and <auto>.
The value is <auto> and one of LRO/GRO/JUMBO is enabled, HDS is
automatically enabled and all LRO/GRO/JUMBO are disabled, HDS is
automatically disabled.
HDS feature relies on the aggregation ring.
So, if HDS is enabled, the bnxt_en driver initializes the aggregation ring.
This is the reason why BNXT_FLAG_AGG_RINGS contains HDS condition.
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250114142852.3364986-8-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bnxt_en driver supports rx-copybreak, but it couldn't be set by
userspace. Only the default value(256) has worked.
This patch makes the bnxt_en driver support following command.
`ethtool --set-tunable <devname> rx-copybreak <value> ` and
`ethtool --get-tunable <devname> rx-copybreak`.
By this patch, hds_threshol is set to the rx-copybreak value.
But it will be set by `ethtool -G eth0 hds-thresh N`
in the next patch.
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Tested-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250114142852.3364986-7-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
blackhole_netdev is the global device in init_net.
Let's hold rtnl_net_lock(&init_net) in blackhole_netdev_init().
While at it, the unnecessary dev_net_set() call is removed, which
is done in alloc_netdev_mqs().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250114081352.47404-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds support for hardware monitoring to the fbnic driver,
allowing for temperature and voltage sensor data to be exposed to
userspace via the HWMON interface. The driver registers a HWMON device
and provides callbacks for reading sensor data, enabling system
admins to monitor the health and operating conditions of fbnic.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250114000705.2081288-4-sanman.p211993@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for reading temperature and voltage sensor data from firmware
by implementing a new TSENE message type and response parsing. This adds
message handler infrastructure to transmit sensor read requests and parse
responses. The sensor data will be exposed through the driver's hwmon interface.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250114000705.2081288-3-sanman.p211993@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add infrastructure to support firmware request/response handling with
completions. Add a completion structure to track message state including
message type for matching, completion for waiting for response, and
result for error propagation. Use existing spinlock to protect the writes.
The data from the various response types will be added to the "union u"
by subsequent commits.
Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250114000705.2081288-2-sanman.p211993@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The lan969x switch device supports manual frame injection and extraction
to and from the switch core, using a number of injection and extraction
queues. This technique is currently supported, but delivers poor
performance compared to Frame DMA (FDMA).
This lan969x implementation of FDMA, hooks into the existing FDMA for
Sparx5, but requires its own RX and TX handling, as lan969x does not
support the same native cache coherency that Sparx5 does. Effectively,
this means that we are going to use the DMA mapping API for mapping and
unmapping TX buffers. The RX loop will utilize the page pool API for
efficient RX handling. Other than that, the implementation is largely
the same, and utilizes the FDMA library for DCB and DB handling.
Some numbers:
Manual injection/extraction (before this series):
// iperf3 -c 1.0.1.1
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.02 sec 345 MBytes 289 Mbits/sec sender
[ 5] 0.00-10.06 sec 345 MBytes 288 Mbits/sec receiver
FDMA (after this series):
// iperf3 -c 1.0.1.1
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.03 sec 1.10 GBytes 940 Mbits/sec sender
[ 5] 0.00-10.07 sec 1.10 GBytes 936 Mbits/sec receiver
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-5-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We are going to implement the RX and TX paths a bit differently on
lan969x and therefore need to introduce new ops for FDMA functions:
init, deinit, xmit and poll. Assign the Sparx5 equivalents for these and
update the code throughout. Also add a 'struct net_device' argument to
the xmit() function, as we will be needing that for lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-4-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The function sparx5_fdma_tx_activate() is responsible for configuring
the TX FDMA instance and activating the channel. TX activation has
previously been done in the xmit() function, when the first frame is
transmitted. Now that we have separate functions for starting and
stopping the FDMA, it seems reasonable to move the TX activation to the
start function. This change has no implications on the functionality.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-3-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The two functions: sparx5_fdma_{start(),stop()} are responsible for a
number of things, namely: allocation and initialization of FDMA buffers,
activation FDMA channels in hardware and activation of the NAPI
instance.
This patch splits the buffer allocation and initialization into init and
deinit functions, and the channel and NAPI activation into start and
stop functions. This serves two purposes: 1) the start() and stop()
functions can be reused for lan969x and 2) prepares for future MTU
change support, where we must be able to stop and start the FDMA
channels and NAPI instance, without free'ing and reallocating the FDMA
buffers.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-2-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In a previous series, we made sure that FDMA was not initialized and
started on lan969x. Now that we are going to support it, undo that
change. In addition, make sure the chip ID check is only applicable on
Sparx5, as this is a check that is only relevant on this platform.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://patch.msgid.link/20250113-sparx5-lan969x-switch-driver-5-v2-1-c468f02fd623@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix use of DIV_ROUND_CLOSEST where a possibly negative value is divided
by an unsigned type by casting the unsigned type to the signed type of
the same size (st->r_sense_uohm[channel] has type of u32).
The docs on the DIV_ROUND_CLOSEST macro explain that dividing a negative
value by an unsigned type is undefined behavior. The actual behavior is
that it converts both values to unsigned before doing the division, for
example:
int ret = DIV_ROUND_CLOSEST(-100, 3U);
results in ret == 1431655732 instead of -33.
Fixes: 2b9ea4262ae9 ("hwmon: Add driver for ltc2991")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://lore.kernel.org/r/20250115-hwmon-ltc2991-fix-div-round-closest-v1-1-b4929667e457@baylibre.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Add max77705 fuel gauge support.
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Link: https://lore.kernel.org/r/20250108-starqltechn_integration_upstream-v14-5-f6e84ec20d96@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Add POWER_SUPPLY_HEALTH_UNDERVOLTAGE status for power supply
to report under voltage lockout failures.
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Link: https://lore.kernel.org/r/20250108-starqltechn_integration_upstream-v14-1-f6e84ec20d96@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
In 4.19, before the switch to linkmode bitmaps, PHY_GBIT_FEATURES
included feature bits for aneg and TP/MII ports.
SUPPORTED_TP | \
SUPPORTED_MII)
SUPPORTED_10baseT_Full)
SUPPORTED_100baseT_Full)
SUPPORTED_1000baseT_Full)
PHY_100BT_FEATURES | \
PHY_DEFAULT_FEATURES)
PHY_1000BT_FEATURES)
Referenced commit expanded PHY_GBIT_FEATURES, silently removing
PHY_DEFAULT_FEATURES. The removed part can be re-added by using
the new PHY_GBIT_FEATURES definition.
Not clear to me is why nobody seems to have noticed this issue.
I stumbled across this when checking what it takes to make
phy_10_100_features_array et al private to phylib.
Fixes: d0939c26c53a ("net: ethernet: xgbe: expand PHY_GBIT_FEAUTRES")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/46521973-7738-4157-9f5e-0bb6f694acba@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previously pci_parse_request_of_pci_ranges() supplied the default bus range
to devm_of_pci_get_host_bridge_resources(), but that function is static and
has no other callers, so there's no reason to complicate its interface by
passing the default bus range.
Drop the busno and bus_max parameters and use 0x0 and 0xff directly in
devm_of_pci_get_host_bridge_resources().
Link: https://lore.kernel.org/r/20250113231557.441289-4-helgaas@kernel.org
[bhelgaas: dev_warn() for invalid end of bus-range]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The typical bus range for a host bridge is [bus 00-ff], and
devm_of_pci_get_host_bridge_resources() defaults to that unless DT contains
a "bus-range" property.
devm_of_pci_get_host_bridge_resources() previously emitted a message when
there was no "bus-range" property, but that seems unnecessary for this
common situation. Remove the message.
Link: https://lore.kernel.org/r/20250113231557.441289-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
|
|
of_pci_parse_bus_range() is only used in drivers/pci/of.c, so make it
static and unexport it.
Link: https://lore.kernel.org/r/20250113231557.441289-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
|